The asm is quite extensive: it uses DMA and implements a lot of the PPU. However, it does not work well yet; I will try those debuggers. But enough for now.
Feel free to look at the asm files and memap.txt and comment on the design choices.
I will recompile the sources on Sunday, but today I am struggling with too much shit.
I need to improve the sprite zero emulation, and I am posting today because I want advice on the replacement of indirect jumps. I have a problem with the JumpEngine routine in Super Mario Bros.
A post on my blog explains it: http://blog.vreemdelabs.com/category/upernes/
I am trying to find an automatic way to convert such code. And I want to discuss it with you to find a good solution.
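For readers who do not know the routine, here is a rough Python model of a JumpEngine-style dispatch, as I understand it from the SMB disassembly: the routine pulls its own return address, treats the bytes right after the jsr as a table of 16-bit addresses, and jumps indirectly to the entry selected by A. The memory layout and the addresses in the example are made up for illustration.

```python
def jump_engine_target(mem, return_addr, a):
    """Return the address a JumpEngine-style routine would jump to.

    mem         -- mapping from address to byte (hypothetical PRG contents)
    return_addr -- address pushed by JSR, i.e. the last byte of the JSR
    a           -- index held in the accumulator when JumpEngine is called
    """
    y = (a << 1) & 0xFF          # asl / tay: two bytes per table entry
    table = return_addr + 1      # the address table starts right after the JSR
    lo = mem[table + y]
    hi = mem[table + y + 1]
    return lo | (hi << 8)


# Hypothetical layout: "jsr JumpEngine" at $8000-$8002, table at $8003.
mem = {0x8003: 0x10, 0x8004: 0x90,   # entry 0 -> $9010
       0x8005: 0x20, 0x8006: 0xA0}   # entry 1 -> $A020
print(hex(jump_engine_target(mem, 0x8002, 1)))
```

The point is that the targets live in data following each call, so a converter only sees them if it knows where the routine is called from and how long each table is.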
Case 1: using the BRK instruction.
Each replaced instruction keeps its original size, so the original routine addresses and data can be kept in place, which handles any indirect jump problem.
The replaced instructions are: sta/stx/sty to an IO port, lda/ldx/ldy from an IO port, and the indirect jumps.
BRK takes 2 bytes (opcode plus signature byte), but sta/lda to IO registers and indirect jmp take 3 bytes. We can use the 2 bytes after the BRK opcode to store the replaced instruction: the opcodes for lda, ldx, ldy, sta, stx, sty and jmp can be coded on 4 bits, with one byte to code the IO port or the indirect jump. The 3-byte instructions are replaced with BRK XX XX, so the original code keeps the same size.
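A minimal Python sketch of that patching step, to make the byte layout concrete. The code numbers in `OP_CODES`, the choice of "first byte = 4-bit opcode code, second byte = index into an operand table", and the function name are all assumptions for illustration, not what upernes does today.

```python
BRK = 0x00

# 6502 opcodes of the 3-byte instructions to replace -> hypothetical 4-bit code
OP_CODES = {
    0xAD: 0,  # lda abs
    0xAE: 1,  # ldx abs
    0xAC: 2,  # ldy abs
    0x8D: 3,  # sta abs
    0x8E: 4,  # stx abs
    0x8C: 5,  # sty abs
    0x6C: 6,  # jmp (ind)
}


def patch_with_brk(rom, offset, operand_table):
    """Replace the 3-byte instruction at offset with BRK XX XX, in place.

    operand_table collects the distinct IO ports / indirect jump operands;
    the second parameter byte is the index of the operand in that table.
    """
    code = OP_CODES[rom[offset]]
    operand = rom[offset + 1] | (rom[offset + 2] << 8)
    if operand not in operand_table:
        operand_table.append(operand)
    index = operand_table.index(operand)
    if index > 0xFF:
        raise ValueError("more than 256 distinct operands")
    rom[offset:offset + 3] = bytes([BRK, code, index])


# Example: "sta $2006" followed by "lda $2002"
rom = bytearray([0x8D, 0x06, 0x20, 0xAD, 0x02, 0x20])
table = []
patch_with_brk(rom, 0, table)
patch_with_brk(rom, 3, table)
print(rom.hex(), [hex(x) for x in table])
```

The ROM keeps its length, and the BRK handler can recover both the original operation and its operand from the two parameter bytes.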
The interrupt vectors must then point to an area where we have nothing on the nes. From here it goes to native mode and it changes bank to execute the IO port or jump emulation code (which only checks that the jump address was known at conversion time).
The code in ram could be a native mode change plus a jsl (jump subroutine long) going to another bank.
All the emulation mode interrupts will need to be handled the same way, because the Super Mario data goes right up to the interrupt vectors and leaves no room for the native vectors.
Case 2: the recompiled code must be at known addresses.
The jump engine calls must be at the same offset as in the original prg rom. Its address will be indicated in the indirect jumps file, and once it is known every call to it will be correct.
But this might not work for other games.
Case 3: a mix between the two, where every routine call is kept at its original address. The assembler must copy the emulation code into the blank spaces left over where the data was. I assume it won't be possible.
Case 1 seems to be mandatory. I do not know if it can be done from bank 1 in emulation mode because interrupts seem to be only in bank 0?
Will it be possible to boot on the bank0 and process interrupts from bank 1 or 2 while in the CPU emulation mode?
Bank 0: patched PRG ROM
Bank 1: original PRG ROM for data
Bank 2: init and IO snes code
It boots on bank zero where it goes to code installed in a data 'hole' and then to the nes init vector:
Code:
free@:
    sei
    ; native mode
    clc
    xce
    jml SnesInit        ; long jump to the other bank
Code:
EndSnesInit:
    sec
    xce                 ; emulation mode
    jmp NesInitVector
And the BRK code will work on the same principle, jumping to bank 2 in native mode and going back to bank 0 for an RTS in emulation mode.
The other solution, if I want to run the nes prg from another bank, is to replace the rti instructions with custom code going back to the proper bank. I found out that all 256 65C816 instructions are available in emulation mode; maybe that can help.
Having the patched prg rom in bank 1 or 2 would make interrupts easier (Reset, NMI and BRK). I prefer this solution, but I must find a clean way to 'rti' back to the original code once in bank 0.
Edit: When an interrupt occurs in emulation mode from a PRG bank other than bank 0, the original bank is lost. According to the Programmer's Manual, a long jump to the rti in the original bank will work, restoring the bank number.
And for BRK, the rti will return 2 bytes after the BRK opcode, hence on the last "parameter" byte. Therefore a pull of registers and a long jump to the saved return address + 1 should do it. The tricky part is doing it while in emulation mode.
I tried this code
Code:
    sec
    xce                 ; Emulation
    jml next
.ENDS

.BANK 3
.ORG 0
.SECTION "Other"
next:
And the "Absolute Indirect Long Addressing" jump could be used to come back from a BRK while in emulation.
This looks promising: it would allow having the SMB1 patched PRG in bank 1 or 2 while running the emulation code in bank 0.
Bank 0: init and IO snes code (executed in native mode, no interrupts in native mode)
Bank 1: patched PRG ROM (executed in emulation mode, interrupts go to the bank 0 vector routines which return to bank 1)
Bank 2: original PRG ROM for data
Maybe I should use a replacement like this: a jsr stalongjump going to a data area of the code, with a jml placed there that goes to the first bank and then returns to the jsr area.
Therefore: emulation code in bank one, patched PRG in bank 2, and original PRG in bank 3.
Maybe I do not need two PRG banks: I could use one patched PRG bank with jumps to code in the snes ram instead of a BRK #Code, NOP.
The patch would replace lda $2000 with something like jsr $08A0 ($08A0 in ram). And the ram area would contain the binary codes for something like:
Code:
    php, pha, lda #signaturecode, jml staticRoutineAddress
And staticRoutineAddress will call the proper emulation routine given the signaturecode in A.
The return would be made with a long return prepared on the stack: the return address is pulled and pushed back with the bank added before the RTL.
Each signaturecode must have its own stub, so it will use a lot of ram, more than 200 bytes. But it will fit in wram.
This binary file would be created by upernes.exe, stored in the last bank and loaded at boot.
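A minimal Python sketch of how such a stub file could be generated. The 8-byte stub layout (php, pha, lda #signature, jml long), the one-stub-per-signature addressing from $08A0, and every function name here are assumptions for illustration; the example address $028000 for staticRoutineAddress is made up.

```python
JSR_ABS = 0x20   # jsr abs
PHP     = 0x08
PHA     = 0x48
LDA_IMM = 0xA9   # lda #imm (8-bit A in emulation mode)
JML     = 0x5C   # jml long
STUB_SIZE = 8    # php + pha + lda #xx + jml ll hh bb


def build_stub(signature, routine_addr24):
    """Machine code for: php / pha / lda #signature / jml routine_addr24."""
    return bytes([PHP, PHA, LDA_IMM, signature, JML,
                  routine_addr24 & 0xFF,
                  (routine_addr24 >> 8) & 0xFF,
                  (routine_addr24 >> 16) & 0xFF])


def build_stub_table(count, routine_addr24):
    """Concatenate one stub per signature code, ready to copy to WRAM."""
    return b"".join(build_stub(sig, routine_addr24) for sig in range(count))


def stub_address(base, signature):
    """WRAM address of the stub for a signature code (base is e.g. $08A0)."""
    return base + signature * STUB_SIZE


def patch_with_jsr(rom, offset, target):
    """Overwrite a 3-byte instruction with jsr target, keeping the size."""
    rom[offset:offset + 3] = bytes([JSR_ABS, target & 0xFF, target >> 8])


# Example: replace "lda $2000" with a jsr to stub 0 at $08A0
rom = bytearray([0xAD, 0x00, 0x20])
patch_with_jsr(rom, 0, stub_address(0x08A0, 0))
table = build_stub_table(26, 0x028000)
print(rom.hex(), len(table))
```

With 8 bytes per stub, 26 signature codes already take 208 bytes, which matches the rough "more than 200 bytes" estimate.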
However SMB and Balloon Fight do not disable interrupts.
Does anyone have advice on using code in ram?
This should solve the JumpEngine problem for SMB.
Edit: Balloon Fight works, a bit slower but I can optimise later.
Code:
MoveAllSpritesOffscreen:
    ldy #$00            ;this routine moves all sprites off the screen
    .db $2c             ;BIT instruction opcode
MoveSpritesOffscreen:
    ldy #$04            ;this routine moves all but sprite 0
    lda #$f8            ;off the screen
But when I disassemble this, my program gets confused when it finds a jump to the MoveSpritesOffscreen label, because it is like jumping into the middle of an instruction.
I should use an alternative path to handle it. However because I patch the ROM, it is not critical.
I will assume that every label after this bit instruction is not an indirect jump or an access to an IO port. And just write a warning on the console.
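A minimal Python sketch of the warning check, assuming a plain linear sweep and a tiny opcode length table that covers only the bytes in this example (a real disassembler needs the full 6502 table). The base address $8000 and the function names are made up for illustration.

```python
# Instruction lengths for the few opcodes used below (subset only).
LENGTHS = {
    0xA0: 2,  # ldy #imm
    0xA9: 2,  # lda #imm
    0x2C: 3,  # bit abs
    0x60: 1,  # rts
}


def instruction_starts(code, base):
    """Addresses where a linear sweep from base finds an opcode."""
    starts = set()
    pc = 0
    while pc < len(code):
        starts.add(base + pc)
        pc += LENGTHS[code[pc]]
    return starts


def mid_instruction_targets(code, base, targets):
    """Jump targets that fall inside an instruction of the linear sweep."""
    starts = instruction_starts(code, base)
    return sorted(t for t in targets
                  if base <= t < base + len(code) and t not in starts)


# ldy #$00 / .db $2c / ldy #$04 / lda #$f8, at a hypothetical $8000
code = bytes([0xA0, 0x00, 0x2C, 0xA0, 0x04, 0xA9, 0xF8])
for addr in mid_instruction_targets(code, 0x8000, {0x8000, 0x8003}):
    print("warning: label at $%04X is inside an instruction" % addr)
```

Here the sweep decodes $2C $A0 $04 as a BIT, so the label at offset 3 is reported and can be kept out of the patching pass.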
I'm still an idiot when it comes to makefiles, but I gave it a shot, and this is the error I get:
Code:
mingw32-make: *** No rule to make target 'init.asm', needed by 'init.o'. Stop.
If you happen to have Windows 10, I've found Bash on Windows to work quite well in general.
I use Msys2. It is heavily scripted: the makefile is called by convert.sh and the file names are passed through the environment. The makefile takes $(ROM_NAME) as a parameter.
The script copies the asm files from ../asm/, then calls make, and finally deletes the asm files. This is why calling make directly does not find init.asm.
Convert.sh from the source/workdir/ directory should do all the work. You should find a 'romname'.fig in the directory. There is no need to call make after convert.sh. It is simpler like this, because otherwise I was sometimes editing the wrong files and compiling a mess from different roms (like running Balloon Fight with the CHR data of SMB1).
I am glad you are following the project, Memblers, because I must add your APU emulator to it.
Now I must find a way to update the nametable data quickly enough. I use a 4KB ram buffer to store the nametable data and I copy it using DMA during VBlank. This is slow. I would prefer to update it when the IO port is accessed on the nes, but on the snes it does not seem to work outside VBlank. Plus, the ram method is clean. But I must find a way to update only what changed (even Donkey Kong is slow). Maybe with an 8-bit bitmask per line x 30 lines x 2 banks, which is 60 bytes of bitmask, and take advantage of the DMA.
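A minimal Python sketch of that bitmask idea, assuming each bit covers 4 consecutive tiles of a 32-tile line (8 bits x 4 tiles), that the attribute bytes are not tracked, and that adjacent dirty bits are merged into one run so each run could become a single DMA transfer. All names are hypothetical.

```python
ROWS = 30            # tile lines per nametable
COLS = 32            # tiles per line
TILES_PER_BIT = 4    # 32 tiles / 8 bits


def new_mask():
    """One byte per line, for 2 nametables: 60 bytes."""
    return bytearray(ROWS * 2)


def mark_dirty(mask, nametable, offset):
    """Record a write at a tile offset (0..959) inside a nametable."""
    if offset >= ROWS * COLS:
        return  # attribute table bytes are not tracked in this sketch
    row, col = divmod(offset, COLS)
    mask[nametable * ROWS + row] |= 1 << (col // TILES_PER_BIT)


def pop_dirty_runs(mask):
    """Return (nametable, row, first_col, length) runs and clear the mask."""
    runs = []
    for i, bits in enumerate(mask):
        nametable, row = divmod(i, ROWS)
        bit = 0
        while bit < 8:
            if bits & (1 << bit):
                start = bit
                while bit < 8 and bits & (1 << bit):
                    bit += 1
                runs.append((nametable, row,
                             start * TILES_PER_BIT,
                             (bit - start) * TILES_PER_BIT))
            else:
                bit += 1
        mask[i] = 0
    return runs


mask = new_mask()
mark_dirty(mask, 0, 0)    # nametable 0, line 0, tile 0
mark_dirty(mask, 0, 5)    # nametable 0, line 0, tile 5
mark_dirty(mask, 1, 33)   # nametable 1, line 1, tile 1
print(pop_dirty_runs(mask))
```

During VBlank only the returned runs would be copied, instead of the whole 4KB buffer.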