All times are UTC - 7 hours





PostPosted: Sun Sep 09, 2018 5:17 pm 

Joined: Fri Mar 02, 2018 12:22 am
Posts: 7
Hello!

I've been working on my emulator, Nintendoish, for the last 10 months or so.

Progress is going great! I've retargeted the emulator to be an iOS emulator. While there are Windows/Mac build targets, they don't have great UI and are mostly meant for ease of development. The main user experience for the emu is definitely as an iOS app.

I currently pass all of Blargg's PPU tests (including ppu_vbl_nmi) and all of his CPU tests except for the last test in cpu_interrupts_v2. I have to thank everyone on this forum for my progress. Even though I haven't posted many questions directly, by searching this forum I've almost always been able to find an answer.

My main problem right now is that the only way I can get the emu to pass ppu_vbl_nmi, get through Battletoads level 2, and keep the status bar from shaking in Bart vs. the Space Mutants is through dirty PPU hacks: delay VBlank a couple of cycles here, delay sprite 0 hit a couple of cycles there, etc.

Obviously that's not ideal. I want a hack-free PPU if possible.

My question is this: Is it possible to get those timings right without implementing a CPU that executes cycle-by-cycle microcodes? Has anyone successfully written an emulator that executes a whole instruction in one step, runs the PPU for that instruction's cycle count * 3, repeats with the next instruction, and still gets through Battletoads, doesn't shake Bart's status bar, and passes the PPU timing tests without PPU hacks?

If not, then it sounds like I should probably rewrite my CPU, because I really do want to get those timings right without dirty PPU hacks if possible.

For what it's worth, currently my CPU looks like this:
1. Load an instruction. (But don't execute)
2. Step the PPU as many times as the loaded instruction timing requires * 3.
3. Poll for interrupts and set them to pending.
4. Execute the loaded instruction.
5. Execute any pending interrupts.
6. Step the PPU for any additional cycles incurred during execution (e.g. crossed pages).
7. Goto Step 1.
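
The loop above can be sketched roughly as follows. This is only an illustration, not Nintendoish's actual code: `Instruction`, `Ppu`, and `run` are made-up names, interrupt handling is left out, and the 3 PPU dots per CPU cycle assumes NTSC. It records the PPU time that each instruction "sees" when it executes, which shows that every memory access inside one instruction lands on the same PPU dot.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    name: str
    base_cycles: int       # cycles known at decode time
    extra_cycles: int = 0  # cycles only discovered while executing (page cross)


class Ppu:
    def __init__(self):
        self.dots = 0      # PPU cycles elapsed so far

    def step(self):
        self.dots += 1


def run(program, ppu):
    """Run each instruction atomically; return the PPU time each one saw."""
    seen = []
    for instr in program:                        # 1. load, don't execute
        for _ in range(instr.base_cycles * 3):   # 2. 3 PPU dots per CPU cycle (NTSC)
            ppu.step()
        # 3. interrupt polling would go here (omitted in this sketch)
        seen.append((instr.name, ppu.dots))      # 4. execute: all accesses share this PPU time
        # 5. pending interrupts would be serviced here (omitted in this sketch)
        for _ in range(instr.extra_cycles * 3):  # 6. cycles found during execution
            ppu.step()
    return seen


program = [Instruction("NOP", 2), Instruction("LDA abs,X", 4, extra_cycles=1)]
ppu = Ppu()
print(run(program, ppu))   # [('NOP', 6), ('LDA abs,X', 18)]
print(ppu.dots)            # 21
```

Note that the extra page-cross cycle is only accounted for after the instruction has already touched memory, so the read in `LDA abs,X` is stamped at dot 18 rather than dot 21.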

Thank you!


PostPosted: Sun Sep 09, 2018 6:53 pm 
Formerly Fx3

Joined: Fri Nov 12, 2004 4:59 pm
Posts: 3155
Location: Brazil
There was a discussion here.


PostPosted: Tue Sep 11, 2018 6:29 am 

Joined: Mon Dec 29, 2014 1:46 pm
Posts: 822
Location: New York, NY
drewying wrote:
Is it possible to get those timings right without implementing a CPU that executes cycle-by-cycle microcodes? Has anyone successfully written an emulator that executes a whole instruction in one step, runs the PPU for that instruction's cycle count * 3, repeats with the next instruction, and still gets through Battletoads, doesn't shake Bart's status bar, and passes the PPU timing tests without PPU hacks?


FCEUX operates like that, but it's full of hacks to make things work. What's worse is that you won't be able to reproduce those hacks. It took many, many years of incremental tuning until FCEUX was able to play virtually all games, and that tuning was the effort of many, many contributors. As an individual emulator developer, if you want to make something that can run all games, you'll have to strive for the highest accuracy. It's the only practical path to that goal. In other words, introduce microcodes.


PostPosted: Tue Sep 11, 2018 7:41 pm 
Formerly Fx3

Joined: Fri Nov 12, 2004 4:59 pm
Posts: 3155
Location: Brazil
I remember Disch discussing queueing and dequeueing PPU events after a certain number of CPU cycles. Did he vanish from the forum... too?


PostPosted: Wed Sep 12, 2018 12:31 pm 

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 4105
For accuracy, you will need to have all the weird quirks happen.

Dummy Reads:
* Dummy Read when correcting the high byte of an address
* Dummy Read in read-modify write instructions
For example, Ironsword relies on a dummy read to acknowledge an APU interrupt; without it, the game crashes on bootup.

Cycle penalties:
* Cycle Penalty when ABS+X/Y crosses to a new high byte
Without the cycle penalty, Battletoads will shake the screen.
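
As a rough illustration of both points, here is a sketch of an absolute,X read and an absolute,X read-modify-write expressed as bus accesses, one per CPU cycle. The `Bus` class and function names are hypothetical, and memory is a flat array here; in a real emulator the reads would be routed to PPU/APU registers, which is where the dummy read becomes visible.

```python
class Bus:
    """Hypothetical flat-memory bus that logs one entry per CPU cycle."""

    def __init__(self):
        self.mem = bytearray(0x10000)
        self.trace = []

    def read(self, addr):
        self.trace.append(("R", addr))
        return self.mem[addr]

    def write(self, addr, value):
        self.trace.append(("W", addr))
        self.mem[addr] = value & 0xFF


def _abs_x(bus, pc, x):
    bus.read(pc)                                   # opcode fetch
    lo = bus.read((pc + 1) & 0xFFFF)
    hi = bus.read((pc + 2) & 0xFFFF)
    uncorrected = (hi << 8) | ((lo + x) & 0xFF)    # high byte not yet fixed up
    correct = ((hi << 8) + lo + x) & 0xFFFF
    return uncorrected, correct


def lda_abs_x(bus, pc, x):
    """LDA abs,X: 4 cycles, or 5 when the index carries into the high byte."""
    uncorrected, correct = _abs_x(bus, pc, x)
    if uncorrected != correct:
        bus.read(uncorrected)                      # dummy read = the penalty cycle
    return bus.read(correct)


def inc_abs_x(bus, pc, x):
    """INC abs,X: always 7 cycles on the NMOS 6502."""
    uncorrected, correct = _abs_x(bus, pc, x)
    bus.read(uncorrected)                          # dummy read on every execution
    value = bus.read(correct)
    bus.write(correct, value)                      # old value is written back first
    bus.write(correct, value + 1)


bus = Bus()
bus.mem[0:3] = bytes([0xBD, 0xFF, 0x20])           # LDA $20FF,X
lda_abs_x(bus, 0, 1)
print([(kind, hex(addr)) for kind, addr in bus.trace])
# [('R', '0x0'), ('R', '0x1'), ('R', '0x2'), ('R', '0x2000'), ('R', '0x2100')]
```

In the example, the page cross produces a fifth cycle that reads $2000 before the real read of $2100. That extra access is what a register with read side effects would notice.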

There are other quirks too.

Then, to eliminate most scroll-shaking issues, scroll writes need to take effect as the write cycle of the instruction ends. This is usually the 4th cycle of an absolute write. Other I/O writes need to happen at that time too.
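
A minimal sketch of that timing, assuming a bus that advances the PPU by 3 dots (NTSC) at the start of every CPU cycle and applies the access at the end of it. The `Ppu`, `Bus`, and `sta_abs` names are invented for the example, and whether to tick before or after the access is a design choice, not something settled here.

```python
class Ppu:
    def __init__(self):
        self.dot = 0
        self.scroll_writes = []        # (dot at which the write landed, value)

    def tick(self):
        self.dot += 1

    def write_register(self, addr, value):
        if addr == 0x2005:             # scroll register
            self.scroll_writes.append((self.dot, value))


class Bus:
    def __init__(self, ppu):
        self.ppu = ppu
        self.ram = bytearray(0x800)

    def _clock(self):
        for _ in range(3):             # one CPU cycle = 3 PPU dots (NTSC)
            self.ppu.tick()

    def read(self, addr):
        self._clock()
        return self.ram[addr & 0x7FF] if addr < 0x2000 else 0

    def write(self, addr, value):
        self._clock()
        if addr < 0x2000:
            self.ram[addr & 0x7FF] = value & 0xFF
        elif addr < 0x4000:            # PPU registers are mirrored every 8 bytes
            self.ppu.write_register(0x2000 | (addr & 7), value)


def sta_abs(bus, pc, a):
    """STA abs: 3 read cycles, then the write on the 4th cycle."""
    bus.read(pc)                       # cycle 1: opcode
    lo = bus.read(pc + 1)              # cycle 2: address low
    hi = bus.read(pc + 2)              # cycle 3: address high
    bus.write((hi << 8) | lo, a)       # cycle 4: the write itself


ppu = Ppu()
bus = Bus(ppu)
bus.ram[0:3] = bytes([0x8D, 0x05, 0x20])   # STA $2005
sta_abs(bus, 0, 0x42)
print(ppu.scroll_writes)                   # [(12, 66)]
```

The write is stamped at dot 12, i.e. at the end of the fourth CPU cycle, without any per-register delay in the PPU.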

Also, "Microcode" isn't the right term, as the 6502 is a hardwired processor that does not use microcode.

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


PostPosted: Wed Sep 12, 2018 2:13 pm 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20770
Location: NE Indiana, USA (NTSC)
Dwedit wrote:
Also, "Microcode" isn't the right term, as the 6502 is a hardwired processor that does not use microcode.

I agree that more is hardwired in the 6502's unstructured decode logic than in some of its contemporaries. But the 6502 does contain 130 lines of decode ROM.


PostPosted: Wed Sep 12, 2018 2:24 pm 

Joined: Mon Dec 29, 2014 1:46 pm
Posts: 822
Location: New York, NY
Dwedit wrote:
"Microcode" isn't the right term, as the 6502 is a hardwired processor that does not use microcode.


This topic has come up in the past and you're right, of course. But, for lack of a better term, "microcode" conveys the concept.


PostPosted: Sat Sep 15, 2018 9:37 pm 

Joined: Fri Mar 02, 2018 12:22 am
Posts: 7
Thanks for the input, guys. Looks like "microcodes" are indeed my next step if I want to go any further in my accuracy quest. :)


PostPosted: Sun Sep 16, 2018 6:07 am 

Joined: Sun Sep 19, 2004 10:59 pm
Posts: 1440
Note that you don't need to be able to pause the CPU mid-instruction - you can still have it execute entire instructions at a time, just as long as you advance the PPU and APU by the appropriate amounts each time the CPU performs a single memory access.

The "simplest" way is to just model all CPU instructions as sequences of memory reads and writes (including dummy reads/writes, as Dwedit noted) and have your "ReadMem"/"WriteMem" functions run the PPU and APU by a single CPU cycle, but you can optimize things further by just queueing up those cycles and "catching them up" whenever some special interaction would be triggered (e.g. the CPU accessing an I/O register, or the PPU/APU generating an interrupt which you can predict fairly easily).
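
The "catch up" variant might look something like the sketch below. Everything here is an assumption made for illustration: the `LazyPpu` and `Bus` names, the $2000-$401F I/O window, and the 3:1 NTSC ratio. The APU is left out, and a real implementation would also need to catch up at predicted NMI/IRQ times and at the end of each frame.

```python
class LazyPpu:
    def __init__(self):
        self.dot = 0
        self.catch_ups = 0             # how many times the PPU actually had to run

    def run_until(self, target_dot):
        if target_dot > self.dot:
            self.dot = target_dot      # a real PPU would step dot by dot here
            self.catch_ups += 1


class Bus:
    IO_START, IO_END = 0x2000, 0x4020  # PPU/APU/controller registers (assumed window)

    def __init__(self, ppu):
        self.ppu = ppu
        self.cpu_cycles = 0
        self.ram = bytearray(0x800)

    def _cycle(self, addr):
        self.cpu_cycles += 1           # queue up one CPU cycle
        if self.IO_START <= addr < self.IO_END:
            # Special interaction: bring the PPU up to date before the access.
            self.ppu.run_until(self.cpu_cycles * 3)

    def read(self, addr):
        self._cycle(addr)
        return self.ram[addr & 0x7FF] if addr < 0x2000 else 0

    def write(self, addr, value):
        self._cycle(addr)
        if addr < 0x2000:
            self.ram[addr & 0x7FF] = value & 0xFF


ppu = LazyPpu()
bus = Bus(ppu)
for _ in range(5):
    bus.read(0x0000)                   # plain RAM accesses: PPU stays untouched
print(ppu.dot)                         # 0
bus.read(0x2002)                       # register access forces a catch-up
print(ppu.dot, ppu.catch_ups)          # 18 1
```

The CPU still runs whole instructions at a time; the only change is that the PPU is synchronized lazily, at the points where its state can actually be observed or changed.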

_________________
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.

