Cycle-by-cycle CPU

zeroone
Posts: 934
Joined: Mon Dec 29, 2014 1:46 pm
Location: New York, NY

Cycle-by-cycle CPU

Post by zeroone » Mon Mar 23, 2015 9:05 am

The longest instruction takes 8 CPU cycles. If the CPU processes each instruction atomically, then it can be out of sync with the PPU by as much as +/- 24 PPU cycles (3 tiles, at 8 PPU cycles per tile). This might explain why it is difficult to properly emulate games that switch the nametable mid-scanline, such as Marble Madness, without introducing an NMI delay hack.

Would a CPU that processes instructions cycle-by-cycle, doing the memory reads/writes on exactly the right cycles, actually resolve this? Would it perform well?

Alternatively, the PPU could be modified to read from the nametables at the last possible moment, just before the tile is drawn, as opposed to preparing shift registers with the values. I'm not sure what the side effects of such a change would be.

thefox
Posts: 3141
Joined: Mon Jan 03, 2005 10:36 am
Location: Tampere, Finland

Re: Cycle-by-cycle CPU

Post by thefox » Mon Mar 23, 2015 10:16 am

zeroone wrote: Would a CPU that processes instructions cycle-by-cycle, doing the memory reads/writes on exactly the right cycles, actually resolve this? Would it perform well?
Nintendulator does this.

tepples
Posts: 22055
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Cycle-by-cycle CPU

Post by tepples » Mon Mar 23, 2015 10:21 am

You don't have to process them, so long as the appropriate components receive the writes at the correct times. For example, for STA $2007, you don't have to actually turn the CPU into an explicit state machine and switch back and forth between the CPU and PPU after each CPU cycle, but you do have to make sure that the write occurs three CPU cycles after the start of the instruction. You can even just save all PPU and APU writes as a list of (address, data, timestamp) structs and then play them back several cycles later if that'll speed things up. (See Catch-up.)
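
The timestamped write list described above could look roughly like the following. This is a minimal sketch, not how any particular emulator does it: `BusWrite`, `WriteLog`, and `ToyPPU` are hypothetical names, the 3:1 dot ratio assumes NTSC, and the toy PPU only records when a write reached it instead of rendering anything.

```python
from collections import deque
from dataclasses import dataclass

PPU_DOTS_PER_CPU_CYCLE = 3  # assumption: NTSC ratio


@dataclass
class BusWrite:
    address: int
    data: int
    timestamp: int  # CPU cycle count at which the write reaches the bus


class ToyPPU:
    """Hypothetical stand-in for a PPU; it only tracks time and logs writes."""

    def __init__(self):
        self.dot = 0    # PPU dots elapsed so far
        self.seen = []  # (dot, address, data) for each write applied

    def run_until(self, target_dot):
        # A real PPU would render here; the sketch only advances time.
        if target_dot > self.dot:
            self.dot = target_dot

    def write_register(self, address, data):
        self.seen.append((self.dot, address, data))


class WriteLog:
    def __init__(self):
        self.pending = deque()

    def record(self, address, data, timestamp):
        self.pending.append(BusWrite(address, data, timestamp))

    def catch_up(self, ppu, cpu_cycle_now):
        # Replay logged writes in order, advancing the PPU to each
        # write's timestamp before applying it.
        while self.pending and self.pending[0].timestamp <= cpu_cycle_now:
            w = self.pending.popleft()
            ppu.run_until(w.timestamp * PPU_DOTS_PER_CPU_CYCLE)
            ppu.write_register(w.address, w.data)
        ppu.run_until(cpu_cycle_now * PPU_DOTS_PER_CPU_CYCLE)


# STA $2007 starting at CPU cycle 100: the write lands three cycles later.
log = WriteLog()
ppu = ToyPPU()
instruction_start = 100
log.record(0x2007, 0x3F, instruction_start + 3)
log.catch_up(ppu, cpu_cycle_now=instruction_start + 4)
```

The CPU can then run whole instructions at a time, and the PPU still sees each write at the dot where it would have arrived on hardware. Reads with side effects are not covered by this sketch.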

James
Posts: 429
Joined: Sat Jan 22, 2005 8:51 am
Location: Chicago, IL

Re: Cycle-by-cycle CPU

Post by James » Mon Mar 23, 2015 3:39 pm

In practice, opcode-level granularity works fine for NES emulation. There are a few games that require cycle-level accuracy for NMI timing. I don't recall what they are, off the top of my head, but I don't think that Marble Madness is one of them. In any event, you can account for this without resorting to hacks: keep track of accumulated CPU cycles and when an NMI is triggered, you'll know whether it should be handled immediately or be delayed. We're only talking about a difference of a few CPU cycles, though. If you have to delay 50 cycles to get Marble Madness to work, something else is wrong.
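
One way to picture the cycle bookkeeping suggested above is the sketch below. It is a simplified model under stated assumptions: the function name is hypothetical, and `poll_offset` (how many cycles before the end of an instruction the CPU samples its interrupt lines) is a tunable guess here, not a documented constant.

```python
def nmi_serviced_after_current(instr_start, instr_cycles, nmi_cycle, poll_offset=1):
    """Return True if an NMI asserted at nmi_cycle is taken right after
    the instruction that began at instr_start, False if it has to wait
    for the following instruction.

    poll_offset is an assumption of this sketch, not a hardware fact.
    """
    poll_cycle = instr_start + instr_cycles - poll_offset
    return nmi_cycle < poll_cycle


# A 4-cycle instruction starting at CPU cycle 100 polls at cycle 103 here.
early = nmi_serviced_after_current(100, 4, 101)  # asserted before the poll
late = nmi_serviced_after_current(100, 4, 103)   # asserted at/after the poll
```

With instruction-level stepping, the emulator only needs the cycle at which the PPU raised NMI and the start and length of the instruction that was executing to make this decision.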

zeroone
Posts: 934
Joined: Mon Dec 29, 2014 1:46 pm
Location: New York, NY

Re: Cycle-by-cycle CPU

Post by zeroone » Mon Mar 23, 2015 5:26 pm

tepples wrote: You don't have to process them, so long as the appropriate components receive the writes at the correct times. For example, for STA $2007, you don't have to actually turn the CPU into an explicit state machine and switch back and forth between the CPU and PPU after each CPU cycle, but you do have to make sure that the write occurs three CPU cycles after the start of the instruction. You can even just save all PPU and APU writes as a list of (address, data, timestamp) structs and then play them back several cycles later if that'll speed things up. (See Catch-up.)
Memory writes generally appear to occur on the final cycle of the instruction. Doesn't that effectively mean just running the CPU ahead of the PPU by one instruction?
James wrote: In practice, opcode-level granularity works fine for NES emulation. There are a few games that require cycle-level accuracy for NMI timing. I don't recall what they are, off the top of my head, but I don't think that Marble Madness is one of them. In any event, you can account for this without resorting to hacks: keep track of accumulated CPU cycles and when an NMI is triggered, you'll know whether it should be handled immediately or be delayed. We're only talking about a difference of a few CPU cycles, though. If you have to delay 50 cycles to get Marble Madness to work, something else is wrong.
I found that to get the text boxes to display properly in Marble Madness and to prevent level 2 of Battletoads from freezing, I have to delay the NMI by 7 CPU cycles.

Bisqwit
Posts: 248
Joined: Fri Oct 14, 2011 1:09 am

Re: Cycle-by-cycle CPU

Post by Bisqwit » Fri Apr 03, 2015 9:22 am

But reads can also have side effects, such as when reading from $2007 or $2002. And they don't always happen on the final cycle... When you get into RMW opcodes, you will have two writes by the same instruction on consecutive cycles.
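
To make the RMW case concrete, here is a small sketch that lists the bus accesses of INC with absolute addressing on an NMOS 6502. The function name and the sample values are illustrative assumptions; cycles are numbered from 1 within the instruction.

```python
def inc_absolute_bus_trace(pc, address, old_value):
    """List (cycle, kind, address, value) bus accesses for INC absolute."""
    new_value = (old_value + 1) & 0xFF
    return [
        (1, "read", pc, 0xEE),                # opcode fetch (INC abs is $EE)
        (2, "read", pc + 1, address & 0xFF),  # operand low byte
        (3, "read", pc + 2, address >> 8),    # operand high byte
        (4, "read", address, old_value),      # read the target
        (5, "write", address, old_value),     # write back the unmodified value
        (6, "write", address, new_value),     # write the incremented value
    ]


# Illustrative values only; old_value stands for whatever the read returned.
trace = inc_absolute_bus_trace(pc=0x8000, address=0x2007, old_value=0x41)
writes = [(cycle, addr, val) for cycle, kind, addr, val in trace if kind == "write"]
```

The target sees a read on cycle 4 and then writes on cycles 5 and 6, so a model that applies a single access at the end of the instruction would miss two of the three.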