CPU / PPU Timing with PPUDATA
-
- Posts: 17
- Joined: Fri Mar 18, 2016 3:59 am
CPU / PPU Timing with PPUDATA
Hi everybody
I'm currently trying to develop an FPGA implementation of a NES and I need your help understanding the timing between the CPU and the PPU.
Let's say I write to PPUDATA (sta $2007) with PPUADDR at $00 and PPUCTRL.2 at 0: the CPU puts $2007 on the address bus and asserts the chip-select pin of the PPU. Since the PPU runs 3 times faster than the CPU, the /CS pin is low for 3 PPU cycles. But if I read from VRAM on every PPU cycle and increment PPUADDR according to PPUCTRL.2 whenever /CS is low, I get totally wrong behavior (an increment by 3). How does the actual PPU manage to increment PPUADDR only once when the signals on the address bus and the /CS pin are held for more than 1 cycle?
Am I right that every 3rd rising edge of the PPU clock is in sync with the rising edge of the CPU clock?
Greetings
Chris
Re: CPU / PPU Timing with PPUDATA
Let's start off with something first: the NES was not intentionally set to run in a unified clock domain. Everything was designed using standard 1980s asynchronous logic, and it just happens to be something that can be thought of as synchronous behavior on edges.
Feuerwerk42 wrote: Let's say I write to PPUDATA (sta $2007) with PPUADDR at $00 and PPUCTRL.2 at 0: the CPU puts $2007 on the address bus and asserts the chip-select pin of the PPU.
Pedantically, it's the 74'139 on the mainboard that drives the PPU's chip select.
Feuerwerk42 wrote: Since the PPU runs 3 times faster than the CPU, the /CS pin is low for 3 PPU cycles.
The PPU's /CS pin includes M2, so it's only low for 1⅞ PPU cycles. Alternatively, it's low for 7½ master clock cycles.
Feuerwerk42 wrote: But if I read from VRAM on every PPU cycle and increment PPUADDR according to PPUCTRL.2 whenever /CS is low, I get totally wrong behavior (an increment by 3).
It doesn't do the read on every edge of the pixel clock. Nor is it doing the read on every edge of the master clock.
For your purposes, you can think that it does the read on the rising edge of the PPU's /CS.
You should try playing around with Visual2C02. In this case, a write to PPUDATA drives node "/w2007" low, which starts an FSM (starting with node "write_2007_trigger").
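To make the "act on the rising edge of /CS" advice concrete, here is a minimal Python sketch comparing a level-triggered increment with an edge-triggered one. The function names and the sample list are made up for illustration, and the three low samples simply mirror the scenario from the original question rather than real hardware timing. It is a toy model, not how the 2C02 is actually wired.

```python
# Hypothetical model: /CS is sampled once per PPU clock (1 = idle, 0 = selected).
# Function names are illustrative only.

def level_triggered(cs_samples, addr=0, step=1):
    """Naive approach: bump the address on every sample where /CS is low."""
    for cs_n in cs_samples:
        if cs_n == 0:
            addr += step
    return addr


def edge_triggered(cs_samples, addr=0, step=1):
    """Bump the address once per access, on the rising edge of /CS (0 -> 1)."""
    prev = 1  # /CS idles high
    for cs_n in cs_samples:
        if prev == 0 and cs_n == 1:
            addr += step
        prev = cs_n
    return addr


# One register access during which /CS happens to be sampled low three times.
samples = [1, 0, 0, 0, 1, 1]
print(level_triggered(samples))  # 3
print(edge_triggered(samples))   # 1
```

In an FPGA the same idea usually ends up as a small synchronizer followed by an edge detector, since /CS is asynchronous to whatever clock the design samples it with.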
Re: CPU / PPU Timing with PPUDATA
Hi lidnariq
lidnariq wrote: Let's start off with something first: the NES was not intentionally set to run in a unified clock domain. Everything was designed using standard 1980s asynchronous logic, and it just happens to be something that can be thought of as synchronous behavior on edges.
You're right, that clears up some of my misunderstanding of the relation between CPU and PPU.
lidnariq wrote: The PPU's /CS pin includes M2, so it's only low for 1⅞ PPU cycles. Alternatively, it's low for 7½ master clock cycles.
lidnariq wrote: You should try playing around with Visual2C02. In this case, a write to PPUDATA drives node "/w2007" low, which starts an FSM (starting with node "write_2007_trigger").
OK, Visual2C02 and I are still not friends, but now I understand reading and writing to the I/O ports a bit better.
I found both nodes and can see them change, but I can't figure out from the diagram what happens as a consequence.
Wow, I never thought that it takes 6 PPU cycles to read from VRAM. But that raises another big question for me:
The I/O ports of the PPU are accessed by the CPU like normal RAM (e.g. LDA $2007).
In CPU cycle 1 the address $2007 is put on the address bus, so address 7 arrives at the PPU.
About one half-cycle later, /CS at the PPU goes low (triggered by the 74'139), which starts the FSM that reads from VRAM (if I understand Visual2C02 correctly).
The CPU now expects the value read from VRAM at the end of CPU cycle 2. That gives the PPU about 3 PPU cycles, or 12 master cycles, to read from VRAM and put the requested value on the data bus in time. But Visual2C02 shows me that it takes 6 PPU cycles. How can that fit together?
Greetings
Chris
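The cycle counts quoted so far can be cross-checked with a few lines of arithmetic. This sketch assumes the usual NTSC dividers (master clock / 12 for the CPU, master clock / 4 for the PPU), which is what the "3 PPU cycles or 12 master cycles" and "1⅞ cycles = 7½ master clocks" figures above imply. The constant and function names are invented for the example.

```python
from fractions import Fraction

# Assumed NTSC dividers: master / 12 = CPU clock, master / 4 = PPU clock.
MASTER_PER_CPU = 12
MASTER_PER_PPU = 4


def master_to_ppu(master_clocks):
    """Convert a duration in master clocks to PPU cycles as an exact fraction."""
    return Fraction(master_clocks) / MASTER_PER_PPU


def ppu_to_cpu(ppu_cycles):
    """Convert a duration in PPU cycles to CPU cycles as an exact fraction."""
    return Fraction(ppu_cycles) * MASTER_PER_PPU / MASTER_PER_CPU


ppu_per_cpu = Fraction(MASTER_PER_CPU, MASTER_PER_PPU)
cs_low_ppu = master_to_ppu(Fraction(15, 2))  # /CS low for 7.5 master clocks
vram_read_cpu = ppu_to_cpu(6)                # the 6 PPU cycles seen in Visual2C02

print(ppu_per_cpu)    # 3
print(cs_low_ppu)     # 15/8
print(vram_read_cpu)  # 2
```

Under these assumptions a 6-PPU-cycle VRAM fetch spans two full CPU cycles, so it cannot be finished within the single bus cycle in which the CPU reads $2007.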
Re: CPU / PPU Timing with PPUDATA
Reading from VRAM is delayed by an entire instruction. Reads go into some sort of "buffer", then come out on the next read. Reads will first return whatever is sitting on the bus (usually the contents of the last read, or something that was just written to some PPU register), then the next read will return the requested value.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Re: CPU / PPU Timing with PPUDATA
Dwedit wrote: Reading from VRAM is delayed by an entire instruction. Reads go into some sort of "buffer", then come out on the next read. Reads will first return whatever is sitting on the bus (usually the contents of the last read, or something that was just written to some PPU register), then the next read will return the requested value.
Reads from $2007 come from a dedicated 8-bit latch (inbuf0..7 in Visual2C02), while reads from any write-only register retrieve whatever's on the internal data bus. It's writes to $2007 that float on the bus until the PPU is ready to write them to VRAM.
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.
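For anyone modelling this in software first, the buffered $2007 read described above can be sketched roughly as follows. The class and attribute names are hypothetical, the latch is assumed to start at 0, palette addresses are ignored, and nothing here is cycle-accurate. The increment of 1 or 32 is the value selected by PPUCTRL.2.

```python
class PpuDataPort:
    """Toy model of buffered $2007 reads; not cycle-accurate."""

    def __init__(self, vram, increment=1):
        self.vram = vram            # hypothetical backing store (list of bytes)
        self.addr = 0               # current VRAM address (set via PPUADDR)
        self.increment = increment  # 1 or 32, selected by PPUCTRL.2
        self.read_buffer = 0        # the dedicated 8-bit latch (inbuf0..7)

    def read_2007(self):
        """Return the old latch contents, then refill the latch from VRAM."""
        value = self.read_buffer
        self.read_buffer = self.vram[self.addr % len(self.vram)]
        self.addr += self.increment
        return value


vram = [0x11, 0x22, 0x33, 0x44]
port = PpuDataPort(vram)
first = port.read_2007()   # stale latch contents (0 in this model)
second = port.read_2007()  # 0x11, the byte addressed by the first read
```

This is why code on the CPU side typically does one throwaway read of $2007 after setting the address before it trusts the data.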
Re: CPU / PPU Timing with PPUDATA
Didn't realize there was an actual latch for this, thanks.
Re: CPU / PPU Timing with PPUDATA
EDITED: fixed the order! The CPU read gets the last latched value!
EDIT 2: fixed the latch concept, thanks Q.
I wasn't understanding Q in a first moment... then I got it.

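Code:
// CPU read from $2007:
value = latch_2007           // return the previously latched byte
latch_2007 = Read($2007)     // then refill the latch from VRAM
-------
// Write-only registers:
Write($200x)                 // latch_200x gets the last value written
Read($200x):
value = latch_200x           // a read returns that last value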
Last edited by Zepper on Fri Mar 25, 2016 6:13 am, edited 2 times in total.
Re: CPU / PPU Timing with PPUDATA
Ahhh, that clears a lot.
Thank you guys for your help.

Re: CPU / PPU Timing with PPUDATA
Zepper wrote: EDITED: fixed the order! The CPU read gets the last latched value!
I wasn't understanding Q in a first moment... then I got it.
Code:
value = latch
latch = Read($2007)
-------
Read($200x) //read and throw it away.
value = latch

To clarify, the $2007 read latch is only returned when you actually read from $2007 - if you read any other write-only register, you get a different "latch" which is actually just whatever values were sitting on the PPU's internal data bus (and is generally the same as the last value you read/wrote from/to any of the other "valid" registers).
Quietust, QMT Productions
Re: CPU / PPU Timing with PPUDATA
Quietust wrote: To clarify, the $2007 read latch is only returned when you actually read from $2007 - if you read any other write-only register, you get a different "latch" which is actually just whatever values were sitting on the PPU's internal data bus (and is generally the same as the last value you read/wrote from/to any of the other "valid" registers).
Yes, I edited my original post. According to my notes, probably from blargg:
Code:
PPU read
Addr    Open-bus bits
        7654 3210
- - - - - - - - - - - - - - - -
$2000   DDDD DDDD
$2001   DDDD DDDD
$2002   ---D DDDD
$2003   DDDD DDDD
$2004   ---- ----
$2005   DDDD DDDD
$2006   DDDD DDDD
$2007   ---- ----   non-palette
        DD-- ----   palette
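One way to use the table in an emulator or testbench is to turn each row into a bit mask. The sketch below assumes "D" means the bit is not driven by the register and comes back from the internal data-bus latch discussed above, and "-" means the register drives the bit. The names OPEN_BUS_MASK, open_bus_mask and cpu_read are invented for this example.

```python
# Masks transcribed from the table: a set bit corresponds to a "D"
# (assumed here to mean "taken from the bus latch, not driven by the register").
OPEN_BUS_MASK = {
    0x2000: 0xFF,
    0x2001: 0xFF,
    0x2002: 0x1F,
    0x2003: 0xFF,
    0x2004: 0x00,
    0x2005: 0xFF,
    0x2006: 0xFF,
}


def open_bus_mask(addr, palette=False):
    """Return the open-bus mask for a PPU register read."""
    if addr == 0x2007:
        return 0xC0 if palette else 0x00
    return OPEN_BUS_MASK[addr]


def cpu_read(addr, driven, bus_latch, palette=False):
    """Combine the bits the register drives with the bits left over from the latch."""
    mask = open_bus_mask(addr, palette)
    return (driven & ~mask & 0xFF) | (bus_latch & mask)


status = cpu_read(0x2002, driven=0x80, bus_latch=0x3F)  # 0x9F
ctrl = cpu_read(0x2000, driven=0x00, bus_latch=0xAB)    # 0xAB, pure open bus
```

For $2002 this keeps the three status bits from the register and fills the low five bits from the latch, which matches the `---D DDDD` row.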