Manual OAM write glitchyness
There was some discussion about manual $2004 writes being glitchy. I did more testing and it seems that it's only $2003 writes (SPRADDR) that are glitchy; merely changing SPRADDR often corrupts OAM. That's fine if you're about to update all of OAM, but it causes havoc when writing test code. I've found a reliable approach is to only use $2004 reads and writes for test code:
* Do initial $2003 write to set address.
* Write to OAM, e.g. 256 $2004 writes, either manually or via $4014
* Read OAM by reading $2004, then writing that back to $2004 to increment address, 256 times
Each step leaves the address at whatever it started as, avoiding the need for a $2003 write.
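A minimal sketch of why each step leaves the address where it started, using a toy software model rather than real hardware. The `OamModel`, `fill` and `read_all` names are made up for illustration, and the model assumes $2004 reads do not increment the address (as is the case outside rendering) while $2004 writes do:

```python
class OamModel:
    """Toy model: 256-byte OAM plus an 8-bit address register.
    Hypothetical helper; no $2003 corruption or missing bits are modeled."""

    def __init__(self):
        self.mem = bytearray(256)
        self.addr = 0

    def write_2003(self, value):
        self.addr = value & 0xFF

    def write_2004(self, value):
        self.mem[self.addr] = value & 0xFF
        self.addr = (self.addr + 1) & 0xFF  # writes increment the address

    def read_2004(self):
        return self.mem[self.addr]  # assumed: reads do not increment


def fill(oam, data):
    """Write 256 bytes through $2004; the address wraps back to its start."""
    for b in data:
        oam.write_2004(b)


def read_all(oam):
    """Read each byte, then write it back to advance the address, 256 times."""
    out = bytearray()
    for _ in range(256):
        v = oam.read_2004()
        oam.write_2004(v)
        out.append(v)
    return bytes(out)
```

Because both loops perform exactly 256 increments on an 8-bit address, the register ends where it began, so only the single initial $2003 write is needed.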
I have some code that puts a random value in $2003, fills OAM with random bytes, then reads them back to verify. After masking off the missing bits in the third byte of every sprite, they consistently match on all four CPU-PPU alignments (just ran 250 iterations for each of the four alignments and they all passed).
So this might at least narrow down the OAM funkiness we've encountered in the past. I'm going to try some more tests to be sure $2004 doesn't have any weirdness in any alignment.
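The masking step mentioned above could look something like the following sketch. It assumes the missing bits are bits 2-4 of each sprite's attribute byte (mask $E3), and that the comparison has to account for the starting OAM address, since the first byte written lands at whatever $2003 was set to. The function names are illustrative only:

```python
ATTR_MASK = 0xE3  # assumed: bits 2-4 of the attribute byte are not stored in OAM


def mask_unimplemented(data, start=0):
    """Clear the missing bits in every byte that lands on a sprite's third byte.

    `start` is the OAM address that data[0] was written to."""
    return bytes(
        b & ATTR_MASK if ((start + i) & 3) == 2 else b
        for i, b in enumerate(data)
    )


def oam_matches(written, read_back, start=0):
    """Compare what was written with what was read, ignoring the missing bits."""
    return mask_unimplemented(written, start) == mask_unimplemented(read_back, start)
```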
Last edited by blargg on Wed Jun 19, 2013 9:44 pm, edited 2 times in total.
Re: Manual OAM write glitchyness
That's very interesting. Seems almost obvious now that you said it, given that pretty much all "broken" applications write to $2003, then only a couple of bytes to $2004. The simplest explanation tends to be the correct one.
Now if we could figure out which bytes a $2003 write corrupts, and how, we may be able to find a safe way to do fewer than 256 $2004 writes for applications that don't want to use OAM DMA.
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi
Re: Manual OAM write glitchyness
Wonder if Quietust can work out or find this particular behaviour in Visual 2A03, to narrow down exactly what the behaviour is.
Re: Manual OAM write glitchyness
I have some notes about what happens on a $2003 write. My idea is that there's some refresh logic in OAM that is continually reading/writing a chunk of it ($10 bytes? $20 bytes? I forget). When you write to $2003, it writes the chunk at the wrong place.
I just made a random test that runs 65536 iterations. Each one, it randomly chooses one of these actions:
* write a random value to $2003 and then do a DMA with random data (to overwrite any corruption from the $2003 write)
* do a DMA with random data
* write a random value to $2004
* read from $2004
When doing these it also simulates what should happen, and verifies that $2004 reads match and also periodically verifies that its simulated OAM matches actual OAM. This reminds me of the triangle linear counter test that did similar things. Did this for all four alignments (took about 15 minutes total). All passed. So this pretty much verifies that $2004 and $4014 behave in a sane manner, just $2003 corrupting OAM.
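The structure of such a randomized test can be sketched in Python as below. This is not the actual test ROM: `OamModel`, `dump` and `run_random_test` are hypothetical names, the `device` argument stands in for real hardware, and the model assumes DMA behaves like 256 consecutive $2004 writes and that bits 2-4 of each attribute byte are not stored:

```python
import random


class OamModel:
    """Reference simulation of what OAM is expected to do (no corruption)."""

    def __init__(self):
        self.mem = bytearray(256)
        self.addr = 0

    def write_2003(self, value):
        self.addr = value & 0xFF

    def write_2004(self, value):
        if self.addr & 3 == 2:
            value &= 0xE3  # assumed missing bits in the attribute byte
        self.mem[self.addr] = value & 0xFF
        self.addr = (self.addr + 1) & 0xFF

    def read_2004(self):
        return self.mem[self.addr]

    def dma(self, page):
        for b in page:  # assumed: $4014 acts like 256 $2004 writes
            self.write_2004(b)


def dump(oam):
    """Read all of OAM using only $2004 reads and write-backs."""
    out = bytearray()
    for _ in range(256):
        v = oam.read_2004()
        oam.write_2004(v)
        out.append(v)
    return bytes(out)


def run_random_test(device, iterations, seed=0, verify_every=256):
    rng = random.Random(seed)
    model = OamModel()
    for target in (device, model):  # bring both to a known state
        target.write_2003(0)
        target.dma(bytes(256))
    for i in range(iterations):
        action = rng.randrange(4)
        if action in (0, 1):
            if action == 0:  # $2003 write, always followed by a DMA
                a = rng.randrange(256)
                device.write_2003(a)
                model.write_2003(a)
            page = bytes(rng.randrange(256) for _ in range(256))
            device.dma(page)
            model.dma(page)
        elif action == 2:
            v = rng.randrange(256)
            device.write_2004(v)
            model.write_2004(v)
        elif device.read_2004() != model.read_2004():
            return False
        if i % verify_every == verify_every - 1 and dump(device) != dump(model):
            return False
    return True
```

Calling `run_random_test(OamModel(), 65536)` only checks the harness against itself; the interesting case is passing an object that talks to real hardware or to an emulator core.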
Re: Manual OAM write glitchyness
Seems that a $2003 write basically does this:
* Take old value from $2003 and AND it with $F8
* Read 8 bytes from OAM starting at this masked value
* Write them starting at $XX in OAM, where $XX is the high byte of the PPU register written to ($20-$3F) masked with $F8
* Use new value written to $2003 as OAM address
So if $2003 was 0 and you wrote $3C to $2003, it'd copy $00-$07 to $20-$27 in OAM, then set the address to $3C. If you then wrote $00 to $2F03, it'd copy $38-$3F to $28-$2F, then set the address back to $00.
But this is just for the "preferred" CPU-PPU alignment. For another, I get totally different corruptions at portions of OAM related to the new value written. It's probably using a different value to write the 8-byte chunk to OAM.
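For reference, the four steps listed above for the "preferred" alignment can be written as a small Python sketch. The function name and signature are made up for illustration, and it only models the behaviour as described in this post, not the other alignments:

```python
def write_2003_glitchy(oam, old_addr, new_addr, reg_high_byte=0x20):
    """Model a $2003 write on the "preferred" CPU-PPU alignment.

    oam           -- bytearray(256) holding OAM contents, modified in place
    old_addr      -- OAM address before the write
    new_addr      -- value being written to $2003
    reg_high_byte -- high byte of the register address used ($20-$3F)
    Returns the new OAM address.
    """
    src = old_addr & 0xF8            # source row: old address masked to 8 bytes
    dst = reg_high_byte & 0xF8       # destination row: register high byte masked
    chunk = bytes(oam[src:src + 8])  # copy first so overlapping rows are safe
    oam[dst:dst + 8] = chunk
    return new_addr & 0xFF
```

With OAM initialised so that each byte holds its own address, writing $3C via $2003 from address $00 leaves bytes $00-$07 in $20-$27, and then writing $00 via $2F03 leaves $38-$3F in $28-$2F, matching the example in the post.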
- rainwarrior
- Posts: 8732
- Joined: Sun Jan 22, 2012 12:03 pm
- Location: Canada
- Contact:
Re: Manual OAM write glitchyness
Don't forget this thread: "Just how cranky is the PPU OAM?" viewtopic.php?f=9&t=9912
One thing about writing all of OAM via $2004 is that in my tests it seemed that I needed to get it done rather quickly (e.g. unrolled lda imm + sta) and get rendering turned on, or else it would take too long and the data would degrade. (Seemed like data lifetime was just barely long enough to reliably survive vblank.)
Re: Manual OAM write glitchyness
rainwarrior wrote: or else it would take too long and the data would degrade. (Seemed like data lifetime was just barely long enough to reliably survive vblank.)
Assuming the 2C07 is using the same technology and the same construction for the DRAM cells, it should have the same self-discharge time. Combined with thefox's comment in this thread about OAM evaluation automatically starting after 20 scanlines on the 2C07, that implies the maximum permissible time between refreshes for the OAM DRAM is between 1.3 ms (20 scanlines, but that's probably chosen because it's the same as the 2C02) and 4.5 ms (70 scanlines). I'd bet good odds that it's roughly 2 ms, given my research about the Intel C2116.
blargg wrote: So if $2003 was 0 and you wrote $3C to $2003, it'd copy $00-$07 to $20-$27 in OAM, then set the address to $3C. If you then wrote $00 to $2F03, it'd copy $38-$3F to $28-$2F, then set the address back to $00.
Huh. I wonder where the data bus (since the MSB of the address is fetched last) is incorrectly getting mixed in? Is there a little bit of a race on either the rising or falling edge of M2 or PPU /CE?
I bet the "copying" effect is due to the relative size of the capacitance of the DRAM cells (small) compared to the interconnect bus. Maybe pclk0 doesn't go high in between the original value being driven onto the interconnect, and swamps the desired value when the new address gates are enabled?
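The 1.3 ms and 4.5 ms bounds quoted earlier in this post can be reproduced from the PPU dot clocks. The sketch below assumes 341 dots per scanline, an NTSC master clock of 21.477272 MHz divided by 4, and a PAL master clock of 26.601712 MHz divided by 5; the constant and function names are illustrative:

```python
NTSC_PPU_HZ = 21_477_272 / 4  # 2C02 dot clock, about 5.369 MHz
PAL_PPU_HZ = 26_601_712 / 5   # 2C07 dot clock, about 5.320 MHz
DOTS_PER_SCANLINE = 341


def scanlines_to_ms(scanlines, ppu_hz):
    """Duration of a number of scanlines in milliseconds."""
    return scanlines * DOTS_PER_SCANLINE / ppu_hz * 1000


if __name__ == "__main__":
    print(f"20 lines NTSC: {scanlines_to_ms(20, NTSC_PPU_HZ):.2f} ms")
    print(f"20 lines PAL:  {scanlines_to_ms(20, PAL_PPU_HZ):.2f} ms")
    print(f"70 lines PAL:  {scanlines_to_ms(70, PAL_PPU_HZ):.2f} ms")
```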
Re: Manual OAM write glitchyness
lidnariq wrote: I bet the "copying" effect is due to the relative size of the capacitance of the DRAM cells (small) compared to the interconnect bus. Maybe pclk0 doesn't go high in between the original value being driven onto the interconnect, and swamps the desired value when the new address gates are enabled?
That sounds quite similar to what I described here, and I also wrote a test program to try and play with it more. In particular, it seemed that it wasn't just writing $2003 that did it, but waiting a while between writes to $2003, during which the values in the "destination" row could decay just enough to be overwritten by the values sitting on the bit lines (initialized from the "source" row).
I'm not quite sure how the "high byte of the PPU register written to ($20-$3F)" could possibly be involved here, since the PPU I/O port would be inactive while the CPU is reading the instruction bytes from memory.
koitsu wrote: Wonder if Quietust can work out or find this particular behaviour in Visual 2A03, to narrow down exactly what the behaviour is.
I assume you meant the Visual 2C02, since this issue has nothing to do with the CPU.
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.
Re: Manual OAM write glitchyness
Not that kind of open bus
Quietust wrote: I'm not quite sure how the "high byte of the PPU register written to ($20-$3F)" could possibly be involved here, since the PPU I/O port would be inactive while the CPU is reading the instruction bytes from memory.
If the CPU drives A15-A13 before D7-D0, the 74LS139 is generating a chip enable for the PPU while the instruction byte is still sitting on the data bus.
koitsu wrote: Wonder if Quietust can work out or find this particular behaviour in Visual 2A03, to narrow down exactly what the behaviour is.
Quietust wrote: I assume you meant the Visual 2C02, since this issue has nothing to do with the CPU.
That depends on to what extent the Visual 2A03 simulates bus capacitance.
Re: Manual OAM write glitchyness
blargg wrote:
* Read 8 bytes from OAM starting at this masked value
* Write them starting at $XX in OAM, where $XX is the high byte of the PPU register written to ($20-$3F) masked with $F8
Does this have anything to do with the OAM corruption caused by disabling rendering mid-frame? In that situation, the corruption is also 8 bytes long, isn't it?
- rainwarrior
Re: Manual OAM write glitchyness
lidnariq wrote: Assuming the 2C07 is using the same technology and the same construction for the DRAM cells, it should have the same self-discharge time. Combine that with thefox's comment in this thread about OAM evaluation automatically starting after 20 scanlines on the 2C07, that implies the maximum time between refreshes permissible for the OAM DRAM is between 1.3ms (20 scanlines, but that's probably chosen because it's the same as the 2C02) and 4.5ms (70 scanlines). I'd bet good odds that it's roughly 2ms, given my research about the Intel C2116.
I didn't figure out the precise time (and I presume there is a temperature factor I haven't even touched on), but 2 ms was still too long in my tests with the NTSC PPU (i.e. enough cycles to LDA,X+STA+INX+BNE 256 times). So 2 ms is an upper bound, and theoretically 1.3 ms is the lower bound (unless there's some unknown refresh behaviour during vblank).
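As a rough check of the timing above, the cycle counts of the two copy styles can be worked out in a few lines. This sketch assumes standard 6502 timings with no page crossings (LDA abs,X = 4, STA abs = 4, INX = 2, BNE taken = 3, LDA #imm = 2) and an NTSC CPU clock of 21.477272 MHz / 12; the function names are made up:

```python
NTSC_CPU_HZ = 21_477_272 / 12  # about 1.789773 MHz


def loop_cycles(n=256):
    """LDA abs,X / STA $2004 / INX / BNE loop, assuming no page crossings."""
    per_iteration = 4 + 4 + 2 + 3
    return n * per_iteration - 1  # the final BNE falls through (2 cycles, not 3)


def unrolled_cycles(n=256):
    """Unrolled LDA #imm / STA $2004 pairs."""
    return n * (2 + 4)


def cycles_to_ms(cycles, cpu_hz=NTSC_CPU_HZ):
    return cycles / cpu_hz * 1000


if __name__ == "__main__":
    print(f"loop:     {loop_cycles()} cycles, {cycles_to_ms(loop_cycles()):.2f} ms")
    print(f"unrolled: {unrolled_cycles()} cycles, {cycles_to_ms(unrolled_cycles()):.2f} ms")
```

Under these assumptions the loop takes a little under 1.9 ms, while the unrolled version finishes in under 0.9 ms, which is consistent with the unrolled approach fitting comfortably inside the 1.3 ms lower bound.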
Re: Manual OAM write glitchyness
BTW, do these results affect the earlier OAM readback tests (viewtopic.php?f=2&t=6424) because OAM readback requires the OAM address to be rewritten for each byte that is read?
Re: Manual OAM write glitchyness
I recently wrote a test program that would write 1-15 bytes to OAM_DATA in NMI, but only when the joypad changes. (And never touching OAM_ADDR, relying on its being reset at the end of rendering)
Currently, Nestopia assumes that if you start rendering with OAM_ADDR at 8 or more, sprites 0 and 1 are temporarily replaced with the pair that was pointed to. This does not seem to be the case: it appears to be the same copying behavior we've seen above when OAM_ADDR is manually changed.
Re: Manual OAM write glitchyness
I released the test program over in this thread...
And after having taken and annotated a few photographs (nominally to help FHorse implement this, but in practice to just explain to myself the difference in behavior), I figured I already went to the effort and should include them here:
First I configure it to upload these 14 bytes (34 56 78 9a bc de f0 00 54 32 10 ec a8 64):
[photo]
And then I shorten it to 7 bytes:
[photo]