All times are UTC - 7 hours





Posted: Wed Jun 19, 2013 8:34 pm
Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
There was some discussion about manual $2004 writes being glitchy. I did more testing and it seems that it's only $2003 writes (SPRADDR) that are glitchy; merely changing SPRADDR often corrupts OAM. That's fine if you're about to update all of OAM, but it causes havoc when writing test code. I've found a reliable approach is to only use $2004 reads and writes for test code:

* Do initial $2003 write to set address.
* Write to OAM, e.g. 256 $2004 writes, either manually or via $4014.
* Read OAM by reading $2004 and then writing the value back to $2004 to increment the address, 256 times.

Each step leaves the address at whatever it started as, avoiding the need for a $2003 write.
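
A minimal sketch of the three steps above, as an idealized model rather than hardware-verified code. The class and function names (`OamPort`, `fill`, `read_back`) are hypothetical, and the model assumes only what the procedure relies on: a $2004 write increments the address, a $2004 read does not, and the address wraps after 256 bytes.

```python
class OamPort:
    """Idealized OAM: 256 bytes plus an address register (no $2003 glitch modeled)."""

    def __init__(self, addr=0):
        # The initial $2003 write is represented by the constructor argument.
        self.mem = bytearray(256)
        self.addr = addr & 0xFF

    def write_2004(self, value):
        # A $2004 write stores the byte and increments the address (wraps at 256).
        self.mem[self.addr] = value & 0xFF
        self.addr = (self.addr + 1) & 0xFF

    def read_2004(self):
        # Assumed: a $2004 read returns the byte without incrementing the address.
        return self.mem[self.addr]


def fill(port, data):
    """Write all 256 bytes through $2004."""
    for b in data:
        port.write_2004(b)


def read_back(port):
    """Read each byte, then write it back so the address advances."""
    out = bytearray()
    for _ in range(256):
        b = port.read_2004()
        out.append(b)
        port.write_2004(b)
    return bytes(out)
```

Because each pass touches exactly 256 bytes, the address ends where it started, so no further $2003 write is needed between passes.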

I have some code that puts a random value in $2003, fills OAM with random bytes, then reads them back to verify. After masking off the missing bits in every third byte of a sprite, the data consistently matches on all four CPU-PPU alignments (I just ran 250 iterations for each of the four alignments and they all passed).
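
The masked comparison could look something like the sketch below. It assumes the missing bits are bits 2-4 of each sprite's third byte (mask $E3); the thread does not state the mask, so treat `ATTR_MASK` and the helper names as assumptions. The `start` parameter accounts for the data having been written from a random OAM address.

```python
ATTR_MASK = 0xE3  # assumption: bits 2-4 of byte 2 in each sprite entry do not exist


def mask_oam(data, start=0):
    """Clear the assumed-missing bits in byte 2 of every 4-byte sprite entry.

    data[i] is taken to live at OAM address (start + i) & 0xFF.
    """
    return bytes(
        b & ATTR_MASK if ((start + i) & 3) == 2 else b
        for i, b in enumerate(data)
    )


def oam_matches(written, read, start=0):
    """Compare what was written with what was read back, ignoring missing bits."""
    return mask_oam(written, start) == mask_oam(read, start)
```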

So this might at least narrow down the OAM funkiness we've encountered in the past. I'm going to try some more tests to be sure $2004 doesn't have any weirdness in any alignment.


Last edited by blargg on Wed Jun 19, 2013 9:44 pm, edited 2 times in total.

Posted: Wed Jun 19, 2013 9:23 pm
Joined: Mon Jan 03, 2005 10:36 am
Posts: 2963
Location: Tampere, Finland
That's very interesting. Seems almost obvious now that you said it, given that pretty much all "broken" applications write to $2003, then only a couple of bytes to $2004. The simplest explanation tends to be the correct one.

Now if we could figure out which bytes the $2003 write corrupts and how, we may be able to find a safe way to do fewer than 256 $2004 writes for the applications that don't want to use the OAM DMA.

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: kkfos.aspekt.fi


Posted: Wed Jun 19, 2013 9:26 pm
Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Wonder if Quietust can work out or find this particular behaviour in Visual 2A03, to narrow down exactly what the behaviour is.


Posted: Wed Jun 19, 2013 9:46 pm
Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
I have some notes about what happens on a $2003 write. My idea is that there's some refresh logic in OAM that is continually reading/writing a chunk of it ($10 bytes? $20 bytes? I forget). When you write to $2003, it writes the chunk at the wrong place.

I just made a random test that runs 65536 iterations. Each one, it randomly chooses one of these actions:
* write a random value to $2003 and then do a DMA with random data (to overwrite any corruption from the $2003 write)
* do a DMA with random data
* write a random value to $2004
* read from $2004

While doing these it also simulates what should happen, verifies that $2004 reads match, and periodically verifies that its simulated OAM matches actual OAM. This reminds me of the triangle linear counter test that did similar things. I ran this for all four alignments (it took about 15 minutes total). All passed. So this pretty much verifies that $2004 and $4014 behave in a sane manner, and that only $2003 writes corrupt OAM.
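
A rough sketch of how such a randomized test could be structured. This is not the actual test ROM: `ModelOam` and `fuzz` are hypothetical names, the `device` argument stands in for real hardware, the $E3 attribute mask is an assumption, and the periodic full-OAM comparison is left out for brevity.

```python
import random


class ModelOam:
    """Reference model of what $2003/$2004/$4014 should do, with no corruption."""

    def __init__(self):
        self.mem = bytearray(256)
        self.addr = 0

    def write_2003(self, value):
        self.addr = value & 0xFF

    def write_2004(self, value):
        self.mem[self.addr] = value & 0xFF
        self.addr = (self.addr + 1) & 0xFF

    def read_2004(self):
        # Assumed: byte 2 of each sprite reads back with bits 2-4 clear.
        mask = 0xE3 if (self.addr & 3) == 2 else 0xFF
        return self.mem[self.addr] & mask

    def dma(self, page):
        # Modeled as 256 consecutive $2004 writes.
        for b in page:
            self.write_2004(b)


def fuzz(device, iterations, seed=0):
    """Apply random actions to device and model; return the number of read mismatches."""
    rng = random.Random(seed)
    model = ModelOam()
    targets = (device, model)

    def random_page():
        return bytes(rng.randrange(256) for _ in range(256))

    # Put both sides in a known state first.
    page = random_page()
    for t in targets:
        t.write_2003(0)
        t.dma(page)

    mismatches = 0
    for _ in range(iterations):
        action = rng.randrange(4)
        if action == 0:
            # $2003 write followed by a DMA that overwrites any corruption.
            addr, page = rng.randrange(256), random_page()
            for t in targets:
                t.write_2003(addr)
                t.dma(page)
        elif action == 1:
            page = random_page()
            for t in targets:
                t.dma(page)
        elif action == 2:
            value = rng.randrange(256)
            for t in targets:
                t.write_2004(value)
        elif device.read_2004() != model.read_2004():
            mismatches += 1
    return mismatches
```

Running `fuzz(ModelOam(), 65536)` only checks the harness against itself; on real hardware, `device` would be replaced by whatever drives the actual registers.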


Posted: Wed Jun 19, 2013 10:59 pm
Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
Seems that a $2003 write basically does this:

* Take old value from $2003 and AND it with $F8
* Read 8 bytes from OAM starting at this masked value
* Write them starting at $XX in OAM, where $XX is the high byte of the PPU register written to ($20-$3F) masked with $F8
* Use new value written to $2003 as OAM address

So if $2003 was 0 and you wrote $3C to $2003, it'd copy $00-$07 to $20-$27 in OAM, then set the address to $3C. If you then wrote $00 to $2F03, it'd copy $38-$3F to $28-$2F, then set the address back to $00.

But this is just for the "preferred" CPU-PPU alignment. For another, I get totally different corruptions at portions of OAM related to the new value written. It's probably using a different value to write the 8-byte chunk to OAM.
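
The copy rule described above, for the "preferred" alignment only, can be written as a small model. This is a sketch of the observed behaviour as stated in this post, not a confirmed description of the hardware; `write_oamaddr` is a hypothetical helper name.

```python
def write_oamaddr(oam, old_addr, new_value, reg_addr=0x2003):
    """Model a $2003 write (or one of its mirrors) on the preferred alignment.

    oam       -- bytearray of 256 OAM bytes, modified in place
    old_addr  -- OAM address before the write
    new_value -- value written to the register
    reg_addr  -- CPU address used for the write ($2003, $2F03, ...)
    Returns the new OAM address.
    """
    src = old_addr & 0xF8          # old address masked with $F8
    dst = (reg_addr >> 8) & 0xF8   # high byte of the register address, masked
    chunk = bytes(oam[src:src + 8])
    oam[dst:dst + 8] = chunk       # 8-byte copy that causes the corruption
    return new_value & 0xFF


oam = bytearray(range(256))
addr = write_oamaddr(oam, 0x00, 0x3C, 0x2003)  # copies $00-$07 to $20-$27
addr = write_oamaddr(oam, addr, 0x00, 0x2F03)  # copies $38-$3F to $28-$2F
```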


Posted: Thu Jun 20, 2013 12:17 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
Don't forget this thread: "Just how cranky is the PPU OAM?" http://forums.nesdev.com/viewtopic.php?f=9&t=9912

One thing about writing all of OAM via $2004 is that in my tests it seemed that I needed to get it done rather quickly (e.g. unrolled lda imm + sta) and get rendering turned on, or else it would take too long and the data would degrade. (Seemed like data lifetime was just barely long enough to reliably survive vblank.)


Posted: Thu Jun 20, 2013 1:11 am
Joined: Sun Apr 13, 2008 11:12 am
Posts: 6303
Location: Seattle
rainwarrior wrote:
or else it would take too long and the data would degrade. (Seemed like data lifetime was just barely long enough to reliably survive vblank.)
Assuming the 2C07 is using the same technology and the same construction for the DRAM cells, it should have the same self-discharge time. Combine that with thefox's comment in this thread about OAM evaluation automatically starting after 20 scanlines on the 2C07, that implies the maximum time between refreshes permissible for the OAM DRAM is between 1.3ms (20 scanlines, but that's probably chosen because it's the same as the 2C02) and 4.5ms (70 scanlines). I'd bet good odds that it's roughly 2ms, given my research about the Intel C2116.
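
For reference, the scanline-to-time conversion behind those figures can be sketched as follows. The clock constants are not from this thread; they assume the usual PAL master clock of 26.601712 MHz divided by 5 for the 2C07, and 341 dots per scanline.

```python
PAL_MASTER_HZ = 26_601_712       # assumed PAL master clock
PAL_PPU_HZ = PAL_MASTER_HZ / 5   # assumed 2C07 dot clock
DOTS_PER_SCANLINE = 341


def scanlines_to_ms(lines, ppu_hz=PAL_PPU_HZ):
    """Duration of a number of scanlines, in milliseconds."""
    return lines * DOTS_PER_SCANLINE / ppu_hz * 1000


print(f"20 scanlines: {scanlines_to_ms(20):.2f} ms")
print(f"70 scanlines: {scanlines_to_ms(70):.2f} ms")
```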

blargg wrote:
So if $2003 was 0 and you wrote $3C to $2003, it'd copy $00-$07 to $20-$27 in OAM, then set the address to $3C. If you then wrote $00 to $2F03, it'd copy $38-$3F to $28-$2F, then set the address back to $00.
Huh. I wonder where the data bus (since the MSB of the address is fetched last) is incorrectly getting mixed in? Is there a little bit of race on either the rising or falling edge of M2 or PPU/CE?

I bet the "copying" effect is due to the relative size of the capacitance of the DRAM cells (small) compared to the interconnect bus. Maybe pclk0 doesn't go high in between the original value being driven onto the interconnect, and swamps the desired value when the new address gates are enabled?


Posted: Thu Jun 20, 2013 7:53 am
Joined: Sun Sep 19, 2004 10:59 pm
Posts: 1390
lidnariq wrote:
I bet the "copying" effect is due to the relative size of the capacitance of the DRAM cells (small) compared to the interconnect bus. Maybe pclk0 doesn't go high in between the original value being driven onto the interconnect, and swamps the desired value when the new address gates are enabled?

That sounds quite similar to what I described here, and I also wrote a test program to try and play with it more. In particular, it seemed that it wasn't just writing $2003 that did it, but waiting a while between writes to $2003, during which the values in the "destination" row could decay just enough to be overwritten by the values sitting on the bit lines (initialized from the "source" row).

I'm not quite sure how the "high byte of the PPU register written to ($20-$3F)" could possibly be involved here, since the PPU I/O port would be inactive while the CPU is reading the instruction bytes from memory.

koitsu wrote:
Wonder if Quietust can work out or find this particular behaviour in Visual 2A03, to narrow down exactly what the behaviour is.

I assume you meant the Visual 2C02, since this issue has nothing to do with the CPU.

_________________
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.


Posted: Thu Jun 20, 2013 8:05 am
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19122
Location: NE Indiana, USA (NTSC)
[Image: "Not that kind of open bus"]


Quietust wrote:
I'm not quite sure how the "high byte of the PPU register written to ($20-$3F)" could possibly be involved here, since the PPU I/O port would be inactive while the CPU is reading the instruction bytes from memory.

If the CPU drives A15-A13 before D7-D0, the 74LS139 is generating a chip enable for the PPU while the instruction byte is still sitting on the data bus.

Quote:
koitsu wrote:
Wonder if Quietust can work out or find this particular behaviour in Visual 2A03, to narrow down exactly what the behaviour is.

I assume you meant the Visual 2C02, since this issue has nothing to do with the CPU.

That depends on to what extent the Visual 2A03 simulates bus capacitance.


Posted: Thu Jun 20, 2013 8:26 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
blargg wrote:
* Read 8 bytes from OAM starting at this masked value
* Write them starting at $XX in OAM, where $XX is the high byte of the PPU register written to ($20-$3F) masked with $F8

Does this have anything to do with the OAM corruption caused by disabling rendering mid-frame? In that situation, the corruption is also 8 bytes long, isn't it?


Posted: Thu Jun 20, 2013 8:27 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
lidnariq wrote:
Assuming the 2C07 is using the same technology and the same construction for the DRAM cells, it should have the same self-discharge time. Combine that with thefox's comment in this thread about OAM evaluation automatically starting after 20 scanlines on the 2C07, that implies the maximum time between refreshes permissible for the OAM DRAM is between 1.3ms (20 scanlines, but that's probably chosen because it's the same as the 2C02) and 4.5ms (70 scanlines). I'd bet good odds that it's roughly 2ms, given my research about the Intel C2116.


I didn't figure out the precise time (and I presume there is a temperature factor I haven't even touched on), but 2ms was still too long in my tests with the NTSC PPU (i.e. enough cycles to LDA,X+STA+INX+BNE 256 times). So 2ms is an upper bound, and theoretically 1.3ms is the lower bound (unless there's some unknown refresh behaviour during vblank).
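
A back-of-the-envelope check of those timings. The cycle counts and the NTSC CPU clock are assumptions not stated in the thread: LDA abs,X = 4 (no page crossing), STA abs = 4, INX = 2, BNE taken = 3, LDA #imm = 2, and a CPU clock of 21.477272 MHz / 12.

```python
NTSC_CPU_HZ = 21_477_272 / 12  # assumed NTSC 2A03 clock, about 1.79 MHz


def cycles_to_ms(cycles, cpu_hz=NTSC_CPU_HZ):
    """Convert a CPU cycle count to milliseconds."""
    return cycles / cpu_hz * 1000


# Looped copy: LDA abs,X + STA $2004 + INX + BNE (taken), 256 times.
LOOP_CYCLES = (4 + 4 + 2 + 3) * 256
# Unrolled copy: LDA #imm + STA $2004, 256 times.
UNROLLED_CYCLES = (2 + 4) * 256

print(f"loop:     {cycles_to_ms(LOOP_CYCLES):.2f} ms")
print(f"unrolled: {cycles_to_ms(UNROLLED_CYCLES):.2f} ms")
```

Under these assumptions the loop comes out a little under 2ms and the unrolled version at under half of that, which is consistent with the unrolled writes surviving while the loop does not.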


Posted: Tue Jun 25, 2013 12:08 pm
Joined: Mon Jan 03, 2005 10:36 am
Posts: 2963
Location: Tampere, Finland
BTW, do these results affect the earlier OAM readback tests (viewtopic.php?f=2&t=6424) because OAM readback requires the OAM address to be rewritten for each byte that is read?



Posted: Tue Jul 09, 2013 12:50 am
Joined: Sun Apr 13, 2008 11:12 am
Posts: 6303
Location: Seattle
I recently wrote a test program that writes 1-15 bytes to OAM_DATA in NMI, but only when the joypad state changes. It never touches OAM_ADDR, relying on it being reset at the end of rendering.

Currently, Nestopia assumes that if you start rendering with OAM_ADDR at 8 or more, sprites 0 and 1 are temporarily replaced with the pair that was pointed to. This does not seem to be the case: it appears to be the same copying behavior we've seen above when OAM_ADDR is manually changed.


Posted: Sun May 18, 2014 7:20 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 6303
Location: Seattle
I released the test program over in this thread...
Having taken and annotated a few photographs (nominally to help FHorse implement this, but in practice just to explain the difference in behavior to myself), I figured I had already gone to the effort and should include them here:
First I configure it to upload these 14 bytes (34 56 78 9a bc de f0 00 54 32 10 ec a8 64):
[Attachment: hwE.png]
And then I shorten it to 7 bytes:
[Attachment: hw7.png]

