
All times are UTC - 7 hours





Posted: Sun Jan 13, 2019 6:39 am

Joined: Sat Nov 18, 2017 9:15 pm
Posts: 74
I've made some progress on the issue with tiles disappearing and have attached a test for this. What I've found on my hardware is that by loading a value from zero page in a loop, bit 7 of the ID of some sprite tiles can become set without any clear cause. Which tiles are affected, if any, depends on the value being loaded from memory.

In the attached test, the dpad can be used to adjust the value (up/down by #$10, left/right by #$01). The current value is displayed using background tiles. On my console+Everdrive, some multiples of #$10 (value #$20, for example) reliably cause an immediate change in tile ID. This is seen as a change in sprite color because tile IDs #$01 and #$81 are differently colored squares. All other sprite tiles are fully transparent in this version. The test works by having a tight loop where the value is loaded from zero page. The NMI handler does joypad reads, value adjustment, and background updates. OAM DMA for the diagonal line of sprites is only performed once, just before rendering is first enabled. The NMI doesn't seem to impact the OAM corruption; the glitching still occurs when setting the value to #$20 before the loop and never enabling NMIs.
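
To make the "bit 7 of the tile ID" symptom concrete, here is a minimal Python sketch (not part of the attached test ROM) that compares an expected OAM image against an observed one and reports sprites whose tile ID differs only in bit 7. The sprite positions and the helper names `build_oam` and `bit7_flips` are hypothetical; the only hardware fact assumed is the usual 4-byte OAM entry layout (Y, tile ID, attributes, X).

```python
# Each OAM entry is 4 bytes: Y, tile ID, attributes, X.
SPRITE_COUNT = 64

def build_oam(tile_id=0x01):
    """Hypothetical OAM image: a diagonal line of sprites sharing one tile."""
    oam = bytearray()
    for i in range(SPRITE_COUNT):
        oam += bytes([i * 3, tile_id, 0x00, i * 4])
    return oam

def bit7_flips(expected, observed):
    """Indices of sprites whose tile ID differs from expected only in bit 7."""
    flipped = []
    for i in range(SPRITE_COUNT):
        diff = expected[i * 4 + 1] ^ observed[i * 4 + 1]
        if diff == 0x80:
            flipped.append(i)
    return flipped

expected = build_oam()
observed = bytearray(expected)
observed[5 * 4 + 1] |= 0x80  # simulate sprite 5 changing from tile $01 to $81
print(bit7_flips(expected, observed))  # [5]
```

In the test itself the same information is read off the screen: a sprite that changes color has gone from tile #$01 to #$81.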

What I have not yet tested, but intend to, is whether the zero page address matters. In my development of this test while things were still very much in flux, I was unable to get glitching when loading from $00 (the test loads from $1A), loading from an absolute address, doing nothing, or writing to zero page, but did get glitching with indexed zero page reads. Consider those results tentative, however; I plan to do more testing on those now that I have the test in a good state.

thefox's test encountered this issue because the NMI incremented a zero page frame counter, which was being waited on outside the NMI for the majority of the frame. Tiles would disappear at different times because of the time it would take for the counter to reach the value that affects those tiles. The other issue I've seen in his test (2 tiles missing on reset) is also present in this test. I've looked into it only a little bit. I know that the tile IDs aren't getting corrupted because the tiles remain invisible when I ensure all sprite tiles are non-transparent. I also encountered some resets where the tiles were visible, but at incorrect positions.

Finally, I asked BMF54123 to run this on his PowerPak and Everdrive. He was unable to reproduce the issue at all on either of two frontloader consoles (E and G revisions, I believe) or AV Famicom, though he did see the 2 missing tiles on reset on both frontloaders with both carts and the Famicom with the Everdrive (but had trouble reproducing the behavior on Famicom+Everdrive when going back to it). Given that changing to another AC adapter fixed similar glitching I've seen with bit 7 getting set on lag frames, I asked BMF if he could try another, but all of his have new caps installed. I still don't know if the power supply is a red herring or not, but I do find it interesting that it didn't happen on any console he tried, but happened for me and (with thefox's test) the person who I clipped on Twitch.


That's what I've got for now. Not too sure how to proceed on the tile ID bit 7 corruption issue at this point.


Attachments:
oam_corruption_test.zip [3.16 KiB]
 
Posted: Mon Jan 14, 2019 2:57 am

Joined: Sat Nov 18, 2017 9:15 pm
Posts: 74
I decided to look into when reads need to be done in order to cause corruption. I first added a variable sprite 0 hit that controlled when the reads would begin and found that the corruption did not occur regardless of what visible scanline I started doing the reads on. I changed it to add a variable delay to the NMI handler to reduce the amount of vblank time available to the read loop and found that this does indeed control whether the corruption happens. The size of the read loop window within vblank and the density of reads in that window determine whether corruption occurs and which / how many tiles are affected.

I've attached a new version of the test. This contains two ROMs. oam_corruption_stress_test.nes is a stress test that reads a frame counter in a fairly unrolled loop (63 zero page loads per loop) for almost all of vblank. In my testing, most corruption is instantaneous, so the frame counter simply increments once per frame and should cause most vulnerable tiles to corrupt within a few seconds. oam_corruption_test_v2.nes uses the same unrolled read loop, but allows the value being read to be controlled by the user via the dpad as well as a delay (9 CPU cycles per unit) via B/A. The delay can be used to find the smallest vblank window capable of corruption.

Note that both of these ROMs pack loads in tighter than the first version of the test, so corruption should be more likely. In my testing, the lowest delay I can go without seeing any corruption at all is 97, which means we start the read loop at about scanline 255 dot 240 (give or take typical NMI jitter). That window is only about 1/4 of vblank.
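
The relationship between the delay value and where the read loop starts can be sketched as follows. This is a rough Python model, not a measurement: it takes the "delay 97 (read here as hex $97) starts at about scanline 255 dot 240" figure above as an anchor, uses the stated 9 CPU cycles per delay unit, and assumes NTSC timing (341 PPU dots per scanline, 3 dots per CPU cycle). It also treats vblank as scanlines 241 through 260, which is a simplification.

```python
DOTS_PER_SCANLINE = 341     # NTSC PPU
DOTS_PER_CPU_CYCLE = 3      # NTSC
CYCLES_PER_DELAY_UNIT = 9   # from the test description
ANCHOR_DELAY = 0x97         # assumption: the "97" above is hexadecimal
ANCHOR_POS = (255, 240)     # (scanline, dot) quoted for that delay
VBLANK_FIRST, VBLANK_LAST = 241, 260  # simplified vblank range

def loop_start(delay):
    """Estimated (scanline, dot) at which the read loop begins."""
    anchor_dots = ANCHOR_POS[0] * DOTS_PER_SCANLINE + ANCHOR_POS[1]
    step = CYCLES_PER_DELAY_UNIT * DOTS_PER_CPU_CYCLE
    return divmod(anchor_dots + (delay - ANCHOR_DELAY) * step, DOTS_PER_SCANLINE)

def vblank_fraction_left(delay):
    """Fraction of vblank still available to the read loop."""
    scanline, dot = loop_start(delay)
    end = (VBLANK_LAST + 1) * DOTS_PER_SCANLINE
    total = (VBLANK_LAST - VBLANK_FIRST + 1) * DOTS_PER_SCANLINE
    return (end - (scanline * DOTS_PER_SCANLINE + dot)) / total

print(loop_start(0x97))                      # (255, 240)
print(round(vblank_fraction_left(0x97), 2))  # 0.26
```

Under these assumptions each delay unit moves the start of the loop by 27 dots, and the remaining window at delay $97 works out to roughly a quarter of vblank, matching the estimate above.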


Edit: I got my hands on a Genesis model 1 AC adapter and gave that a try. Literally no difference; 97 is still the lowest I can go without corruption.


Attachments:
oam_corruption_test_v2.zip [6.14 KiB]
 
Posted: Mon Jan 14, 2019 11:42 am

Joined: Sun Apr 13, 2008 11:12 am
Posts: 8538
Location: Seattle
I cannot understand the physics of how you're getting these results.

Visual2C02 shows that the external data bus ("io_dbX") is only copied onto the internal data bus ("_dbX") when "_io_dbe" is asserted. "_io_dbe" is only asserted when the PPU is selected ("_io_ce" is low) and R/W is low.

"_io_ce" externally comes from the 74'139, and is only true while M2 and A13 are high and A14 and A15 are low. So I can see no mechanism by which access to just zero page would cause this: your code is executing from $Cxxx, so at no point should A13 even bounce high for a glitchy moment...
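
The decode described above can be written out as a small Python model. This is only a sketch of the logic as stated in this post (chip enable while M2 and A13 are high and A14 and A15 are low; external bus copied in only when the PPU is selected and R/W is low); the function names are made up for illustration and signal polarities are abstracted to booleans.

```python
def ppu_chip_enabled(addr, m2=True):
    """True when the 74'139 decode described above would select the PPU."""
    a13 = bool(addr & 0x2000)
    a14 = bool(addr & 0x4000)
    a15 = bool(addr & 0x8000)
    return m2 and a13 and not a14 and not a15

def external_bus_latched(addr, is_write, m2=True):
    """True when the external data bus would be copied onto the internal bus."""
    return ppu_chip_enabled(addr, m2) and is_write

# Zero page reads and code fetches from $Cxxx never select the PPU.
print(ppu_chip_enabled(0x001A))                      # False
print(ppu_chip_enabled(0xC123))                      # False
print(external_bus_latched(0x2004, is_write=True))   # True
```

In this model neither the $1A load nor any opcode fetch in $C000-$CFFF ever raises A13, which is why the observed corruption is hard to explain from the decode alone.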


 
Posted: Tue Jan 15, 2019 1:01 am

Joined: Sat Nov 18, 2017 9:15 pm
Posts: 74
Thanks for looking into this at all, though I was hoping it might make more sense to you! There's still a little more testing I can do on this, but I'm definitely reaching a point where I probably need to make a dev board to see how this behaves without a flash cart involved. I'll have to look into what parts I need for that.

Tonight, Eunos streamed these tests on Twitch for me, so I have some results to share from that. He was able to test an AV Famicom with a Famicom Everdrive, and two frontloaders with an NES Everdrive. Like with BMF54123's testing (also done on a Famicom Everdrive), the Famicom didn't exhibit any unusual behavior (though we didn't test for the 2 missing tiles on reset, which BMF did see). However, the frontloaders showed a tremendous amount of corruption of higher bits for ID and position. For tile colors, light blue is #$01, dark blue is #$81, and all other tiles are a mid-blue, which shows up in the first clip. Disregard the capture issue on the right of some of the clips, which is unrelated to the NES.

Frontloader 1 stress test (shows tile ID and position corruption)
Frontloader 2 stress test (shows tile ID bit 7 being both set and cleared)
Frontloader 1 delay test (corrupts around C1, ~scanline 258, cycle 339)
Frontloader 2 delay test (corrupts around 97, similar to my console)
Glitching in Zelda on Everdrive (left side of Link sprite when initiating scrolls, corresponding to 2 frames in normal overworld scrolling)

He also tried 3 different AC adapters on the frontloaders, but it made no clear difference, so that seems pretty unrelated. The full VOD is here for now, but the clips above are the juicy bits.


Last edited by Fiskbit on Tue Jan 15, 2019 2:03 am, edited 1 time in total.

 
Posted: Tue Jan 15, 2019 2:00 am

Joined: Sun Apr 13, 2008 11:12 am
Posts: 8538
Location: Seattle
For no good reason, the only flashcart I ever bothered to build was a mapper 218 one with only 8KiB of EEPROM. If your test can be crammed into that, I can see what happens.


 
Posted: Tue Jan 15, 2019 3:09 am

Joined: Sat Nov 18, 2017 9:15 pm
Posts: 74
Alright, I've attached 8 and 16 KB mapper 218 versions of the stress test from the v2 zip. I didn't have any luck getting an 8 KB ROM to work in my current version of Mesen using the NES 2.0 exponent-multiplier format, so I used the 16 KB version for testing and it seems to work fine. The 8 KB one does not contain a header, so it should be ready to throw on a chip. Note that this requires horizontal mirroring; unless this causes very severe corruption, sprite tiles #$01 and #$81 need to be different for the corruption to be visible.
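
For reference, the exponent-multiplier size encoding mentioned above can be sketched in Python. This assumes the NES 2.0 layout in which, when the PRG-ROM size MSB nibble (low nibble of header byte 9) is $F, header byte 4 is read as EEEEEEMM and the size is 2^E * (MM*2+1) bytes. Treat that layout as an assumption to check against the NES 2.0 documentation; the function names are made up for illustration, and this says nothing about what Mesen actually accepts.

```python
def decode_exp_mult(lsb_byte):
    """Size in bytes for an EEEEEEMM byte: 2**E * (MM*2 + 1)."""
    exponent = lsb_byte >> 2
    multiplier = (lsb_byte & 0x03) * 2 + 1
    return (1 << exponent) * multiplier

def encode_exp_mult(size):
    """EEEEEEMM byte for a size, or None if the size is not representable."""
    for mm in range(4):
        multiplier = mm * 2 + 1
        if size > 0 and size % multiplier == 0:
            quotient = size // multiplier
            exponent = quotient.bit_length() - 1
            # quotient must be a power of two that fits in a 6-bit exponent
            if quotient & (quotient - 1) == 0 and exponent < 64:
                return (exponent << 2) | mm
    return None

byte4 = encode_exp_mult(8 * 1024)
print(hex(byte4))              # 0x34
print(decode_exp_mult(byte4))  # 8192
```

Under that assumption, an 8 KiB PRG-ROM would be described by byte 4 = $34 together with a $F in the low nibble of byte 9.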

I didn't bother converting the other test for now, but can if this causes corruption. The other has to do much more work in the NMI, which reduces the maximum window for loads during vblank; it's really just for finding what values trigger corruption and how long is required in vblank.


Attachments:
oam_corruption_stress_test_mapper218.zip [2.63 KiB]
 
Posted: Tue Jan 15, 2019 6:28 pm

Joined: Sun Apr 13, 2008 11:12 am
Posts: 8538
Location: Seattle
You will probably be dismayed to find out that the 64 sprites stayed on-screen, light blue, and nothing ever changed.

(I initially incorrectly rewired the cart, in the PPUA10 variant instead of the needed PPUA11 variant, and then there was garbage in the upper right corner and the sprites all stayed dark blue, but that's not surprising)


Approximately 25% of the time when I (re/)boot it, two sequential sprites are missing.


 
Posted: Wed Jan 16, 2019 12:18 am

Joined: Sat Nov 18, 2017 9:15 pm
Posts: 74
Well, it's an interesting result, nonetheless, and I definitely appreciate the testing. Right now, it's not clear to me if this behavior is something specific to the console or the cart being tested (or the combination). Notably, this has only been reproduced on the NES-style Everdrive N8 (3 out of 3 tested), with no glitching on a dev board, a PowerPak, and 2 Famicom-style Everdrives, so I need to try to do testing with the NES Everdrive and some other method on the same console. I'll look into putting together a dev cart, and will probably have an opportunity to test on PowerPak and Famicom Everdrive in June. Does it seem plausible to you that the cart could cause this?

Definitely good to hear that you at least had 2 missing tiles on some resets and even on boot; I'd been unable to test the latter case. This issue seems to happen regardless of the hardware. I'll try to figure out what difference between these tests and my first OAM decay test causes this.


 
Posted: Wed Jan 16, 2019 11:03 am

Joined: Sun Apr 13, 2008 11:12 am
Posts: 8538
Location: Seattle
Fiskbit wrote:
"Does it seem plausible to you that the cart could cause this?"

Not really, but I've at least heard of the Everdrive causing other weirdness in the past.

And since you point out it doesn't happen on the FC Everdrive or the PowerPak, there's something extra bonus weird on the N8.


 
Posted: Sun Jun 16, 2019 11:22 pm

Joined: Sat Nov 18, 2017 9:15 pm
Posts: 74
I got the opportunity to test this on some additional hardware a couple of weeks ago. On my NTSC frontloader NES (NES-CPU-07, Rev G), both the NES and FC Everdrive N8s produced the same OAM corruption with oam_corruption_stress_test.nes, but the PowerPak did not produce any corruption. Notably, however, none of these same 3 flash carts produced any corruption on an A/V Famicom. I don't know if this means that model is immune to the problem or if that particular console is simply not susceptible to decay, since the severity of the decay seems to vary from one console to another. But I think it's safe to say this decay issue is caused by something the Everdrive itself is doing.

The other interesting result was that the NES Everdrive on the A/V Famicom had substantial graphical problems in the form of bad slivers and IRQ mistiming. The latter, for example, caused the status bar to jump around in SMB3. The FC Everdrive didn't have any of these problems, though. The obvious difference is that the NES Everdrive had to be connected through an adapter, but the PowerPak on that same adapter had no problems at all, and no amount of cleaning or reseating seemed to make a difference for the Everdrive.

