
All times are UTC - 7 hours





Posted: Sat Jan 26, 2019 9:12 am
Joined: Fri May 08, 2015 7:17 pm
Posts: 2437
Location: DIGDUG
Also, the rendering Y scroll is incremented on the second write to $2006... (if you write to the screen during rendering)

http://wiki.nesdev.com/w/index.php/PPU_ ... 8w_is_1.29

...shifting the screen up 1 pixel below the place where the second $2006 write occurred.

_________________
nesdoug.com -- blog/tutorial on programming for the NES


Posted: Sat Jan 26, 2019 11:32 am
Joined: Sun Apr 13, 2008 11:12 am
Posts: 8139
Location: Seattle
Sour wrote:
In the code, the 2nd write to $2006 immediately changes VRAMAddr, and then writes to $2007 change the palette if VRAMAddr is still >= $3F00.
I did actually mention this above, but I didn't call it "VRAMAddr". In Visual2C02, it looks like it's explicitly the physical address on the address bus that controls whether reads come from and writes go to the palette. And although the second write to $2006 does immediately update "loopy_v", that value isn't what appears on the address bus during rendering.


Posted: Sat Jan 26, 2019 8:30 pm
Joined: Tue Jun 24, 2008 8:38 pm
Posts: 2210
Location: Fukuoka, Japan
@sour

I didn't mean to imply that one emulator does it better than the other, only that the behavior was different and produced artifacts, as some people mentioned, when switching to a nametable with a different color. That meant "maybe" Nintendulator was closer to the NES hardware, but unless I test my own code on hardware myself, it was just speculation. Now, from looking at the code, there seems to be no special processing, so its behavior may not be correct either.

I guess this specific scenario only affects people who create new games and need to do special effects (and the time available in hblank is quite short), so getting it to match the hardware is not an immediate priority. Still, the timed code part got half of it right, so that was already a good start.

I'm curious what happens on real hardware when you forget to turn off the PPU (I guess nothing is written and garbage occurs?). Still, emulators today have improved so much compared to 10 years ago that the cases where we absolutely need to test on hardware to confirm the real behavior are becoming rarer, and I'm quite grateful for such a time saver ;)


Posted: Sun Jan 27, 2019 9:44 am
Joined: Sun Feb 07, 2016 6:16 pm
Posts: 617
lidnariq wrote:
I did actually mention this above, but I didn't call it "VRAMAddr". In Visual2C02, it looks like it's explicitly the physical address on the address bus that controls whether reads come from and writes go to the palette. And although the second write to $2006 does immediately update "loopy_v", that value isn't what appears on the address bus during rendering.
I was actually just referring to the variable name in Nintendulator's code here. A bit unrelated, but I'm curious how the bus actually works: during rendering (scanline -1 to 239), the PPU puts the address it needs to fetch the data on the bus, and any register writes will directly set the bus to that address (but the PPU's next fetch will be affected by the new value)? When scanline 240 begins (or when rendering is turned off), the bus returns to the current value of "VRAMAddr" and keeps that value until the prerender line? (unless changed by register writes)

So, because the bus' value is controlled by the PPU during rendering, and the PPU never fetches anything in the $3000+ range, it's impossible to write to the palette during this time. Whereas writes to $0000-$2FFF via $2007 are possible, but will (almost?) never write to the correct address, because writing to $2006 during rendering will update VRAMAddr, but will not have an immediate impact on the bus' current value (but it will have an impact on the bus' value the next time the PPU reads VRAMAddr to fetch a byte for rendering purposes)?

I'm sure I'm missing some details & oversimplifying, but is this more or less how it works? If so, I can probably make a few changes to my PPU code to prevent stuff like writes during HBlank from working properly without too much trouble.

Banshaku wrote:
I didn't mean to imply that one emulator may do it better than the other
No worries (and to be fair, Nintendulator does go more in-depth than Mesen when it comes to emulating the PPU's internal state), I'm just genuinely interested in making this more robust in Mesen if possible, especially since it would likely have no impact on performance.


Posted: Sun Jan 27, 2019 12:19 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 8139
Location: Seattle
Sour wrote:
during rendering (scanline -1 to 239), the PPU puts the address it needs to fetch the data on the bus
Yes.
Quote:
, and any register writes will directly set the bus to that address (but the PPU's next fetch will be affected by the new value)?
Typo? I assume you mean "will not directly set"
Quote:
When scanline 240 begins (or when rendering is turned off), the bus returns to the current value of "VRAMAddr" and keeps that value until the prerender line? (unless changed by register writes)
I believe that's accurate.

Quote:
So, because the bus' value is controlled by the PPU during rendering, and the PPU never fetches anything in the $3000+ range, it's impossible to write to the palette during this time. Whereas writes to $0000-$2FFF via $2007 are possible, but will (almost?) never write to the correct address, because writing to $2006 during rendering will update VRAMAddr, but will not have an immediate impact on the bus' current value (but it will have an impact on the bus' value the next time the PPU reads VRAMAddr to fetch a byte for rendering purposes)?
That also sounds accurate.
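
For anyone wiring this into an emulator, here is a minimal Python sketch of the model agreed on above. The function names and the `fetch_addr` parameter are hypothetical and do not come from Mesen or Nintendulator. It assumes the palette is selected purely by the bus address being in $3F00-$3FFF, and it ignores any analog effects.

```python
def is_rendering(scanline: int, rendering_enabled: bool) -> bool:
    """True while the PPU drives the bus with its own fetches (scanlines -1 to 239)."""
    return rendering_enabled and -1 <= scanline <= 239


def bus_address(v: int, fetch_addr: int, scanline: int, rendering_enabled: bool) -> int:
    """Address on the PPU bus: the current fetch while rendering, otherwise v."""
    if is_rendering(scanline, rendering_enabled):
        return fetch_addr & 0x3FFF
    return v & 0x3FFF


def ppudata_write_target(v: int, fetch_addr: int, scanline: int,
                         rendering_enabled: bool) -> str:
    """Classify where a $2007 write would land under this simplified model."""
    addr = bus_address(v, fetch_addr, scanline, rendering_enabled)
    return "palette" if addr >= 0x3F00 else "vram"
```

With v = $3F00 and a nametable fetch on the bus, a write on scanline 100 is classified as "vram", while the same write on scanline 240 (or with rendering disabled) is classified as "palette".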



Quickly tracking down the PPU A8 signal in Visual2C02...

* It's synchronized on left half dots (t13798)

There's a five-way analog multiplexer behind this, selecting between
* Vcc during attribute fetches = node 1162 (+hpos_eq_0-255_or_320-335_and_hpos_mod_8_eq_2_or_3_and_rendering)
* vramaddr_+v8 during node 2047 (must be used for both nametable fetches and the inactive portion)
* node10730 during node 1275 (node 10730 is just a floating node... node 1275 is part of the "CPU writes to $2007" handler)
* node 9110 during node 1963 = pattern table fetch
** Node 9110 ultimately comes from another multiplexer,
** ultimately spr_d4, after nodes 1870, 328 (++hpos_eq_256_to_319_and_rendering), and a right half dot
** ultimately _db4, after a right half dot, and nodes 1870 and 1910 and another (the same) right half dot

In contrast, PPU A13 (still synchronized on left half dots) only has a two-way analog multiplexer, selecting between
* Ground during node 1963 (pattern table fetch)
* node 2042 during node 2047 or attribute fetches (1162)
** node 2042 is ultimately OR(vramaddr_+v13 , rendering_1)

PPU A12 is the one that does all the dirty work, because it has to be able to operate in the most different ways.
bkg_pat_out, spr_pat_out, spr_d0 (8x16 mode), NOR(vramaddr_/v12, rendering_1)
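
As a rough Python sketch of the A13/A12 terms listed above, for the path that takes its bits from vramaddr_v (nametable fetches and the inactive portion). The function names are hypothetical, and it assumes the NOR term on A12 is the one selected together with node 2047, which the node list above does not spell out. The pattern table and attribute paths are not modelled.

```python
from typing import Tuple


def vramaddr_path_high_bits(v: int, rendering: bool) -> Tuple[int, int]:
    """Return (A13, A12) when the address comes from vramaddr_v."""
    v13 = (v >> 13) & 1
    v12 = (v >> 12) & 1
    a13 = v13 | int(rendering)                   # OR(vramaddr_+v13, rendering_1)
    a12 = int(not ((v12 ^ 1) or rendering))      # NOR(vramaddr_/v12, rendering_1)
    return a13, a12


def vramaddr_path_address(v: int, rendering: bool) -> int:
    """Bus address for this path, with the low 12 bits taken straight from v."""
    a13, a12 = vramaddr_path_high_bits(v, rendering)
    return (a13 << 13) | (a12 << 12) | (v & 0x0FFF)
```

Under these assumptions, v = $3F00 produces $2F00 on the bus while rendering and $3F00 otherwise, which is consistent with the palette being unreachable during rendering.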



oh. ... huh, the timing glitches we saw on [$2000] & 3 or writes to $2005 and $2006 should also happen to [$2000] & $38, but only for a single sliver.


Posted: Sun Jan 27, 2019 4:49 pm
Joined: Sun Feb 07, 2016 6:16 pm
Posts: 617
lidnariq wrote:
Typo? I assume you mean "will not directly set"
Whoops, yea, I meant "will not" there.

So, I've been playing around with Visual NES a bit. It looks like the bus' address reverts to the current value of VRAMAddr on scanline 240, cycle 1. And then on the prerender scanline (-1) at cycle 1, it goes back to being whatever address the PPU is using for rendering (this is probably just the result of the PPU fetching the first NT byte).

I used a ROM that enables BG rendering only and keeps running this loop:
Code:
-:
  STX $2007   ; write X to PPUDATA
  INX         ; next value (wraps after $FF)
  JMP -       ; loop forever
Here's what the memory looks like after running it for a good 15+ full frames:
Attachment: VramTesting.png
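
For a sense of how often that loop hits $2007, here is a small Python calculation. It assumes NTSC timing (3 PPU dots per CPU cycle, 341 dots per scanline) and the standard 6502 cycle counts for these addressing modes; the constant and function names are just for illustration.

```python
from math import gcd

# Standard 6502 cycle counts for the three instructions in the loop.
CPU_CYCLES = {"STX abs": 4, "INX": 2, "JMP abs": 3}
PPU_DOTS_PER_CPU_CYCLE = 3   # NTSC assumption
DOTS_PER_SCANLINE = 341


def loop_period_dots() -> int:
    """PPU dots between two consecutive $2007 writes."""
    return sum(CPU_CYCLES.values()) * PPU_DOTS_PER_CPU_CYCLE


def writes_per_scanline() -> float:
    """Average number of $2007 writes that fit in one scanline."""
    return DOTS_PER_SCANLINE / loop_period_dots()


def period_shares_factor_with_scanline() -> bool:
    """False means the writes drift to a different dot on each scanline."""
    return gcd(DOTS_PER_SCANLINE, loop_period_dots()) != 1
```

That works out to one write every 27 dots, or between 12 and 13 writes per scanline. Since 27 and 341 share no common factor, the writes land on different dots from one scanline to the next.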

lidnariq wrote:
oh. ... huh, the timing glitches we saw on [$2000] & 3 or writes to $2005 and $2006 should also happen to [$2000] & $38, but only for a single sliver.
Still haven't gotten around to implementing that one (or fully reading the thread, either), hoping to get to that relatively soon, though.


Posted: Sun Jan 27, 2019 5:02 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 8139
Location: Seattle
Sour wrote:
Still haven't gotten around to implementing that one (or fully reading the thread, either), hoping to get to that relatively soon, though.
Well, the shoot-through glitches only happen on certain alignments, so I'm not entirely certain when it's appropriate to emulate.


Posted: Sun Jan 27, 2019 6:53 pm
Joined: Sun Feb 07, 2016 6:16 pm
Posts: 617
Did some more testing and noticed that, in Visual NES, the code ended up writing proper values to CHR RAM starting at around $185 during vblank. I get something similar in Mesen (but it starts around $1A5 instead). I tried adding/tweaking the logic to suppress the V/H increments on write during rendering (e.g. don't do them if the PPU is also doing them for its regular rendering on this cycle, like Nintendulator does), and while that does change the exact addresses that get written to, it usually ends up writing around $2185 instead (presumably because it did a few too many or too few vertical increments).

Either way, I think this is probably good enough - the real idea here is just preventing devs from being able to write to the palette/chr/nametables during HBlank without disabling rendering. The exact pattern created by the small test I wrote most likely varies based on CPU/PPU alignment and the like, anyway.

lidnariq wrote:
Well, the shoot-through glitches only happen on certain alignments, so I'm not entirely certain when it's appropriate to emulate.
That was actually one of my questions. I guess in that case it would make more sense for this to be an option, e.g. "Emulate CPU-PPU alignment-specific glitches" or the like (and disabled by default, because nobody wants to see scroll glitches while playing Zelda!)


Posted: Sun Jan 27, 2019 9:54 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 8139
Location: Seattle
Sour wrote:
Did some more testing and noticed that, in Visual NES, the code ended up writing proper values to CHR RAM starting at around $185 during vblank. I get something similar in Mesen (but it starts around $1A5 instead). I tried adding/tweaking the logic to suppress the V/H increments on write during rendering (e.g. don't do them if the PPU is also doing them for its regular rendering on this cycle, like Nintendulator does), and while that does change the exact addresses that get written to, it usually ends up writing around $2185 instead (presumably because it did a few too many or too few vertical increments).
I suspect it will smear the write across multiple bytes, for analog reasons.

The $2007-triggered read cadence is aligned to pixels: ALE, idle, /RD, each for one aligned whole dot. However, it looks like $2007-triggered writes are not aligned. ALE is a left and right half dot; idle is just one left half dot; /WR is asserted for a right half dot but with the wrong value on the data bus, and then /WR is asserted for just a left half dot more with the correct value on the bus. (* note: this should be verified on hardware)

This should collide with the normal rendering cadence (left half dot: ALE. right half dot: idle. one aligned whole dot: /RD).

Internally to the PPU, if /RD is asserted, the multiplexed bus is high impedance, regardless of what the earlier multiplexers say should be presented.
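
To make the collision easier to see, here is a Python sketch that lays the two cadences over each other in half dots. The phase lists are transcribed from the description above and are unverified on hardware; `overlay` and the phase labels are hypothetical names, and the sketch assumes the write's ALE begins on a left half dot (an even index).

```python
# One rendering fetch takes two dots = four half dots (L, R, L, R), repeating.
RENDER_CADENCE = ["ALE", "idle", "/RD", "/RD"]

# A $2007-triggered write, one entry per half dot, starting on a left half dot.
WRITE_CADENCE = ["ALE", "ALE", "idle", "/WR (wrong data)", "/WR (correct data)"]


def overlay(start_half_dot: int):
    """Pair each half dot of the write with the rendering phase it lands on.

    start_half_dot is the position inside the rendering fetch where the
    write's ALE begins: 0 for the fetch's ALE dot, 2 for its /RD dot.
    """
    return [
        (phase, RENDER_CADENCE[(start_half_dot + i) % len(RENDER_CADENCE)])
        for i, phase in enumerate(WRITE_CADENCE)
    ]


def wr_overlaps_rd(start_half_dot: int) -> bool:
    """True if any /WR half dot coincides with a rendering /RD half dot."""
    return any(w.startswith("/WR") and r == "/RD" for w, r in overlay(start_half_dot))
```

Under these assumptions, both alignments put one of the two /WR half dots on top of a rendering /RD: the wrong-data half when the write starts on the fetch's ALE dot, and the correct-data half when it starts on the /RD dot.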


Posted: Mon Jan 28, 2019 2:26 am
Joined: Tue Jun 24, 2008 8:38 pm
Posts: 2210
Location: Fukuoka, Japan
I may be getting closer to finding the source of a bug that I've had for a long time, and it "may" be related to doing things in hblank. My question: is it possible that weird things could occur if I write to $2005 inside an MMC3 IRQ but not in hblank?

For example (it's not the complete code), if I comment out this part of code:

Code:
   bit PPU_STATUS
   lda #0                     ; location of scroll
   sta PPU_SCROLL


which is used to reset the X scroll position in the second NT (the NT was selected just before), now my palette refresh code inside NMI is working fine by waiting for NMI. If I put back only this part of the code, palette refresh doesn't work anymore.

It's an odd bug, but I'm getting closer to the source of the error. Maybe something else is triggering the bug, but now, at least, I know which part of the IRQ makes it fail. Before, I only knew that something related to the IRQ was causing it, but not what.

edit:

Found my bug, at last: changing the scroll affects the latch, which means that before writing to the palette, you need to reset the latch with bit PPU_STATUS. I didn't realize it was affecting the palette too, but it makes sense: the address gets changed and the write may end up in the wrong location. I feel ashamed of having made such a simple mistake for so long :lol:
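
For readers who hit the same thing, here is a minimal Python model of the shared $2005/$2006 write latch, using the usual v/t/x/w register description. The class and function names are hypothetical, and reading $2002 is reduced to its effect on the latch (the real register also returns status bits).

```python
class PpuRegs:
    """Only the scroll/address registers and the shared write latch w."""

    def __init__(self):
        self.v = 0   # current VRAM address
        self.t = 0   # temporary VRAM address
        self.x = 0   # fine X scroll
        self.w = 0   # write latch shared by $2005 and $2006

    def read_2002(self):
        self.w = 0   # reading PPU_STATUS resets the latch

    def write_2005(self, d):
        if self.w == 0:   # first write: X scroll
            self.t = (self.t & 0x7FE0) | (d >> 3)
            self.x = d & 0x07
        else:             # second write: Y scroll
            self.t = (self.t & 0x0C1F) | ((d & 0x07) << 12) | ((d & 0xF8) << 2)
        self.w ^= 1

    def write_2006(self, d):
        if self.w == 0:   # first write: high byte
            self.t = (self.t & 0x00FF) | ((d & 0x3F) << 8)
        else:             # second write: low byte, then v = t
            self.t = (self.t & 0x7F00) | d
            self.v = self.t
        self.w ^= 1


def palette_address_after_irq(reset_latch: bool) -> int:
    """Replay the IRQ snippet, then the NMI's attempt to point v at $3F00."""
    ppu = PpuRegs()
    ppu.read_2002()       # IRQ: bit PPU_STATUS
    ppu.write_2005(0)     # IRQ: a single PPU_SCROLL write leaves w = 1
    if reset_latch:
        ppu.read_2002()   # the fix: bit PPU_STATUS before setting the address
    ppu.write_2006(0x3F)
    ppu.write_2006(0x00)
    return ppu.v
```

In this model, `palette_address_after_irq(False)` leaves v at $003F, because the two $2006 writes are taken in the wrong order, while `palette_address_after_irq(True)` gives $3F00.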


Posted: Mon Jan 28, 2019 12:18 pm
Joined: Sat Nov 18, 2017 9:15 pm
Posts: 68
Sour wrote:
I guess in that case it would make more sense for this to be an option, e.g. "Emulate CPU-PPU alignment-specific glitches" or the like (and disabled by default, because nobody wants to see scroll glitches while playing Zelda!)

To my knowledge, the scroll glitching in Zelda (tested with split_scroll_test_v2) occurs regardless of alignment, though it looks like one of the alignments caused my writes to appear to take effect one dot early. The single-scanline issues seen in the ppu_xxxx_glitch tests are the ones that are only present on some alignments.


Posted: Mon Jan 28, 2019 1:14 pm
Joined: Sun Apr 13, 2008 11:12 am
Posts: 8139
Location: Seattle
Fiskbit wrote:
To my knowledge, the scroll glitching in Zelda (tested with split_scroll_test_v2) occurs regardless of alignment
Right. The shoot-through glitches are the ones that only happen on some alignments if the CPU write starts during the right half of dot 257.

The Zelda glitches are bus conflicts due to colliding increment and reload operations, and occur
1- to all 15 bits of both vramaddr_v and _t if the CPU's second write to $2006 finishes during dot 256.
2- to all six coarse horizontal position bits in both if the CPU's second write finishes during dot X where X%8 = 0 and (X≤256 or X=328 or X=336)

