It is currently Tue Oct 17, 2017 6:29 am

All times are UTC - 7 hours





Post new topic Reply to topic  [ 7 posts ] 
Author Message
 Post subject: PPU scanline docs wrong?
PostPosted: Wed Sep 07, 2005 8:32 pm 
Offline
User avatar

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
I've done some tests on my NTSC NES PPU's scanline times and found information that differs from what I've read. From my test I conclude that the first visible scanline begins around 7485 PPU clocks from NMI, and the right edge of the last visible scanline ends around 89250 PPU clocks from NMI. That is, there are 22 undisplayed scanlines at the start of VBL, rather than 21 and then a wasted scanline at the end of the frame. I also found that three PPU frames always total exactly 89342 CPU clocks, meaning that the average scanline duration is exactly 341 (rather than a hair less as I've read).

In my test the PPU is disabled and I have the VRAM address in the palette area to allow immediate feedback of $2007 accessess in the form of color change on screen. I figure the palette mapping is done at the final moment, so it has the least latency.

I wait for my NMI handler to be invoked so I get the most accurate beginning of a PPU frame. Then I wait 2486 clocks before accessing $2007 the first time, then 27255 more clocks and access $2007 again. This results in two color changes on screen, one just after the beginning of the first scanline, and the other just before the end of the last scanline. I then delay until 29781 clocks have passed (29780 every third frame) since the NMI and loop back (NMI is disabled at this point). The edges of the color changes wobble a bit, but they don't crawl over time. I also verified that my video capture is capturing all 240 scanlines.

This so strongly contradicts information I've read that I have to wonder what's going on.

Code:
nmi:                    ; 1-3 clocks latency
                        ; 7 clocks to vector nmi
      lda   #0          ; 2 disable nmi
      sta   $2000       ; 4
     
loop:
      ldy   #10         ; 2477 delay
      lda   #48         
      jsr   delay_ya0
     
      lda   $2007       ; 4 write occurs at VBL + 2495 (scanline 21.95)
     
      ldy   #34         ; 27251 delay
      lda   #159       
      jsr   delay_ya0
     
      lda   $2007       ; 4 write occurs at VBL + 29750 (scanline 261.73)
     
setup:
      lda   #$3f        ; 2
      sta   $2006       ; 4
      lda   #$01        ; 2
      sta   $2006       ; 4
     
      lda   #3          ; 2
      ldx   <delay      ; 3
      dex               ; 2
      beq   +           ; 2/3 (every third frame branch is taken)
      txa               ; 2
:     sta   <delay      ; 3
     
      pha               ; 16 delay
      pla
      pha
      pla
      nop
     
      jmp   loop        ; 3


Top
 Profile  
 
 Post subject:
PostPosted: Wed Sep 07, 2005 9:47 pm 
Offline
User avatar

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
Addendum: The last (240th) scanline is always blank (sprites and background don't show up on it), thus you can only display 239 scanlines of normal graphics. When the PPU is enabled, it's the bg color. But there is a 240th scanline whose color can be manipulated to some extent.

I just noticed some junk scanlines on the bottom of the screen of my emulator, which prompted me to check this out. A few other tests I've run (since it's often easier to ask the NES than read a doc) regarding vertical scroll position and sprite positions show that the vertical scroll position you set via $2005 is effectively incremented by one, and that the sprite vertical position matches that on screen. In other words, if you write 0 to $2005 twice, the first scanline will show the second line of the background. If you write 0 to a sprite's Y position, its first line will be displayed on the first visible scanline.


Top
 Profile  
 
 Post subject:
PostPosted: Wed Sep 07, 2005 9:59 pm 
Offline
User avatar

Joined: Sun Sep 19, 2004 10:59 pm
Posts: 1389
What revision PPU were you testing?

Kevin Horton performed tests with an RP2C02G, and the results agree 100% with the existing timing information - that is, NMI for the duration of 20 idle scanlines, 1 blank-but-fetching scanline, 240 fetched-and-rendered scanlines, then 1 idle scanline before the next NMI. These tests were not done in a complete NES, but by manually manipulating the chip's input lines and observing its bus activity.

I also have a test ROM which proves that every other frame in NTSC is 1/3 of a CPU cycle shorter by displaying a white box using timed manipulation of the grayscale bit - it tested flawlessly on Kevin Horton's CopyNES.

http://www.qmtpro.com/~nes/cputime.nes
http://www.qmtpro.com/~nes/cputime.asm

_________________
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.


Last edited by Quietust on Fri May 15, 2009 9:55 am, edited 1 time in total.

Top
 Profile  
 
 Post subject:
PostPosted: Thu Sep 08, 2005 1:09 am 
Offline
User avatar

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
My NES has a 2C02G-0 video chip.

I guess my video capture doesn't capture the full NTSC frame. I hooked the video output of my NES to the audio input of my PC and was able to look at the video waveform in a sound editor. I could see that there was a BG scanline displayed on the 22nd scanline, as you stated. This solves the issue of what happens on each scanline (it happens as documented).

There's still the issue of frame length, and I realized that you were testing with the BG enabled. I used the grayscale bit as you mentioned, allowing me to test timing with the same loop regardless of whether BG was enabled. I found that the frame timing varies based on what's enabled:

With BG and/OR sprites enabled, 59561 CPU clocks kept it stable (appearing every other frame). Every scanline being exactly 341 PPU clocks except one every other frame would fit with this number.

With BG and sprites disabled, 89342 CPU clocks kept it stable (albeit appearing only one frame out of three). Every scanline being exactly 341 PPU clocks would fit with this number. This is what I was timing originally.

So apparently every other frame is one PPU clock shorter only when BG and/or sprites are enabled.


Top
 Profile  
 
 Post subject:
PostPosted: Thu Sep 08, 2005 3:02 am 
Offline
User avatar

Joined: Thu Mar 24, 2005 3:17 pm
Posts: 355
If your last point is the case, does the even/odd frame rule (the frame at boot being the even frame, then odd,even,odd,even,etc) still apply when the ppu has been disabled for a few frames ? Or will the odd frame after reenabling the ppu be 1 cycle shorter, no matter how many frames the ppu has been disabled, as if having an imaginary frame counter reset on ppu disabling ?


Top
 Profile  
 
 Post subject:
PostPosted: Thu Sep 08, 2005 10:41 pm 
Offline
User avatar

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
That would be quite difficult to test. I have an idea for a test that turns the PPU off every other frame for hundreds of frames, then finds out how many clocks since power-up the next NMI occurs at. If I run this test on even frames, then odd frames, and get a significantly different result, then that confirms that the even/odd counter is maintained even when the PPU is off, otherwise it shows it's not maintained... or that the test has a bug or is simply flawed.

Of course if a difference is not found, then we'll want to know exactly when the even/odd frame bit is toggled.


Top
 Profile  
 
 Post subject:
PostPosted: Sun Sep 25, 2005 4:50 pm 
Offline
User avatar

Joined: Mon Sep 27, 2004 8:33 am
Posts: 3715
Location: Central Texas, USA
I've finally been able to synchronize to exact PPU clocks in my test code. My preliminary result is that the even/odd flag is toggled every frame and unaffected by the PPU being turned on and off. At some point I'll have some test ROMs for this.

The way to synchronize with the PPU turned out quite simple. Turn the PPU off to get every frame at exactly 341 PPU clocks per scanline, resulting in frames 29780.66666 CPU clocks long. Roughly synchronize by polling $2002, delay slightly less than a frame, then keep checking $2002 every 29781 clocks. The VBL flag will be set at CPU clock intervals of 29781, 29781, 29780, 29781, 29781, 29780, etc. The loop starts out checking for the VBL flag slightly early, but runs at 29781 clocks per iteration, so the PPU slowly inch up and eventually start setting the VBL flag before the loop reads it, terminating the loop. At this point, the loop will have read the VBL flag the soonest after the PPU set it, and the code will be synchronized with the PPU. This opens the door to very precise tests.

Code:
      lda   #0          ; disable bg
      sta   $2001

      bit   $2002       ; clear vbl flag
:     bit   $2002       ; wait for vbl flag to be set
      bpl   -
                        ; delay slightly less than a frame
      ldy   #141        ; 29770 delay
      lda   #41         
      jsr   delay_ya2
     
                        ;
:     ldy   #86         ; 29774 delay
      lda   #68         
      jsr   delay_ya1

      bit   $2002
      bpl   -
                        ; code is now synchronized to PPU


I'm glad to have figured this out since it opens the door to very precise PPU test ROMs.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 7 posts ] 

All times are UTC - 7 hours


Who is online

Users browsing this forum: No registered users and 5 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group