What is WRONG with my PPU???

Posted: Sun Sep 16, 2007 10:49 am
by NerveGas
I'm the guy working on NES.app, a Nintendo emulator for iPhone. I've recently adopted the old InfoNES core, which I've spent considerable time fixing up and have forked into a new project called NESCore, which I hope will eventually become a viable multi-platform NES core.

Anyway, I've gotten a lot working so far, but am struggling to figure out the PPU. I must say I'm confused, even after reading all the docs I could find on the subject. Rad Racer is still mangled, and mid-frame scrolling appears to be completely hosed. I've posted the code here:

http://svn.natetrue.com/nesapp/

If anyone would like to have a look at it, I would be glad to share some of the cash that my users have donated toward working on the project. I'll gladly send $250 to anyone who can help me get the PPU working the way it's supposed to. I've included relevant portions of my code below, but feel free to look in NESCore/M6502_rw.h for read/write functions and NESCore/NESCore.c for rendering (NESCore_DrawScanline).

Any help you guys can offer would be appreciated. I've been taking a crash course in emulation to try and make this useful for my users, and I would love to be able to fix this.

NG

Read Functions:
case 0x2000: /* PPU */
    switch( wAddr )
    {
        case (0x2007): /* VRAM Read */
            if (S.vAddr <0x3F00) {
                wScratch = S.vAddr;
                wScratch &= 0x3FFF;
                bScratch = S.PPU_R7;
                S.PPU_R7 = W.PPUBANK[ wScratch >> 10 ][ wScratch & 0x3FF ];
            } else {
                bScratch = W.PPUBANK[ wScratch >> 10 ] [ wScratch & 0x3FF ];
            }

            S.vAddr += (S.PPU_R0 & R0_INC_ADDR) ? 0x20 : 0x01;
            S.vAddr &= 0x3FFF;

            /* Mid-HBlank Update. If an address is written into $2006
               followed by reads of $2007 during refresh, then the
               address is loaded into the PPU and used as the start address
               of the next scanline. This is used to scroll the background
               vertically on a portion of the screen. This code converts
               the scanline address into X/Y offsets. */

            if (!(S.PPU_R2 & 0x80) && (S.PPU_R1 & 0x08)!=0) {
                S.PPU_SCROLL_X = (S.PPU_SCROLL_X & 0xFF)
                               | ((S.vAddr & 0x400) >> 2);
                S.PPU_SCROLL_Y = ( 480 + 480
                               + (((S.vAddr & 0x800) >> 11) * 240)
                               + ((S.vAddr & 0x3e0) >> 2)
                               - (S.PPU_Scanline + 1) ) % 480;

                if (S.PPU_SCROLL_Y > 240)
                    S.PPU_SCROLL_Y -= 240;
                NESCore_Develop_Scroll_Values();
            }

            return bScratch;
            break;

        case (0x2004): /* SPR-RAM Read */
            return S.SPRRAM[ S.PPU_R3 ];
            break;

        case (0x2002):
            S.PPU_Latch_Flag = 0;

            if ((S.PPU_R0 & 0x80) && (S.PPU_R2 & 0x80)) {
                S.PPU_R0 &= 0xFE;
                S.PPU_SCROLL_X = S.PPU_SCROLL_X & 255;
            }

            return S.PPU_R2;
            break;

        default: /* $2000, $2001, $2003, $2005, $2006 */
            return S.PPU_R7;
            break;
    }
    break;


Write Functions:
case 0x2000: /* PPU */
    switch ( wAddr )
    {
        case 0x2000: /* PPU Control Register 1 */
            S.PPU_R0 = byData;
            S.PPU_SP_Height = (S.PPU_R0 & R0_SP_SIZE) ? 0x10 : 0x08;
            W.PPU_BG = (S.PPU_R0 & R0_BG_ADDR) ? S.ChrBuf + 0x4000 : S.ChrBuf;
            W.PPU_SP = (S.PPU_R0 & R0_SP_ADDR) ? S.ChrBuf + 0x4000 : S.ChrBuf;
            break;

        case 0x2001: /* PPU Control Register 2 */
            S.PPU_R1 = byData;
            break;

        case 0x2002: /* PPU Status - NOT WRITABLE */
            break;

        case 0x2003: /* Sprite RAM ADDR */
            S.PPU_R3 = byData;
            break;

        case 0x2004: /* Sprite RAM DATA */
            S.SPRRAM[ S.PPU_R3++ ] = byData;
            break;

        case 0x2005: /* Scroll Register */
            if (S.PPU_Latch_Flag ^= 1) {
                S.PPU_R5A = byData;
                S.PPU_SCROLL_X = S.PPU_R5A;
            } else {
                S.PPU_R5B = byData;
                S.PPU_SCROLL_Y = S.PPU_R5B;
                if (S.PPU_SCROLL_Y > 240)
                    S.PPU_SCROLL_Y -= 240;
            }
            break;

        case 0x2006: /* VRAM Address Register */
            if (S.PPU_Latch_Flag ^= 1) {
                S.PPU_R6A = byData;

                if ( S.PPU_Scanline < 240 && (S.PPU_R1 & R1_SHOW_SCR) )
                {
                    BITCOPY(S.PPU_R6RA, 0x01, S.PPU_R5B, 0x40)
                    BITCOPY(S.PPU_R6RA, 0x02, S.PPU_R5B, 0x80)
                    BITCOPY(S.PPU_R6RA, 0x08, S.PPU_R0, 0x01)
                    BITCOPY(S.PPU_R6RA, 0x10, S.PPU_R5B, 0x01)
                    BITCOPY(S.PPU_R6RA, 0x20, S.PPU_R5B, 0x02)
                    BITCOPY(S.PPU_R6RA, 0x40, S.PPU_R5B, 0x04)
                    BITCOPY(S.PPU_R6RB, 0x20, S.PPU_R5B, 0x08)
                    BITCOPY(S.PPU_R6RB, 0x40, S.PPU_R5B, 0x10)
                    BITCOPY(S.PPU_R6RB, 0x80, S.PPU_R5B, 0x20)
                    S.PPU_SCROLL_Y = S.PPU_R5B;
                    if (S.PPU_SCROLL_Y > 240)
                        S.PPU_SCROLL_Y -= 240;
                }

                S.vAddr_Latch = (S.vAddr_Latch & 0xFF)
                              | ((word) (byData & 0xFF) << 8);
            }
            else {
                S.PPU_R6B = byData;

                if ( S.PPU_Scanline < 240 && (S.PPU_R1 & R1_SHOW_SCR) )
                {
                    BITCOPY(S.PPU_R6RA, 0x01, S.PPU_R5B, 0x40)
                    BITCOPY(S.PPU_R6RA, 0x02, S.PPU_R5B, 0x80)
                    BITCOPY(S.PPU_R6RA, 0x08, S.PPU_R0, 0x01)
                    BITCOPY(S.PPU_R6RA, 0x10, S.PPU_R5B, 0x01)
                    BITCOPY(S.PPU_R6RA, 0x20, S.PPU_R5B, 0x02)
                    BITCOPY(S.PPU_R6RA, 0x40, S.PPU_R5B, 0x04)
                    BITCOPY(S.PPU_R6RB, 0x20, S.PPU_R5B, 0x08)
                    BITCOPY(S.PPU_R6RB, 0x40, S.PPU_R5B, 0x10)
                    BITCOPY(S.PPU_R6RB, 0x80, S.PPU_R5B, 0x20)
                    S.PPU_SCROLL_X = S.PPU_R5A;
                }

                S.vAddr_Latch = (S.vAddr_Latch & 0xFF00)
                              | ((word) byData & 0xFF);
            }
            S.vAddr = S.vAddr_Latch & 0x3FFF;
            break;


        case 0x2007: /* VRAM Data */
        {
            wScratch = S.vAddr;
            wScratch &= 0x3FFF;

            S.vAddr += (S.PPU_R0 & R0_INC_ADDR) ? 0x20 : 0x01;
            if (S.vAddr > 0x3FFF)
                S.vAddr &= 0x3FFF;

            if (wScratch < 0x2000 && S.VRAMWriteEnable)
            {
                /* Pattern Data */
                S.ChrBufUpdate |= ( 1 << ( wScratch >> 10 ) );
                W.PPUBANK[ wScratch >> 10 ][ wScratch & 0x3FF ] = byData;
            }
            else if (wScratch < 0x3F00 ) /* 0x2000 - 0x3EFF */
            {
                /* Name Table and Mirror */
                W.PPUBANK[ (wScratch) >> 10 ][ wScratch & 0x3ff ] = byData;
                W.PPUBANK[ (wScratch ^ 0x1000) >> 10][ wScratch & 0x3FF ] = byData;
            }
            else if (!(wScratch & 0xF)) /* 0x3F00 or 0x3F10 */
            {
                /* Palette Mirror */
                S.PPURAM[ 0x3f10 ] = S.PPURAM[ 0x3f14 ] = S.PPURAM[ 0x3f18 ]
                    = S.PPURAM[ 0x3f1c ] = S.PPURAM[ 0x3f00 ] = S.PPURAM[ 0x3f04 ]
                    = S.PPURAM[ 0x3f08 ] = S.PPURAM[ 0x3f0c ] = byData;

                S.PalTable[ 0x00 ] = S.PalTable[ 0x04 ] = S.PalTable[ 0x08 ]
                    = S.PalTable[ 0x0c ] = S.PalTable[ 0x10 ] = S.PalTable[ 0x14 ]
                    = S.PalTable[ 0x18 ] = S.PalTable[ 0x1c ]
                    = NesPalette[ byData ] | 0x8000;
            }
            else if (wScratch & 0x03)
            {
                /* Palette */
                S.PPURAM[ wScratch ] = byData;
                S.PalTable[ wScratch & 0x1f ] = NesPalette[ byData ];
            }
        }
        break;
    }
    break;

Relevant portions of DrawScanline:

Exec6502(&S.m6502_state, STEP_PER_HBLANK);

/* Reload horizontal scroll bits at beginning of each scanline */

if ( S.PPU_Scanline < 240 )
{
    BITCOPY_BACK(S.PPU_R6RA, 0x04, S.PPU_R0, 0x01)
    BITCOPY_BACK(S.PPU_R6RB, 0x01, S.PPU_R5A, 0x08)
    BITCOPY_BACK(S.PPU_R6RB, 0x02, S.PPU_R5A, 0x10)
    BITCOPY_BACK(S.PPU_R6RB, 0x04, S.PPU_R5A, 0x20)
    BITCOPY_BACK(S.PPU_R6RB, 0x08, S.PPU_R5A, 0x40)
    BITCOPY_BACK(S.PPU_R6RB, 0x10, S.PPU_R5A, 0x80)
}

/* Reload vertical scroll bits at end of VBLANK */

if ( S.PPU_Scanline == 0 )
{
    BITCOPY_BACK(S.PPU_R6RA, 0x01, S.PPU_R5B, 0x40)
    BITCOPY_BACK(S.PPU_R6RA, 0x02, S.PPU_R5B, 0x80)
    BITCOPY_BACK(S.PPU_R6RA, 0x08, S.PPU_R0, 0x01)
    BITCOPY_BACK(S.PPU_R6RA, 0x10, S.PPU_R5B, 0x01)
    BITCOPY_BACK(S.PPU_R6RA, 0x20, S.PPU_R5B, 0x02)
    BITCOPY_BACK(S.PPU_R6RA, 0x40, S.PPU_R5B, 0x04)
    BITCOPY_BACK(S.PPU_R6RB, 0x20, S.PPU_R5B, 0x08)
    BITCOPY_BACK(S.PPU_R6RB, 0x40, S.PPU_R5B, 0x10)
    BITCOPY_BACK(S.PPU_R6RB, 0x80, S.PPU_R5B, 0x20)
}

NESCore_Develop_Scroll_Values();

/* Render Background */
/* ================= */

/* MMC5 VROM Switch */
MapperRenderScreen(1);

pPoint = &S.WorkFrame[ S.PPU_Scanline * NES_DISP_WIDTH ];

if (!( S.PPU_R1 & R1_SHOW_SCR ))
{
    /* Clear scanline if display is off */
    memset( pPoint, 0, NES_DISP_WIDTH << 1 ); /* Assumes 16-Bit buffer! */
}
else
{
    nY = S.PPU_SCROLL_Y_BYTE + (S.PPU_Scanline >> 3);
    nYBit = S.PPU_SCROLL_Y_BIT + (S.PPU_Scanline & 7);

    if ( nYBit > 7 )
    {
        nYBit &= 7;
        nY++;
    }
    nYBit <<= 3;

    nNameTable = NAME_TABLE0 + (S.PPU_R0 & 0x03);

    /* Determine which Vertical Name Table we're in */
    if (nY > 29)
    {
        nY -= 30;
        nNameTable ^= NAME_TABLE_V_MASK;
    }

    nX = S.PPU_SCROLL_X_BYTE;
    nY4 = ( ( nY & 0x02 ) << 1 );

Posted: Sun Sep 16, 2007 4:34 pm
by Josh
You shouldn't be keeping "Scroll X" and "Scroll Y" registers. All rendering should be done using the VRAM address, and the VRAM address should be updated from the latch as described in Loopy's document at the appropriate times. The only pure "scroll" value in the actual NES is the 3-bit fine X scroll offset.
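
A minimal sketch of the register model this advice points at, assuming the layout from Loopy's scrolling notes: one current address, one latch, a 3-bit fine X, and a shared write toggle. The struct and function names (`PpuRegs`, `ppu_write_2005`, and so on) are hypothetical and do not exist in NESCore.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register set modeled on Loopy's scrolling notes.
   None of these names come from NESCore. */
typedef struct {
    uint16_t v;      /* current VRAM address, 15 bits */
    uint16_t t;      /* temporary address (the "latch"), 15 bits */
    uint8_t  fine_x; /* 3-bit fine X scroll, the only pure scroll value */
    uint8_t  w;      /* write toggle shared by $2005 and $2006 */
} PpuRegs;

/* $2000 write: nametable select bits go to t bits 10-11. */
void ppu_write_2000(PpuRegs *p, uint8_t data) {
    p->t = (uint16_t)((p->t & 0x73FF) | ((data & 0x03) << 10));
}

/* $2002 read: resets the shared write toggle. */
void ppu_read_2002(PpuRegs *p) {
    p->w = 0;
}

/* $2005 write: first write is X, second write is Y. */
void ppu_write_2005(PpuRegs *p, uint8_t data) {
    if (!p->w) {
        /* coarse X to t bits 0-4; fine X is kept outside the address */
        p->t = (uint16_t)((p->t & 0x7FE0) | (data >> 3));
        p->fine_x = data & 0x07;
    } else {
        /* fine Y to t bits 12-14, coarse Y to t bits 5-9 */
        p->t = (uint16_t)((p->t & 0x0C1F)
                          | ((data & 0x07) << 12)
                          | ((data & 0xF8) << 2));
    }
    p->w ^= 1;
    assert((p->t & 0x8000) == 0); /* t stays a 15-bit value */
}

/* $2006 write: high byte first, then low byte; v is loaded on the second. */
void ppu_write_2006(PpuRegs *p, uint8_t data) {
    if (!p->w) {
        /* data bits 0-5 to t bits 8-13; t bit 14 is cleared */
        p->t = (uint16_t)((p->t & 0x00FF) | ((data & 0x3F) << 8));
    } else {
        p->t = (uint16_t)((p->t & 0x7F00) | data);
        p->v = p->t; /* only now does the address used for rendering change */
    }
    p->w ^= 1;
}
```

With this shape, rendering reads only `v` and `fine_x`; separate Scroll X / Scroll Y variables are not needed.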

Posted: Sun Sep 16, 2007 4:35 pm
by Josh
Do you have a version of this software that compiles on Win32 (2k/XP) with free software? If so, I can probably fix it for you. That's the only platform I have access to, however.

Posted: Sun Sep 16, 2007 5:02 pm
by NerveGas
Hey Josh

You could change up the InfoNES win32 code pretty easily to work with it - that's all I can think of. If you wanted to have a whack at fixing it blindly, I'll gladly test it and send appropriate notes. I'll look into Loopy's notes again, but I didn't get a whole lot out of them. I'm pretty new to NES emulation; I'm more interested in just making the iPhone parts of it work well (e.g. the multitouch, directional sensor, etc.). But please, feel free to send some patches and I'll give 'em a try. The complete code for the core can be downloaded from here:

svn co http://svn.natetrue.com/nesapp/src/NESCore

Posted: Sun Sep 16, 2007 5:09 pm
by NerveGas
I've seen several different formulas for computing X/Y from the vAddr - have you got a definitive one?

This is the one I'm using now (after putting Loopy's stuff in):
SX = ((S.vAddr & 0x001F) << 3 )
   | ((S.vAddr & 0x0400) >> 2 )
   | (S.PPU_R5A & 0x7);
SY = ((S.vAddr & 0x03E0) >> 2 )
   | ((S.vAddr & 0x0800) >> 3 )
   | ((S.vAddr & 0x7000) >> 12 );

Not quite there yet, however.

Posted: Sun Sep 16, 2007 5:24 pm
by Disch
(addr >> 12) & 7 <--- fine Y scroll
(addr >> 5) & 0x1F <--- coarse Y
addr & 0x1F <--- coarse X
(addr >> 10) & 1 <--- NT X
(addr >> 11) & 1 <--- NT Y

or imagine the PPU address as a 15-bit value:

[.yyy VHYY YYYX XXXX]
y = fine Y
V = NT Y
H = NT X
Y = coarse Y
X = coarse X


Fine X scroll is stored separately and is never changed or updated except on the first $2005 write.

EDIT - really though, you shouldn't need this except for converting $2005 writes.

If you keep the PPU address updated properly (increment by 1 every 8 cycles, etc), then all you have to do to fetch the proper tile is:

Code:

tile = NameTableRead( 0x2000 | ( ppu_addr & 0x0FFF) );
chraddr = pattern_page | (tile << 4) | ((ppu_addr >> 12) & 7);

lo_chrbitplane = CHRRead( chraddr );
hi_chrbitplane = CHRRead( chraddr | 8 );
where 'pattern_page' is either 0x0000 or 0x1000 depending on which pattern table you're reading from.
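
The bit layout above can be sketched as a few small C helpers. This is only an illustration: `ScrollFields`, `decode_vaddr`, `scroll_x_pixels`, `scroll_y_pixels` and the two `*_fetch_addr` functions are made-up names, and the pixel conversion assumes a 512x480 plane of four nametables with coarse Y in its normal 0-29 range.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the 15-bit PPU address [.yyy VHYY YYYX XXXX]. */
typedef struct {
    unsigned fine_y;   /* y: pixel row inside the tile, 0-7 */
    unsigned coarse_y; /* Y: tile row */
    unsigned coarse_x; /* X: tile column, 0-31 */
    unsigned nt_x;     /* H: horizontal nametable select */
    unsigned nt_y;     /* V: vertical nametable select */
} ScrollFields;

ScrollFields decode_vaddr(uint16_t addr) {
    ScrollFields f;
    f.fine_y   = (addr >> 12) & 7;
    f.coarse_y = (addr >> 5) & 0x1F;
    f.coarse_x = addr & 0x1F;
    f.nt_x     = (addr >> 10) & 1;
    f.nt_y     = (addr >> 11) & 1;
    return f;
}

/* Pixel X in the four-nametable plane; fine X is stored outside the address. */
unsigned scroll_x_pixels(uint16_t addr, unsigned fine_x) {
    ScrollFields f = decode_vaddr(addr);
    assert(fine_x < 8);
    return f.nt_x * 256 + f.coarse_x * 8 + fine_x;
}

/* Pixel Y in the four-nametable plane (each nametable is 240 lines tall). */
unsigned scroll_y_pixels(uint16_t addr) {
    ScrollFields f = decode_vaddr(addr);
    return f.nt_y * 240 + f.coarse_y * 8 + f.fine_y;
}

/* Address of the nametable byte for the current tile. */
uint16_t nametable_fetch_addr(uint16_t addr) {
    return (uint16_t)(0x2000 | (addr & 0x0FFF));
}

/* Address of the low bitplane row; the high bitplane is at this address | 8.
   pattern_page is 0x0000 or 0x1000. */
uint16_t pattern_fetch_addr(uint16_t pattern_page, uint8_t tile, uint16_t addr) {
    return (uint16_t)(pattern_page | (tile << 4) | ((addr >> 12) & 7));
}
```

For example, address 0x6BB5 decodes to fine Y 6, coarse Y 29, coarse X 21 in the lower-left nametable.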

Posted: Sun Sep 16, 2007 5:35 pm
by NerveGas
I think that's part of his problem - he seems to cycle all scanline cycles before drawing the scanline. I don't see an increment of vAddr here either, although I could add it. He's doing a lot of this in the code it looks like. Here's what he had with loopy's stuff:
READ:

case (0x2007): /* VRAM Read */
    if (S.vAddr <0x3F00) {
        wScratch = S.vAddr;
        wScratch &= 0x3FFF;
        bScratch = S.PPU_R7;
        S.PPU_R7 = W.PPUBANK[ wScratch >> 10 ][ wScratch & 0x3FF ];
    } else {
        bScratch = W.PPUBANK[ wScratch >> 10 ] [ wScratch & 0x3FF ];
    }

    S.vAddr += (S.PPU_R0 & R0_INC_ADDR) ? 0x20 : 0x01;
    S.vAddr &= 0x3FFF;

    return bScratch;
    break;

case (0x2004): /* SPR-RAM Read */
    return S.SPRRAM[ S.PPU_R3 ];
    break;

case (0x2002):
    S.PPU_Latch_Flag = 0;

    if (S.PPU_Scanline >= SCAN_VBLANK_START
        && (!(S.PPU_R0 & R0_NMI_VB)))
    {
        S.PPU_R0 &= ~R0_NAME_ADDR;
        S.PPU_NameTable = NAME_TABLE0;
    }

    return S.PPU_R2;
    break;

default: /* $2000, $2001, $2003, $2005, $2006 */
    return S.PPU_R7;
    break;
}


WRITE:

case 0x2000: /* PPU Control Register 1 */
    S.PPU_R0 = byData;
    S.PPU_SP_Height = (S.PPU_R0 & R0_SP_SIZE) ? 0x10 : 0x08;
    S.PPU_NameTable = NAME_TABLE0 + (S.PPU_R0 & R0_NAME_ADDR);
    W.PPU_BG = (S.PPU_R0 & R0_BG_ADDR) ? S.ChrBuf + 0x4000 : S.ChrBuf;
    W.PPU_SP = (S.PPU_R0 & R0_SP_ADDR) ? S.ChrBuf + 0x4000 : S.ChrBuf;

    S.vAddr_Latch = (S.vAddr_Latch & 0xF3FF )
                  | (( ((word) byData) & 0x0003 ) << 10);

    break;

case 0x2001: /* PPU Control Register 2 */
    S.PPU_R1 = byData;
    break;

case 0x2002: /* PPU Status - NOT WRITABLE */
    break;

case 0x2003: /* Sprite RAM ADDR */
    S.PPU_R3 = byData;
    break;

case 0x2004: /* Sprite RAM DATA */
    S.SPRRAM[ S.PPU_R3++ ] = byData;
    break;

case 0x2005: /* Scroll Register */
    if (S.PPU_Latch_Flag ^= 1) {
        S.PPU_R5A = byData;

        S.vAddr_Latch = ( S.vAddr_Latch & 0xFFE0 )
                      | ((( (word) byData ) & 0xF8) >> 3);

    } else {
        S.PPU_R5B = byData;
        if (S.PPU_R5B > 239)
            S.PPU_NameTable ^= NAME_TABLE_V_MASK;

        S.vAddr_Latch = (S.vAddr_Latch & 0xFC1F)
                      | ( ( ((word) byData ) & 0xF8 ) << 2);
        S.vAddr_Latch = (S.vAddr_Latch & 0x8FFF)
                      | ( ( ((word) byData ) & 0x07 ) << 12);
    }
    break;

case 0x2006: /* VRAM Address Register */
    if (S.PPU_Latch_Flag ^= 1) {
        S.PPU_R6A = byData;

        S.vAddr_Latch = (S.vAddr_Latch & 0x00FF )
                      | (( ((word) byData) & 0x003F) << 8);
    }
    else {
        S.PPU_R6B = byData;
    }

    S.vAddr_Latch = (S.vAddr_Latch & 0xFF00)
                  | ( ((word) byData) & 0x00FF);
    S.vAddr = S.vAddr_Latch;

    if (!( S.PPU_R2 & R2_IN_VBLANK))
        NESCore_Develop_Character_Data();
    break;


So based on that, I've no idea what the X/Y values are... it looks like he also computes nY and nX from inside the drawing routine, based on X/Y... I'm wondering if that's right.

Posted: Sun Sep 16, 2007 6:25 pm
by Josh
When I have some free time this week, I'll take a closer look at the code. It doesn't look all that complicated, so I can probably get it working on Windows with SDL (not the best platform, but very simple to get something up and running without writing too much extra wrapper code).

By the way, you really should replace Marat's 6502 core, as it's not compatible with the GPL. Besides, there are much better cores out there. I think blargg has one written in C that works well (and is probably faster).

Posted: Sun Sep 16, 2007 6:28 pm
by NerveGas
Hmm, Marat said that the LGPL would be compatible (which is what I've got the core licensed as), but I'm all up for replacing it with something faster - send me a link if you've got one.

The latest is committed to SVN - any help you can offer would be appreciated, and I'm serious about throwing a few hundred bucks at someone who can fix this mess (I've got other things to worry about).

Posted: Mon Sep 17, 2007 2:56 pm
by NerveGas
Disch wrote:
(addr >> 12) & 7 <--- fine Y scroll
(addr >> 5) & 0x1F <--- coarse Y
addr & 0x1F <--- coarse X
(addr >> 10) & 1 <--- NT X
(addr >> 11) & 1 <--- NT Y

or imagine the PPU address as a 15-bit value:

[.yyy VHYY YYYX XXXX]
y = fine Y
V = NT Y
H = NT X
Y = coarse Y
X = coarse X


Fine X scroll is stored separately and is never changed or updated except on the first $2005 write.

EDIT - really though, you shouldn't need this except for converting $2005 writes.

If you keep the PPU address updated properly (increment by 1 every 8 cycles, etc), then all you have to do to fetch the proper tile is:

Code:

tile = NameTableRead( 0x2000 | ( ppu_addr & 0x0FFF) );
chraddr = pattern_page | (tile << 4) | ((ppu_addr >> 12) & 7);

lo_chrbitplane = CHRRead( chraddr );
hi_chrbitplane = CHRRead( chraddr | 8 );
where 'pattern_page' is either 0x0000 or 0x1000 depending on which pattern table you're reading from.
OK, put this in engrish for someone who doesn't speak much NES-geek. I understand from this how to calculate the scroll values, but it sounds like I'm not supposed to calculate them every scanline... so where do I do it? Just in $2005 writes? What's the difference between fine/coarse X/Y? I have an X/Y _BYTE and _BIT in my code which get calculated - is this what you're referring to? Right now it's calculating the bit off the byte.

And where does this fine x come from?

I'm doing my best to make sense of the PPU - I'm just not getting it apparently. I had no problem fixing all the broken stuff in the pAPU.

Finally, where should I be loading the name table? At the beginning of each scanline, and then flipping it? I tried grabbing it every clock cycle but that seemed to cause more problems. I'm not clocking within the PPU rendering, so 84/85 clocks per scanline + 29 for hblank. That's around 1 clock every 3 pixels.

Making progress at least. I think my biggest issue is figuring out how to do the scroll values.

Posted: Mon Sep 17, 2007 3:23 pm
by Disch
This link may help:

http://nesdev.com/bbs/viewtopic.php?p=5578

There are no scroll registers in the NES. Instead, the PPU automatically adjusts the PPU address as tiles are rendered (increments it by 1 after every tile fetch... wraps around nametable boundaries -- that kind of thing). From the sound of it, your problem isn't that you're doing something at the wrong time, it's that you're doing the wrong thing overall (though the way you're doing it may be unavoidable due to you targeting lower performance systems -- in which case I doubt I'll be of any help).

Each scanline is 341 'dots' (or PPU cycles). On NTSC, there are 3 dots to every CPU cycle. Dots 0-255 each render 1 pixel. Every set of 8 dots (0-7, 8-15, etc) perform a tile fetch and during BG tile fetches, the PPU address is adjusted so that it points to the next tile.
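
The per-tile address adjustment described above could look roughly like the following. It is a sketch under the usual reading of Loopy's notes: the low 5 bits of the address are coarse X, and bit 10 selects the horizontal nametable. The function names `increment_coarse_x` and `current_tile_addr` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Timing constants quoted in the post above. */
enum {
    DOTS_PER_SCANLINE  = 341,
    DOTS_PER_CPU_CYCLE = 3, /* NTSC */
    DOTS_PER_TILE      = 8
};

/* Advance the PPU address to the next tile column.
   Called once per background tile fetch (every 8 dots). */
void increment_coarse_x(uint16_t *v) {
    assert(v);
    if ((*v & 0x001F) == 31) {
        *v = (uint16_t)(*v & ~0x001F); /* coarse X wraps from 31 to 0 */
        *v ^= 0x0400;                  /* and crosses into the other nametable */
    } else {
        *v += 1;
    }
}

/* Nametable address of the tile the PPU would fetch for address v. */
uint16_t current_tile_addr(uint16_t v) {
    return (uint16_t)(0x2000 | (v & 0x0FFF));
}
```

After 32 such increments coarse X is back where it started but the horizontal nametable bit has flipped once, which is why the horizontal bits have to be reloaded from the latch for the next scanline.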

I could probably do a better job explaining if I really examined the code you pasted -- but admittedly I didn't =x

Posted: Mon Sep 17, 2007 3:31 pm
by Dwedit
Rad Racer is a little tricky because it writes new values to the 'scroll registers' while the screen is rendering, before pixel clock 256. So the 'Y scroll value' will increase at pixel clock 256 and that applies to the next scanline.

(Is it pixel clock 256 when the 'Y-value' increments? Someone clarify this for me)

Posted: Mon Sep 17, 2007 4:20 pm
by Dwedit
All scrolling related values should be updated at the same time as the PPU address and ppu temp address.

If I make any mistakes in the explanation, please correct me. I'm pretty sure there are some mistakes in this. This is basically the result of looking at Loopy's scrolling document.

So we got these variables (my names)
Xscroll - this is what you see on the screen
Next_Xscroll - writes to 2000, 2005, 2006 affect this. The top 5 bits get copied to Xscroll's top 5 bits at the end of the scanline*.
Yscroll - This counts up after the end of a scanline*, and is set by the second write to 2006, and set at the first scanline.
Reset_Yscroll - At scanline 0, write this to yscroll*. Writes to 2000, Second write to 2005, Second write to 2006 affects this.

Writes to 2000 affect PPU temp address, Next_Xscroll and Reset_Yscroll
First write to 2005 affects the PPU temp address, Next_Xscroll, and the three lowest bits of Xscroll (Fine X Scroll)
Second write to 2005 affects the PPU temp address, and Reset_Yscroll
First write to 2006 affects the PPU temp address and Reset_Yscroll
Second write to 2006 affects the PPU temp address, PPU Address, Next_Xscroll, Xscroll, Yscroll, and Reset_Yscroll.

At pixel clock 256 (?), if the PPU is rendering (scanline is in visible range, and sprites or BG is enabled), Yscroll increments by 1. Top 5 bits of Next_Xscroll get copied to Xscroll.
At the start of the first scanline, if screen rendering is enabled, set PPU address to PPU temp address, write the 5 bits from Next_Xscroll, and set Yscroll to Reset_Yscroll.

So yeah, this is basically sticking a different label on the exact same thing.

*Changes to the visible scrolling values only happen if the screen is rendering, or it's the second write to 2006.
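
In terms of the single 15-bit PPU address, the "Yscroll increments by 1" step can be sketched as below. This assumes the layout given earlier in the thread (fine Y in bits 12-14, coarse Y in bits 5-9, vertical nametable select in bit 11) and the hardware quirk that coarse Y values 30 and 31 wrap to 0 without switching nametables. `increment_y` is a hypothetical name, and the exact dot at which it should run is left open, as it is in the post.

```c
#include <assert.h>
#include <stdint.h>

/* Advance the PPU address by one pixel row.
   Intended to run once per visible scanline while rendering is enabled. */
void increment_y(uint16_t *v) {
    assert(v);
    if ((*v & 0x7000) != 0x7000) {
        *v += 0x1000;                      /* fine Y < 7: just bump fine Y */
        return;
    }
    *v = (uint16_t)(*v & ~0x7000);         /* fine Y wraps to 0 */
    unsigned coarse_y = (*v & 0x03E0) >> 5;
    if (coarse_y == 29) {
        coarse_y = 0;                      /* last tile row of the nametable */
        *v ^= 0x0800;                      /* switch vertical nametable */
    } else if (coarse_y == 31) {
        coarse_y = 0;                      /* out-of-range row: wrap, no switch */
    } else {
        coarse_y++;
    }
    *v = (uint16_t)((*v & ~0x03E0) | (coarse_y << 5));
}
```

This is the same bookkeeping as Yscroll above, only stored inside the address instead of in a separate variable.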

Posted: Mon Sep 17, 2007 4:35 pm
by NerveGas
Let's do this the right way - forget I've got scroll variables... let me trash everything pertaining to scroll readers/writers... what's the best way to design this overall? I'm updating vAddr like this:

$2006(A)
S.vAddr_Latch = (S.vAddr_Latch & 0xFF)
| ((word) (byData & 0xFF) << 8);

$2006(B)
S.vAddr_Latch = (S.vAddr_Latch & 0xFF00)
| ((word) byData & 0xFF);
S.vAddr = S.vAddr_Latch & 0x3FFF;

So first off, have I got the correct vAddr setup? Also, it is incrementing it by 1 or 32 on a $2007 read or write.

Is this all correct? Am I missing any vAddr updates?
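
For reference, a rough sketch of the $2007 side of this (buffered reads plus the 1-or-32 increment). It assumes a flat 16 KB array for the PPU address space with no banking or mirroring, which is a simplification and not how NESCore's `W.PPUBANK` works; `Ppu`, `ppu_read_2007` and `ppu_write_2007` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  vram[0x4000]; /* flat PPU address space (assumption: no mirroring) */
    uint16_t v;            /* current VRAM address */
    uint8_t  read_buffer;  /* internal $2007 read buffer */
    uint8_t  ctrl;         /* last value written to $2000 */
} Ppu;

/* After every $2007 access the address moves by 1 or by 32 ($2000 bit 2). */
static void increment_after_access(Ppu *p) {
    p->v = (uint16_t)((p->v + ((p->ctrl & 0x04) ? 32 : 1)) & 0x7FFF);
}

uint8_t ppu_read_2007(Ppu *p) {
    assert(p);
    uint16_t addr = p->v & 0x3FFF; /* computed before either branch uses it */
    uint8_t result;
    if (addr < 0x3F00) {
        result = p->read_buffer;          /* non-palette reads lag by one */
        p->read_buffer = p->vram[addr];
    } else {
        result = p->vram[addr];           /* palette reads are immediate */
        p->read_buffer = p->vram[addr - 0x1000]; /* nametable byte underneath */
    }
    increment_after_access(p);
    return result;
}

void ppu_write_2007(Ppu *p, uint8_t data) {
    assert(p);
    p->vram[p->v & 0x3FFF] = data;
    increment_after_access(p);
}
```

The first read after setting the address returns stale buffer contents, so code that reads VRAM normally throws that first byte away.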

Once this is correct, moving on to the scrolling, let's say I'm writing $2005 writes directly to PPU_R5A and PPU_R5B registers, as well as $2006 writes (for posterity). When and how should I be calculating X/Y offsets based off this? The current DrawScanline routine uses an X_BYTE, X_BIT, Y_BYTE, and Y_BIT. I am assuming that's the coarse/fine settings - does that sound reasonable? It also calculates nY and nX based off of these values. I presume these are the tile numbers, as nY > 30 triggers a name table flip and -= 30. I can change any of this, just let me know what.

If all the above is correct, what values should I be computing, when, and how?

As for clocking, I am clocking 84/85 cycles + an additional 29 at HBLANK. This is being done at the time of rendering about every 3 pixels or so, so it should be timed correctly. Let me know if this is wrong... also which frames should be 84 and 85? even and odd? and is that based off of a zero-indexed scanline #?
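
On the 84/85 + 29 question: those add up to 113 or 114 CPU cycles per scanline, and 341 dots at 3 dots per CPU cycle is 113 and 2/3. One way to avoid hard-coding which scanlines get the extra cycle is to carry the leftover dots forward. This is only a sketch: `cpu_cycles_for_scanline` is a made-up helper, and it ignores finer details such as instructions overrunning the budget.

```c
#include <assert.h>

enum { DOTS_PER_SCANLINE = 341, DOTS_PER_CPU_CYCLE = 3 };

/* Returns how many CPU cycles to run for the next scanline.
   *leftover_dots carries the remainder (0-2 dots) between calls. */
int cpu_cycles_for_scanline(int *leftover_dots) {
    assert(*leftover_dots >= 0 && *leftover_dots < DOTS_PER_CPU_CYCLE);
    int dots = DOTS_PER_SCANLINE + *leftover_dots;
    *leftover_dots = dots % DOTS_PER_CPU_CYCLE;
    return dots / DOTS_PER_CPU_CYCLE;
}
```

Starting from zero leftover this yields 113, 114, 114, then repeats, so every three scanlines total exactly 341 CPU cycles.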

Thanks so much for the help - I really want to do this right. Off to read more docs. I have a feeling it's just a matter of "doing the right thing" but I need to understand what the right thing is. Sorry to sound like such a dumb-ass. I'm normally a very proficient coder, but I'm failing to understand wtf is going on here.

Posted: Mon Sep 17, 2007 4:47 pm
by Dwedit
S.vAddr gets updated at pixel clock 256 (?) of a scanline if rendering is enabled: it copies the lowest 5 bits and bit 10 (counting from 0, the horizontal nametable bit) from the latch.
S.vAddr = (S.vAddr & ~0x41F) | (S.vAddr_Latch & 0x41F);

S.vAddr also gets completely updated at the start of line 0 if rendering is enabled (first visible scanline) (not completely sure of the timing!). All 15 bits are copied, including the fine Y bits:
S.vAddr = S.vAddr_Latch & 0x7FFF;
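
Put together, the two reloads might be wrapped up like this. It is a sketch only: the helper names are invented, the "once per scanline" granularity is a simplification of the real dot timing (which this post is itself unsure about), and the frame-start copy is assumed to take all 15 bits of the latch.

```c
#include <assert.h>
#include <stdint.h>

#define H_MASK 0x041F /* coarse X (bits 0-4) + horizontal nametable (bit 10) */

/* Per-scanline reload: only the horizontal bits come from the latch. */
void copy_horizontal_bits(uint16_t *v, uint16_t latch) {
    assert(v);
    *v = (uint16_t)((*v & ~H_MASK) | (latch & H_MASK));
}

/* Frame-start reload: the whole 15-bit address comes from the latch. */
void copy_all_bits(uint16_t *v, uint16_t latch) {
    assert(v);
    *v = latch & 0x7FFF;
}

/* Hypothetical hook called once before each visible scanline is drawn. */
void scanline_start(uint16_t *v, uint16_t latch, int scanline, int rendering) {
    if (!rendering)
        return;                      /* no automatic reloads while blanked */
    if (scanline == 0)
        copy_all_bits(v, latch);
    else
        copy_horizontal_bits(v, latch);
}
```

A mid-frame $2006 write pair still sets the address directly; these reloads only cover what the PPU does on its own.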