
All times are UTC - 7 hours





Author Message
PostPosted: Wed Aug 07, 2013 2:33 am 

Joined: Sat Jan 03, 2009 3:28 pm
Posts: 59
Location: Oregon
Alright guys, bear with me please. I've read through all of the documents on the wiki and read through a ton of threads on here and there are still some things I just do not understand with the PPU. Maybe you guys can clear these things up for me.

1. How often are the PPU registers ($2000-$2007) looked at/updated? Does the PPU look at them all every cycle or every frame? Also, when are they checked? E.g., first thing at the start of a cycle/frame, or after a cycle/frame?
2. I'm still fuzzy on how some I/O registers work. For example $2007 is either a read or a write to PPU memory. From what I understand, the PPU checks $2006 twice to get the address that the CPU wants to read/write to and stores the address in a temporary VRAM address. Then the data written to $2007 is written into VRAM by the address specified in the temp address. After the write, the temporary address is incremented. If that's correct, then I get that part. How then, can $2007 be used as a read?
3. What is the base nametable that is referred to in register $2000? Does that just tell the PPU which nametable to read from first?
4. How does horizontal and vertical mirroring work? I've read multiple documents on this and I just don't understand how the nametables are arranged. Are the nametables physically mirrored, physically moved to different nametable locations in VRAM, or does the PPU just read from them in a different order? My original thought for horizontal was that nametable 1 would be on the left side (top and bottom) and nametable 2 would be on the right side (top and bottom). The wiki seems to tell me that all four nametables are used somehow.
5. My current idea on how to run my emulator would be to run the CPU and return the number of cycles the opcode that was executed took. Then I would take that information and run the PPU for 3 * cycles to catch up. For example, say some operation took 7 cycles, I would run the PPU for 21 cycles before executing another opcode. Is this an ok way of doing it?

This is all I have for now. The PPU has been worrying me since I started this project and it's proving to be quite confusing. I'm slowly getting there, though.

Thanks


PostPosted: Wed Aug 07, 2013 6:30 am

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10066
Location: Rio de Janeiro - Brazil
Dartht33bagger wrote:
1. How often are the PPU registers ($2000-$2007) looked at/updated? Does the PPU look at them all every cycle or every frame? Also, when are they checked? E.g., first thing at the start of a cycle/frame, or after a cycle/frame?

Writes to PPU registers take effect immediately. The CPU and PPU run in parallel, so as soon as the CPU writes to one of these registers, the value is sent to the PPU. The time it takes for a write to affect how the picture is rendered may vary, though. For example, the first write to $2005 (during rendering) changes the fine scroll immediately, but the coarse scroll does not change until the scanline ends. The second write to $2005 (Y scroll) simply doesn't do anything until the end of the frame. You can only emulate these effects correctly if you perform all the PPU tasks in the same order the PPU does them, in parallel with the CPU.
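As a rough illustration of the $2005 behavior described above, here is a minimal Python sketch. The class name ScrollRegs is hypothetical, and the field names t, x and w are borrowed from the "loopy" naming used on the wiki; it only models where each write lands, not the rendering that later consumes those values.

```python
class ScrollRegs:
    """Hypothetical holder for the PPU's internal scroll state (assumed names)."""

    def __init__(self):
        self.t = 0      # 15-bit temporary VRAM address (pending scroll)
        self.x = 0      # 3-bit fine X scroll, used by rendering right away
        self.w = False  # write toggle shared by $2005 and $2006

    def write_2005(self, value):
        value &= 0xFF
        if not self.w:
            # First write: fine X is live at once; coarse X waits in t
            # until the PPU copies it at the end of the scanline.
            self.x = value & 0x07
            self.t = (self.t & ~0x001F) | (value >> 3)
        else:
            # Second write: fine Y and coarse Y wait in t until the PPU
            # copies them at the end of the frame.
            self.t = (self.t & ~0x73E0) | ((value & 0x07) << 12) | ((value & 0xF8) << 2)
        self.t &= 0x7FFF
        self.w = not self.w
```

Because only x is read directly by rendering, the delayed effects fall out naturally as long as the emulated PPU copies t into its working address at the same points the real PPU does.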

Quote:
2. I'm still fuzzy on how some I/O registers work. For example $2007 is either a read or a write to PPU memory. From what I understand, the PPU checks $2006 twice to get the address that the CPU wants to read/write to and stores the address in a temporary VRAM address. Then the data written to $2007 is written into VRAM by the address specified in the temp address. After the write, the temporary address is incremented. If that's correct, then I get that part. How then, can $2007 be used as a read?

Exactly the same way. Reads, just like writes, cause the address to increment. The only difference is that there's a delay when reading. When you read from $2007, a buffered value is returned and the contents of the address being read go into that buffer, so that byte is only returned by the next read. Games that read from $2007 will often throw out the first value read because of this. This delay does not happen when reading from the palettes, though.
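The buffered read can be sketched in a few lines of Python. Everything here is illustrative: the class PpuData, its flat 16 KB memory array and its field names are assumptions, and the sketch leaves the buffer untouched on palette reads, which is a simplification of what the hardware does.

```python
class PpuData:
    """Hypothetical model of the $2007 read path only."""

    def __init__(self):
        self.mem = bytearray(0x4000)  # flat PPU address space (simplification)
        self.addr = 0                 # current VRAM address, normally set via $2006
        self.buffer = 0               # internal read buffer
        self.increment = 1            # 1 or 32, selected by $2000

    def read_2007(self):
        a = self.addr & 0x3FFF
        if a >= 0x3F00:
            # Palette reads are returned directly, with no delay.
            result = self.mem[a]
        else:
            # Other reads return the stale buffer, then refill it.
            result = self.buffer
            self.buffer = self.mem[a]
        # Reads increment the address just like writes do.
        self.addr = (self.addr + self.increment) & 0x7FFF
        return result
```

This is why code that reads VRAM through $2007 typically discards the first byte: it is whatever was left in the buffer from an earlier access.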

Quote:
3. What is the base nametable that is referred to in register $2000? Does that just tell the PPU which nametable to read from first?

Yes, that's the name table where rendering starts (i.e. the name table where the pixel that will show up at the top left corner of the screen is). Unless the scroll is (0, 0), you will see more than one name table at once.

Quote:
4. How does horizontal and vertical mirroring work? I've read multiple documents on this and I just don't understand how the nametables are arranged.

The NES has an addressing range of 4096 bytes dedicated to name tables (which are displayed as a 2x2 grid), but the NES has only 2048 bytes of memory for this, so the other 2048 are mirrored. Games get to pick whether the 2 available name tables are arranged horizontally (and mirrored vertically) or vertically (and mirrored horizontally):

Code:
Vertical mirroring (top and bottom look the same):
A B
A B

Horizontal mirroring (left and right look the same):
A A
B B


Quote:
Are the nametables physically mirrored, physically moved to different nametable locations in VRAM, or does the PPU just read from them in a different order?

The PPU address lines are manipulated so that the PPU reads from different parts of the available 2KB of memory.
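In an emulator, that address-line manipulation usually boils down to a small mapping function. The sketch below is one way to write it in Python; the function name and the "vertical"/"horizontal" strings are made up for illustration, and it covers only the two arrangements shown in the diagrams above.

```python
def nametable_index(addr, mirroring):
    """Map a PPU address in $2000-$2FFF to an offset in the 2 KB of internal VRAM."""
    offset = addr & 0x0FFF     # position inside the 4 KB name table window
    table = offset >> 10       # logical table 0..3 in the 2x2 grid, row by row
    within = offset & 0x03FF   # byte offset inside that 1 KB table
    if mirroring == "vertical":
        physical = table & 1   # A B / A B: left column is A, right column is B
    elif mirroring == "horizontal":
        physical = table >> 1  # A A / B B: top row is A, bottom row is B
    else:
        raise ValueError("unsupported mirroring: " + repr(mirroring))
    return physical * 0x400 + within
```

For example, with vertical mirroring $2000 and $2800 both land at offset 0, while with horizontal mirroring $2000 and $2400 do.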

Quote:
My original thought for horizontal was that nametable 1 would be on the left side (top and bottom) and nametable 2 would be on the right side (top and bottom). The wiki seems to tell me that all four nametables are used somehow.

They are arranged that way when vertical mirroring is used. The PPU doesn't know that there are only 2KB; it thinks it's accessing 4KB of data, so as far as it's concerned there are 4 name tables. But since that extra 2KB doesn't exist, the 2KB that does exist is used twice. You should know that carts do have the option of disabling the internal 2KB of VRAM and supplying 4KB of their own for the name tables (this is called 4-screen "mirroring" - quotes used because there's no mirroring involved, despite the name).

Quote:
5. My current idea on how to run my emulator would be to run the CPU and return the number of cycles the opcode that was executed took. Then I would take that information and run the PPU for 3 * cycles to catch up. For example, say some operation took 7 cycles, I would run the PPU for 21 cycles before executing another opcode. Is this an ok way of doing it?

That should be OK for most games, but this is not how an actual console works. For example, if a program is waiting for a sprite 0 hit, it will constantly read $2002 waiting for the sprite hit flag to get set. If the LDA $2002 instruction starts before the sprite hit, and the hit happens before that instruction ends (i.e. within 12 pixels), your emulator will report the flag as clear when it should be set. You'd have to run things cycle by cycle to catch this.
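For reference, the catch-up scheme from the question looks roughly like this in Python. StubCpu and StubPpu are placeholders invented for the example; a real CPU core would decode and execute an opcode in step() and a real PPU would render a dot in tick().

```python
class StubCpu:
    """Placeholder CPU: step() just reports a pre-set cycle count per instruction."""

    def __init__(self, cycle_counts):
        self._cycles = iter(cycle_counts)

    def step(self):
        return next(self._cycles)


class StubPpu:
    """Placeholder PPU: tick() only counts dots."""

    def __init__(self):
        self.dots = 0

    def tick(self):
        self.dots += 1


def run(cpu, ppu, instructions):
    for _ in range(instructions):
        cycles = cpu.step()           # execute one whole instruction
        for _ in range(3 * cycles):   # then let the PPU catch up, 3 dots per CPU cycle
            ppu.tick()
```

The weakness described above is visible in the loop: any register read inside step() sees the PPU as it was before the instruction started.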


PostPosted: Wed Aug 07, 2013 1:43 pm

Joined: Sat Jan 03, 2009 3:28 pm
Posts: 59
Location: Oregon
Thank you! Things are starting to make a lot more sense now. Two more quick questions.

1. How does the PPU know if $2007 wants a read or a write? Does the CPU write a nonzero value for a write and a zero value for a read?
2. When a $2007 read occurs, where does the data read go to? Does the PPU just send the read byte to $2007 for the CPU?
3. I want to try to get SMB1 to run since it has no mapper, but I know it's a tricky game to emulate. I also know it uses the sprite 0 hit flag. I'm assuming I need to use something like OpenMP to get the CPU and PPU running in parallel, but I'm stumped on how to run the CPU cycle by cycle. How do I, for example, split up an add with carry operation into multiple steps so that only a little of the operation happens per cycle?

I won't work on question 3 for a while since I just want to get a game working first.


PostPosted: Wed Aug 07, 2013 1:55 pm
Formerly 65024U

Joined: Sat Mar 27, 2010 12:57 pm
Posts: 2257
1. There's a R/W pin on the CPU. On each access the PPU looks at that pin and performs the matching action, whether it be a read or a write.

2. The byte first goes to a buffer and is not directly read out. All reads are delayed by one. I'm not exactly sure where or how it's stored, but that's how it works.

3. You do that by looking at the 6502 cycle-by-cycle operations for the instructions and implement them like that. :)


PostPosted: Wed Aug 07, 2013 2:25 pm

Joined: Sun Apr 13, 2008 11:12 am
Posts: 6292
Location: Seattle
3gengames wrote:
2. The byte first goes to a buffer and is not directly read out. All reads are delayed by one. I'm not exactly sure where or how it's stored, but that's how it works.

Except for reads from palette memory (Those aren't delayed). I don't know if any games care, though.


PostPosted: Wed Aug 07, 2013 2:33 pm
Formerly 65024U

Joined: Sat Mar 27, 2010 12:57 pm
Posts: 2257
Trust me, games probably care; that's a big difference. But yep, that is right.


PostPosted: Wed Aug 07, 2013 9:02 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10066
Location: Rio de Janeiro - Brazil
Dartht33bagger wrote:
1. How does the PPU know if $2007 wants a read or a write? Does the CPU write a nonzero value for a write and a zero value for a read?

What? Where did you get this zero/non-zero idea from? Like 3gengames said, one of the CPU pins indicates whether it's trying to read or write data, and the PPU uses that to tell reads and writes apart.

Fun fact: the Atari 2600 doesn't send the R/W signal to the cart, so carts with extra RAM use one address range for writing and another range for reading. For example, a cart with 256 bytes of RAM will make this memory writable at $1000-$10FF and readable at $1100-$11FF (it uses an address line to select between reading/writing).

Quote:
2. When a $2007 read occurs, where does the data read go to? Does the PPU just send the read byte to $2007 for the CPU?

When $2007 is read, the buffered value is sent to the CPU and the value read from the PPU goes to the buffer.

Quote:
I'm assuming I need to use something like OpenMP to get the CPU and PPU running in parallel

It doesn't have to be anything fancy; you can simply alternate between emulating the 2 chips. Things only get complex if you want both accuracy AND speed. Things get much simpler if you only care about one or the other.
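To make "alternate between the 2 chips" concrete, here is a small Python comparison built around the sprite 0 case mentioned earlier in the thread. TinyPpu, hit_dot and both helper functions are invented for the example, and the exact alignment of the read within its CPU cycle is an assumption; the only hardware facts relied on are that LDA absolute takes 4 cycles with the data read on the last one, and that bit 6 of $2002 is the sprite 0 hit flag.

```python
class TinyPpu:
    """Hypothetical PPU that raises the sprite 0 hit flag at a chosen dot."""

    def __init__(self, hit_dot):
        self.dot = 0
        self.hit_dot = hit_dot
        self.status = 0

    def tick(self):
        self.dot += 1
        if self.dot == self.hit_dot:
            self.status |= 0x40  # bit 6 of $2002: sprite 0 hit

    def read_status(self):
        return self.status


def advance(ppu, dots):
    for _ in range(dots):
        ppu.tick()


def lda_2002_interleaved(ppu):
    advance(ppu, 3 * 3)        # CPU cycles 1-3: opcode and address fetches
    value = ppu.read_status()  # CPU cycle 4: the actual read of $2002
    advance(ppu, 3)
    return value


def lda_2002_catch_up(ppu):
    value = ppu.read_status()  # whole instruction runs before the PPU moves
    advance(ppu, 4 * 3)
    return value
```

With a hit at dot 6, the interleaved version returns $40 while the catch-up version returns 0 and only notices the hit on the following instruction.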

Quote:
but I'm stumped on how to run the CPU cycle by cycle. How do I, for example, split up an add with carry operation into multiple steps so that only a little of the operation happens per cycle?

There are documents such as this one (scroll down) that explain what happens on each cycle of the various instructions.

Quote:
I won't work on question 3 for a while since I just want to get a game working first.

Yeah, you shouldn't bother with cycle accurate emulation for now.


PostPosted: Thu Aug 08, 2013 1:01 am

Joined: Sat Jan 03, 2009 3:28 pm
Posts: 59
Location: Oregon
Thank you!

One final question for now: How do I emulate the CPU pin for reads and writes on $2007? Do certain opcodes tell the CPU to read or write from that register?


PostPosted: Thu Aug 08, 2013 7:36 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
The most commonly encountered CPU instructions for this are LDA $2007 (read) and STA $2007 (write). There are others.


EDIT: mislead less


PostPosted: Thu Aug 08, 2013 12:40 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10066
Location: Rio de Janeiro - Brazil
Dartht33bagger wrote:
How do I emulate the CPU pin for reads and writes on $2007? Do certain opcodes tell the CPU to read or write from that register?

Emulators don't usually emulate individual pins. This would actually be a good idea if it weren't so painfully slow.

You will have to emulate all the instructions one by one, so you absolutely MUST know whether an instruction is reading or writing. Emulators usually have a method that handles CPU writes (that all store instructions can call) and another one that handles CPU reads (that all load instructions can call), and these methods perform a range check to know what to do when particular addresses are accessed. If you detect that a write is being made to $2000-$2007 (or mirrors of that range) you call the appropriate PPU methods to process the write. The same goes for reads, and you pass along to the CPU whatever the PPU returns.
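A minimal Python sketch of that read/write dispatch might look like the following. The names Bus, StubPpuRegs, read_register and write_register are all hypothetical; the sketch assumes the usual layout in which 2 KB of CPU RAM repeats up to $1FFF and the eight PPU registers repeat every 8 bytes from $2000 to $3FFF, and it ignores everything above that.

```python
class StubPpuRegs:
    """Placeholder PPU that just stores register values and logs writes."""

    def __init__(self):
        self.regs = [0] * 8
        self.log = []

    def read_register(self, index):
        return self.regs[index]

    def write_register(self, index, value):
        self.regs[index] = value
        self.log.append((index, value))


class Bus:
    """Hypothetical CPU bus: every load goes through cpu_read, every store through cpu_write."""

    def __init__(self, ppu):
        self.ram = bytearray(0x800)  # 2 KB of CPU RAM
        self.ppu = ppu

    def cpu_read(self, addr):
        addr &= 0xFFFF
        if addr < 0x2000:
            return self.ram[addr & 0x07FF]
        if addr < 0x4000:
            # $2000-$2007 and their mirrors: hand the access to the PPU.
            return self.ppu.read_register(addr & 0x0007)
        return 0  # APU, I/O and cartridge space omitted in this sketch

    def cpu_write(self, addr, value):
        addr &= 0xFFFF
        value &= 0xFF
        if addr < 0x2000:
            self.ram[addr & 0x07FF] = value
        elif addr < 0x4000:
            self.ppu.write_register(addr & 0x0007, value)
        # other ranges omitted in this sketch
```

With this in place, STA $2007 is just a store that calls cpu_write(0x2007, a), and LDA $2007 is a load that calls cpu_read(0x2007); no pin needs to be modeled.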


PostPosted: Thu Aug 08, 2013 12:49 pm

Joined: Fri Dec 30, 2011 7:15 am
Posts: 41
Location: Sweden
Speaking of $2007, there's something I'm unsure of. Looking at the "skinny on nes scrolling" page, I can see that X and Y scrolling updates every now and then during a frame, sometimes resulting in a wrap-around and nametable toggle. Does this apply to reads and writes to $2007? Based on most text on the wiki, it seems like 1 or 32 just gets added to the vram address, with no wrapping or anything. Is that correct?

Oh, and this: "If rendering is enabled" - is this if certain bits are set in $2001? 0x1E? Or is 0xA enough (for BG rendering)?


PostPosted: Thu Aug 08, 2013 1:03 pm

Joined: Sun Sep 19, 2004 10:59 pm
Posts: 1389
fred wrote:
Speaking of $2007, there's something I'm unsure of. Looking at the "skinny on nes scrolling" page, I can see that X and Y scrolling updates every now and then during a frame, sometimes resulting in a wrap-around and nametable toggle. Does this apply to reads and writes to $2007? Based on most text on the wiki, it seems like 1 or 32 just gets added to the vram address, with no wrapping or anything. Is that correct?


Yes.

fred wrote:
Oh, and this: "If rendering is enabled" - is this if certain bits are set in $2001? 0x1E? Or is 0xA enough (for BG rendering)?

Enabling either background or sprite rendering is sufficient - even if only one is enabled, the PPU still does all of the work to render both (it just discards whichever one is turned off when it comes time to output the pixels themselves).

_________________
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.


PostPosted: Thu Aug 08, 2013 1:05 pm

Joined: Fri Mar 08, 2013 9:55 pm
Posts: 349
Location: Linköping, Sweden
fred wrote:
Speaking of $2007, there's something I'm unsure of. Looking at the "skinny on nes scrolling" page, I can see that X and Y scrolling updates every now and then during a frame, sometimes resulting in a wrap-around and nametable toggle. Does this apply to reads and writes to $2007? Based on most text on the wiki, it seems like 1 or 32 just gets added to the vram address, with no wrapping or anything. Is that correct?


Accessing $2007 during VBlank and when rendering is disabled (which means that neither background nor sprite rendering is enabled in $2001, i.e. that bits 3 and 4 are both zero) increments the address linearly by either 1 or 32. Accessing $2007 outside of VBlank with rendering enabled (this is seldom done) performs a glitchy update that takes its parts from the kinds of updates that are normally done during rendering.

Internally, the same register (loopy_v) is used both to hold the address for $2006/$2007 and during rendering to keep track of the current nametable location being rendered (along with fine x). Saves on hardware.
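The two checks discussed in these replies can be sketched as small Python helpers. The function names are invented, and the sketch assumes the increment size is selected by bit 2 of $2000 and that the address register is 15 bits wide; it covers only the simple linear case (VBlank or rendering disabled), not the glitchy update during rendering.

```python
def rendering_enabled(ppumask):
    """True if background (bit 3) or sprite (bit 4) rendering is on in $2001."""
    return (ppumask & 0x18) != 0


def next_vram_addr(addr, ppuctrl):
    """Linear $2007 increment used during VBlank or with rendering disabled."""
    step = 32 if ppuctrl & 0x04 else 1  # assumed: bit 2 of $2000 picks 1 or 32
    return (addr + step) & 0x7FFF       # plain add, no nametable-style wrapping
```

So a $2001 value of 0x0A, which has bit 3 set, already counts as rendering enabled.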


PostPosted: Thu Aug 08, 2013 1:09 pm

Joined: Fri Dec 30, 2011 7:15 am
Posts: 41
Location: Sweden
Ah, that clears it up. Thanks to you both!


PostPosted: Thu Aug 08, 2013 6:25 pm
Formerly Fx3

Joined: Fri Nov 12, 2004 4:59 pm
Posts: 3064
Location: Brazil
tokumaru wrote:
Dartht33bagger wrote:
1. How does the PPU know if $2007 wants a read or a write? Does the CPU write a nonzero value for a write and a zero value for a read?

What? Where did you get this zero/non-zero idea from? Like 3gengames said, one of the CPU pins indicates whether it's trying to read or write data, and the PPU uses that to tell reads and writes apart.


Avoid technical or low-level things. He's writing an emulator.

How does the PPU know if $2007 wants a read or a write?
- First, you need the 6502 program code. You should trap reads made by the LDA $2007 instruction and writes made by STA $2007. Look up the LDA/STA timing diagrams. Easy.

