A bunch of weird, intricate questions.

Post by BeTheDuck » Tue Dec 22, 2009 4:59 pm

I've been lurking around NesDev and the wiki for a few years now, and I'm finally trying to take a stab at writing an emulator. My goal is, of course, to get my information as correct as possible, even before I start hacking the thing together.

So I've been reading about the PPU and its insanity, and I have some weird questions. I'm not sure if these are even really important, but I'm really trying to go beyond "making an emulator that works" and into the realm of a working, accurate model of what the NES really does.

1. When the PPU goes into VBlank and sets off an NMI, when does the CPU catch it? I mean, let's say the CPU is running some instruction that takes 6 CPU clocks [like STA ($44),Y], and VBlank hits right as it starts running that instruction. Would the CPU wait to recognize the interrupt until the instruction is finished, fully 18 PPU cycles after it actually started? It seems like the obvious answer is "yes", but it also seems like being anywhere from 0 to 18 PPU cycles off during a precision technique would be a bad thing. Is that just something the programmer would have to deal with, and not the emulator?

2. Along the same lines, let's say I'm storing a value into a PPU register, something like "STA $2007". Now, that instruction takes 4 CPU cycles, or 12 PPU cycles. At what point does the actual value in VRAM change? I'll bet emulators just assume it changes after all 12 cycles are done, and I'll bet this is perfectly fine, but... I guess I don't want to enter into the PPU lightly and make assumptions everywhere. If this timing is different than I assume, it could change the performance of, say, a demo that runs perfectly fine on an NES.

3. In the same vein as #2, when reading a value in from the PPU (say, LDA $2007), on which PPU cycle would that read happen? The case I'm trying to account for here is, let's say the PPU is rendering the screen, and I read in from $2007 without altering the VRAM address myself. If the LDA takes 12 PPU cycles to complete, my VRAM address could be 2 or 3 different values depending on when the read really happens. (EDIT: I'm also assuming that somewhere in the PPU's VRAM reads, it increments the VRAM address, but I don't know which cycle THIS happens on either.)

4. I've mostly figured out the Loopy scrolling stuff, but there's a part of it that's bugging me. The original document has this shorthand:

2006 first write:
...and from my understanding, Loopy_T can be interpreted in this format:


...where F is the fine y-scrolling (within a tile), N switches between the 4 name tables, and X and Y point to the current tile within the current name table. So if writing 00111111 to 2006 does what Loopy writes, those bits are written here:


...which alters the name table, the Y scroll, and... two bits of the fine y-scrolling. Is this right? Why would a write to $2006 affect the scroll within a tile? And if this is right, do those bits correspond to the highest bits in the data written, i.e. writing $30 (or $70, or $B0, or $F0) to $2006 would result in a fine-scroll value of 3?
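For reference, here is a minimal Python sketch of the first $2006 write. It assumes the commonly documented 15-bit layout of Loopy_T: fine Y in bits 14-12, name table select in bits 11-10, coarse Y in bits 9-5, and coarse X in bits 4-0. The function names are made up for illustration, and the second write and the write toggle are left out.

```python
def write_2006_first(t, d):
    """Apply the first $2006 write to the 15-bit temporary address t."""
    # The low 6 bits of the data byte land in bits 13-8 of t.
    # Bit 14 (the top fine Y bit) is cleared; bits 7-0 are untouched.
    return (t & 0x00FF) | ((d & 0x3F) << 8)


def fields(t):
    """Split t into (fine_y, nametable, coarse_y, coarse_x)."""
    return ((t >> 12) & 0x7, (t >> 10) & 0x3, (t >> 5) & 0x1F, t & 0x1F)


if __name__ == "__main__":
    for d in (0x30, 0x70, 0xB0, 0xF0, 0x3F):
        print(hex(d), fields(write_2006_first(0, d)))
```

Under this model, $30, $70, $B0, and $F0 all leave fine Y at 3, because only the low 6 bits of the written byte are used and the top fine Y bit is always cleared by this write.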


I hope I haven't proven myself TOO insane. Any help would be greatly appreciated.

Also, hi. This is without a doubt my most eloquent first post on anything, ever.

Post by Dwedit » Tue Dec 22, 2009 7:11 pm

If you look at all the coarse scrolling bits of Loopy T and Loopy V, you will see that they are exactly the same as how you would calculate the address of a tile within the nametable: Coarse X + 32 * Coarse Y + Horizontal Nametable * 1024 + Vertical Nametable * 2048. The fine Y bits are just the remaining bits of the "VRAM address", used as fine Y scroll values instead of as part of a VRAM address.
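The formula above can be sketched in Python as follows. The function names are hypothetical, and the $2000 base (where the name tables start in PPU address space) is added here so that the result is a full PPU address rather than an offset.

```python
NAMETABLE_BASE = 0x2000  # start of name table space in PPU memory


def tile_address(coarse_x, coarse_y, h_nametable, v_nametable):
    """Tile address computed from the individual scroll fields."""
    return (NAMETABLE_BASE + coarse_x + 32 * coarse_y
            + 1024 * h_nametable + 2048 * v_nametable)


def tile_address_from_v(v):
    """Same address taken directly from the low 12 bits of Loopy V."""
    return NAMETABLE_BASE | (v & 0x0FFF)


if __name__ == "__main__":
    print(hex(tile_address(31, 29, 1, 1)))
```

Both functions give the same result for any value of V, which is the point being made: the low 12 bits already are the tile offset, and only the fine Y bits above them need to be masked off.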

Supposedly, the write would take place as the Write step of the instruction finishes.
This file tells you when the writes occur within each 6502 instruction: http://nesdev.com/6502_cpu.txt
Effectively, the write step is always the last step, except for JSR, BRK, and interrupts pushing to the stack.
Read-Modify-Write instructions do two writes for some reason: first the original value, then the modified value. The MMC1 mapper ignores the second write for those instructions.
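As a rough illustration of the double write, here is a Python sketch of the operand accesses made by INC with absolute addressing. It assumes the six-cycle sequence for absolute read-modify-write instructions (opcode and address fetches on cycles 1-3, then read, dummy write, real write). The function name and the dict-based memory are hypothetical stand-ins for a real bus.

```python
def inc_absolute_bus_trace(memory, addr):
    """Return the (cycle, op, address, value) accesses INC abs makes on its operand."""
    old = memory[addr]
    new = (old + 1) & 0xFF  # 8-bit wraparound
    trace = [
        (4, "R", addr, old),  # read the operand
        (5, "W", addr, old),  # dummy write of the unmodified value
        (6, "W", addr, new),  # write of the incremented value
    ]
    memory[addr] = new
    return trace


if __name__ == "__main__":
    mem = {0x8000: 0xFF}
    for access in inc_absolute_bus_trace(mem, 0x8000):
        print(access)
```

An emulator that delivers both writes to the mapper or to a PPU register, in this order and on these cycles, can then let the receiving hardware decide what to do with the first one.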