Trying to tackle the PPU and timing

NewDietCoke248903
Posts: 7
Joined: Sun Apr 26, 2015 8:16 am

Trying to tackle the PPU and timing

Post by NewDietCoke248903 » Sun Apr 26, 2015 8:24 am

I'm writing an NES emulator and I've completed the CPU with cycle timing.

I'm now working on the PPU but having an issue trying to grasp how to actually do timing/drawing.

The CPU and the PPU have fixed frequencies. The frequency of the CPU makes sense to me: the CPU executes instructions in memory, each having its own cycle count.
The PPU, on the other hand, doesn't actually execute any instructions, yet it still has a clock rate. From the emulator's perspective, I'm having difficulty figuring out what to do with the clock rate of the PPU, and how to tie that into drawing pixels on the screen.

My code is like this:

Code:

while (1)
{
    int cpu_cycles = execute_next_cpu_instruction();
    wait_until_cpu_cycles_elapse(cpu_cycles);
}

But I don't understand where the PPU clock should tie into this. Am I approaching this wrong?

If anyone has any advice or articles that they can point me to, I'd greatly appreciate it. Thanks.

tokumaru
Posts: 11991
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Trying to tackle the PPU and timing

Post by tokumaru » Sun Apr 26, 2015 8:32 am

1 PPU cycle is 1 pixel. The PPU doesn't read instructions from memory, but it certainly follows an internal sequence of pre-programmed tasks. This page describes what the PPU does on each cycle.
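One way to picture "1 PPU cycle is 1 pixel" is a dot counter that the emulator advances alongside the CPU. The sketch below is only an illustration, not anyone's actual emulator: it assumes NTSC timing (3 PPU dots per CPU cycle, 341 dots per scanline, 262 scanlines per frame with the pre-render line numbered -1), ignores finer details such as the dot skipped on odd frames, and the names `Ppu`, `ppu_tick` and `ppu_run_for_cpu_cycles` are made up for this example.

```c
#include <stdint.h>

/* Assumed NTSC geometry; PAL and other regions differ. */
enum { DOTS_PER_LINE = 341, LINES_PER_FRAME = 262, PPU_DOTS_PER_CPU_CYCLE = 3 };

typedef struct {
    int dot;        /* 0..340 within the current scanline */
    int scanline;   /* -1 (pre-render) through 260 */
    uint64_t frame; /* number of frames completed so far */
} Ppu;

/* Advance the PPU by one dot. A real PPU would also do its
 * per-dot work (fetches, pixel output) at this point. */
void ppu_tick(Ppu *p)
{
    if (++p->dot == DOTS_PER_LINE) {
        p->dot = 0;
        if (++p->scanline == LINES_PER_FRAME - 1) {
            p->scanline = -1;   /* wrap back to the pre-render line */
            p->frame++;
        }
    }
}

/* Hypothetical glue: call after each CPU instruction with the
 * number of CPU cycles that instruction took. */
void ppu_run_for_cpu_cycles(Ppu *p, int cpu_cycles)
{
    for (int i = 0; i < cpu_cycles * PPU_DOTS_PER_CPU_CYCLE; i++)
        ppu_tick(p);
}
```

With something like this, the loop from the first post could call `ppu_run_for_cpu_cycles(&ppu, cpu_cycles)` right after `execute_next_cpu_instruction()`, so the PPU position is always known in terms of scanline and dot.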

NewDietCoke248903
Posts: 7
Joined: Sun Apr 26, 2015 8:16 am

Re: Trying to tackle the PPU and timing

Post by NewDietCoke248903 » Sun Apr 26, 2015 9:41 am

tokumaru wrote: 1 PPU cycle is 1 pixel. The PPU doesn't read instructions from memory, but it certainly follows an internal sequence of pre-programmed tasks. This page describes what the PPU does on each cycle.

Okay -- I read that section on the wiki. I'm a little confused at this (my interpretation from the wiki link):

Each pixel is drawn with 1 PPU cycle.

However, during each pixel it draws, it also performs memory accesses (NT, AT, TBL, TBH), each taking 2 cycles, for a total of 8 PPU cycles.

Those memory accesses are for future pixel plotting, so that when it's time to draw the pixel associated with those memory accesses, it can do it in 1 cycle.

So for example:

On the prerender scanline (-1), when drawing pixel 0, it does its memory accesses (NT, AT, TBL, TBH) for 8 PPU cycles to get the pixel data for visible scanline 0, pixel 0?

Because it's 8 PPU cycles to get the pixel data (even though it starts a scanline early), I would imagine that it would creep up on the pixel that needs to be drawn and eventually stall, since it really requires 8 PPU cycles of memory access to get the pixel data. By the time it starts pulling the pixel data for scanline 0, pixel 1, it's already on prerender scanline -1, pixel 8.

Dwedit
Posts: 4408
Joined: Fri Nov 19, 2004 7:35 pm

Re: Trying to tackle the PPU and timing

Post by Dwedit » Sun Apr 26, 2015 9:51 am

It actually starts a little earlier than pixel 0, it's cycles 321-336 of the previous scanline that get the first 16 pixels ready. Then cycles 337-340 are garbage fetches that don't matter. Then it's at cycle 0, and it starts outputting pixels it has stored (fine scroll determines which pixels it displays), and starts fetching the next tiles.

Prerender line is mostly junk that doesn't matter, except for the very end (cycles 321-336), which fetches the first 16 pixels that get drawn on the first visible scanline. Also at dots 280-304 of the prerender line, the scrolling event V=T happens every cycle.
First visible scanline doesn't have sprites either.
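The dot ranges described above can be written down as a small lookup, which may help when deciding what the PPU should be doing at a given position. This is a rough sketch under assumptions: the phase names are invented, dots 257-320 are lumped into "other" because the post does not cover them, and treating everything up to dot 256 as the output-and-fetch region is a guess at the exact boundary (the post only says output starts at cycle 0).

```c
typedef enum {
    PHASE_OUTPUT_AND_FETCH, /* outputting stored pixels while fetching the next tiles */
    PHASE_OTHER,            /* dots 257-320: not described in the post */
    PHASE_PREFETCH,         /* dots 321-336: first 16 pixels of the next scanline */
    PHASE_GARBAGE           /* dots 337-340: fetches that do not matter */
} BgPhase;

/* Classify a dot (0..340) within a scanline. */
BgPhase bg_phase(int dot)
{
    if (dot >= 337) return PHASE_GARBAGE;
    if (dot >= 321) return PHASE_PREFETCH;
    if (dot <= 256) return PHASE_OUTPUT_AND_FETCH; /* assumed boundary */
    return PHASE_OTHER;
}

/* On the pre-render line (scanline -1), dots 280-304 perform the
 * V=T scrolling event on every cycle. */
int copies_t_to_v(int scanline, int dot)
{
    return scanline == -1 && dot >= 280 && dot <= 304;
}
```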
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!

NewDietCoke248903
Posts: 7
Joined: Sun Apr 26, 2015 8:16 am

Re: Trying to tackle the PPU and timing

Post by NewDietCoke248903 » Sun Apr 26, 2015 10:07 am

Dwedit wrote: It actually starts a little earlier than pixel 0, it's cycles 321-336 of the previous scanline that get the first 16 pixels ready. Then cycles 337-340 are garbage fetches that don't matter. Then it's at cycle 0, and it starts outputting pixels it has stored (fine scroll determines which pixels it displays), and starts fetching the next tiles.

Prerender line is mostly junk that doesn't matter, except for the very end (cycles 321-336), which fetches the first 16 pixels that get drawn on the first visible scanline. Also at dots 280-304 of the prerender line, the scrolling event V=T happens every cycle.
First visible scanline doesn't have sprites either.

The four memory accesses:

Nametable byte
Attribute table byte
Tile bitmap low
Tile bitmap high (+8 bytes from tile bitmap low)

Are these four fetches for an individual pixel or an individual tile? I had originally said pixel, but if it just does the fetches for the tile, then it has all the data it needs to plot an 8x8 tile in 8 cycles.
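To see what the two tile bitmap bytes carry, it can help to decode them by hand. The sketch below assumes the usual NES pattern layout: 16 bytes per tile, the high plane 8 bytes after the low plane, and bit 7 as the leftmost pixel. One low/high pair then yields one 8-pixel row of a tile rather than a single pixel. The function names are invented for this example, and the attribute byte (palette selection) is left out.

```c
#include <stdint.h>

/* Decode one 8-pixel row from the two pattern bytes.
 * out[0] is the leftmost pixel; each entry is a 2-bit colour index (0..3). */
void decode_tile_row(uint8_t low, uint8_t high, uint8_t out[8])
{
    for (int x = 0; x < 8; x++) {
        int bit = 7 - x; /* bit 7 holds the leftmost pixel */
        out[x] = (uint8_t)(((low >> bit) & 1) | (((high >> bit) & 1) << 1));
    }
}

/* Address of the low pattern byte for one row of a tile.
 * table_base is 0x0000 or 0x1000, tile is the nametable byte,
 * fine_y is the row within the tile (0..7).
 * The matching high byte lives at this address + 8. */
uint16_t pattern_low_addr(uint16_t table_base, uint8_t tile, uint8_t fine_y)
{
    return (uint16_t)(table_base + tile * 16u + fine_y);
}
```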

Dwedit
Posts: 4408
Joined: Fri Nov 19, 2004 7:35 pm

Re: Trying to tackle the PPU and timing

Post by Dwedit » Sun Apr 26, 2015 10:24 am

They are for a tile.
It still needs to have 16 pixels ready before it fetches stuff at cycle 0 of the scanline, because of fine scrolling and all that.
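A common way to model the "16 pixels ready" requirement is a pair of 16-bit shift registers, one per bitplane, with fine X choosing which bit is output. The following is a minimal sketch under that assumption; `BgShifter` and its functions are invented names, and attribute (palette) data is omitted.

```c
#include <stdint.h>

typedef struct {
    uint16_t lo, hi; /* 16 pixels of pattern data; bit 15 is the next pixel at fine X = 0 */
} BgShifter;

/* Load a freshly fetched tile row into the low 8 bits,
 * behind the row that is currently being drawn. */
void shifter_load(BgShifter *s, uint8_t low, uint8_t high)
{
    s->lo = (uint16_t)((s->lo & 0xFF00) | low);
    s->hi = (uint16_t)((s->hi & 0xFF00) | high);
}

/* Current pixel value (0..3). fine_x (0..7) selects how far
 * into the 16 buffered pixels the output is taken from. */
uint8_t shifter_pixel(const BgShifter *s, uint8_t fine_x)
{
    uint16_t mask = (uint16_t)(0x8000u >> fine_x);
    return (uint8_t)(((s->lo & mask) ? 1 : 0) | ((s->hi & mask) ? 2 : 0));
}

/* Advance by one pixel (one PPU dot). */
void shifter_shift(BgShifter *s)
{
    s->lo = (uint16_t)(s->lo << 1);
    s->hi = (uint16_t)(s->hi << 1);
}
```

With a fine X of 7, the selected bit crosses into the second tile after a single shift, which is why one buffered tile would not be enough.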

NewDietCoke248903
Posts: 7
Joined: Sun Apr 26, 2015 8:16 am

Re: Trying to tackle the PPU and timing

Post by NewDietCoke248903 » Sun Apr 26, 2015 12:15 pm

Okay -- Would this be a decent algorithm then:

Code:

void start()
{
    while (1)
    {
        exec_cpu();
    }
}

void exec_cpu()
{
    /* Before an instruction that touches the PPU, bring the PPU up to date. */
    if (cpu_instruction_is_about_to_require_ppu_mem_access())
    {
        execute_ppu(cpu_cycles_executed_before_this_instruction);
    }
    ....
}

void execute_ppu(int cpu_cycles)
{
    catch_up_and_draw_based_on_where_the_cpu_is(cpu_cycles);
    ....
}

Essentially, the CPU drives. We keep executing CPU instructions (and counting CPU cycles) until we discover that the next CPU instruction is going to affect the PPU. When that happens, we catch up the PPU to the same point as the CPU, right before it was about to execute that instruction.

If that's a valid algorithm, could accuracy be an issue?
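For illustration, the catch-up idea might be sketched like this. It is a simplification built on assumptions: NTSC's 3 PPU dots per CPU cycle, a stub in place of real per-dot PPU work, and only the $2000-$3FFF register range treated as "touching the PPU" (other addresses such as $4014 are left out). All names are hypothetical. One accuracy consideration is that syncing to the start of the instruction, rather than to the exact cycle of the bus access, can leave the PPU a few dots behind when the access happens.

```c
#include <stdint.h>

typedef struct {
    uint64_t cpu_cycles; /* CPU cycles executed so far */
    uint64_t ppu_dots;   /* PPU dots emulated so far */
} Clocks;

/* Stand-in for the real per-dot PPU work. */
void ppu_tick_stub(Clocks *c)
{
    c->ppu_dots++;
}

/* Run the PPU until it has caught up with the CPU (assumed 3 dots per CPU cycle). */
void ppu_catch_up(Clocks *c)
{
    uint64_t target = c->cpu_cycles * 3;
    while (c->ppu_dots < target)
        ppu_tick_stub(c);
}

/* The PPU registers are mirrored throughout $2000-$3FFF. */
int touches_ppu(uint16_t addr)
{
    return addr >= 0x2000 && addr <= 0x3FFF;
}

/* Call just before the CPU reads or writes addr, with cpu_cycles
 * already advanced to the cycle of that access. */
void before_bus_access(Clocks *c, uint16_t addr)
{
    if (touches_ppu(addr))
        ppu_catch_up(c);
}
```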

tokumaru
Posts: 11991
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Trying to tackle the PPU and timing

Post by tokumaru » Sun Apr 26, 2015 6:54 pm

Keep in mind that the PPU also affects the CPU: NMIs and scanline IRQs are triggered by the PPU. The PPU status (VBlank flag, sprite hit flag and sprite overflow flag) can affect the program flow, but at least you know when the CPU is reading these flags.

And don't forget about the APU, which can generate interrupts too. For these reasons, I don't think it's safe to let the CPU run the show.

The safest thing to do would be to emulate one cycle of each component (CPU, PPU, APU, mapper, etc.) at a time. On today's desktop computers, an emulator like this would probably still run at full speed, but there are lots of other devices in use today that are not as powerful (phones and tablets, mostly).
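A lockstep loop along those lines might look roughly like the sketch below. It is not a working emulator: the CPU is a stub that just counts cycles and acknowledges NMIs, the APU and mapper are only mentioned in a comment, and the timing constants assume NTSC (3 PPU dots per CPU cycle, VBlank beginning at scanline 241, dot 1). `Machine` and the step functions are invented names.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int dot, scanline;   /* PPU position; scanline -1 is the pre-render line */
    bool nmi_enabled;    /* would come from PPUCTRL in a full emulator */
    bool nmi_line;       /* raised by the PPU, consumed by the CPU */
    uint64_t cpu_cycles; /* CPU cycles executed so far */
    int nmis_taken;      /* how many NMIs the stub CPU has acknowledged */
} Machine;

/* One PPU dot: advance the position and raise NMI at the start of VBlank. */
void ppu_step(Machine *m)
{
    if (++m->dot == 341) {
        m->dot = 0;
        if (++m->scanline == 261)
            m->scanline = -1;
    }
    if (m->scanline == 241 && m->dot == 1 && m->nmi_enabled)
        m->nmi_line = true;
}

/* One CPU cycle (stub). A real core would only poll interrupts
 * at specific points within an instruction. */
void cpu_step(Machine *m)
{
    m->cpu_cycles++;
    if (m->nmi_line) {
        m->nmi_line = false;
        m->nmis_taken++;
    }
}

/* One CPU cycle of the whole machine: every component advances together. */
void machine_step(Machine *m)
{
    cpu_step(m);
    for (int i = 0; i < 3; i++)
        ppu_step(m);
    /* an APU step and a mapper step would go here */
}
```

Because every component advances in small, fixed steps, an interrupt raised by the PPU is visible to the CPU on its very next cycle, with no prediction needed.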

In order to implement a proper catch-up method you probably have to keep a list of events that could affect other components of the system, and predict when those events will occur, and run the individual components until the next event, updating the list as you go. For example, you can predict when a scanline IRQ will fire based on the last parameters written to the mapper, so you can forget about that until it's time to handle that event, but if the CPU changes something that could affect the counting of scanlines (new mapper writes, new PPU configurations, etc.), you have to predict again.
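The event-list idea could be sketched as a tiny table of predicted timestamps. This is only one possible shape for it, with invented names (`Scheduler`, `sched_predict`, `sched_next`) and an assumed fixed set of event kinds; timestamps are in CPU cycles. The emulator would run the CPU until the earliest predicted time, handle that event, and call `sched_predict` again whenever a write changes when an event will occur.

```c
#include <stdint.h>

#define NO_EVENT UINT64_MAX

typedef enum { EV_VBLANK_NMI, EV_MAPPER_IRQ, EV_APU_IRQ, EV_COUNT } EventId;

typedef struct {
    uint64_t when[EV_COUNT]; /* predicted CPU cycle of each event, or NO_EVENT */
} Scheduler;

/* Start with nothing predicted. */
void sched_init(Scheduler *s)
{
    for (int i = 0; i < EV_COUNT; i++)
        s->when[i] = NO_EVENT;
}

/* Record (or replace) the prediction for one event. Call this again
 * whenever the CPU changes something that moves the event. */
void sched_predict(Scheduler *s, EventId id, uint64_t when)
{
    s->when[id] = when;
}

/* Find the earliest pending event. Returns EV_COUNT if nothing is
 * scheduled; otherwise stores its time in *when. */
EventId sched_next(const Scheduler *s, uint64_t *when)
{
    EventId best = EV_COUNT;
    *when = NO_EVENT;
    for (int i = 0; i < EV_COUNT; i++) {
        if (s->when[i] < *when) {
            *when = s->when[i];
            best = (EventId)i;
        }
    }
    return best;
}
```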

Hopefully people who have actually written emulators will share their methods, I'm just here to tell you that there's more to consider. =)
