Vash wrote: When is frame rendered?
The PPU is constantly working alongside the CPU. It keeps repeating this cycle: 20 vblank scanlines, 1 dummy scanline, 240 picture scanlines, 1 dummy scanline. It never stops. It's the responsibility of the game program to sync itself to this cycle.
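As a rough sketch of that repeating cycle (C++, all names made up for illustration): the line counts are the NTSC ones given above, and numbering the lines from the first vblank line is just a choice for this example, not how the hardware or most emulators number them.

```cpp
#include <cassert>

// Hypothetical names. "PreRender"/"PostRender" are the two dummy scanlines.
enum class Phase { VBlank, PreRender, Picture, PostRender };

constexpr int kVBlankLines   = 20;
constexpr int kPictureLines  = 240;
constexpr int kLinesPerFrame = kVBlankLines + 1 + kPictureLines + 1;  // 262

// Classify a scanline counted from the first vblank line (0..261).
Phase phase_of(int line) {
    assert(line >= 0 && line < kLinesPerFrame);
    if (line < kVBlankLines) return Phase::VBlank;                    // 0..19
    if (line == kVBlankLines) return Phase::PreRender;                // 20
    if (line <= kVBlankLines + kPictureLines) return Phase::Picture;  // 21..260
    return Phase::PostRender;                                         // 261
}
```

After line 261 the counter wraps back to 0 and the same sequence starts again, whether or not the game is ready for it.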
Problem with nestest
The most accurate method is to render three pixels, then perform one CPU cycle, and repeat. That's slow, so various catch-up schemes are used. As for when to give the GUI control, the common pattern that I've seen is to stop the emulator at the start of line 240 (the post-render line); vertical blanking starts at line 241.
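A minimal sketch of that lockstep loop, assuming NTSC timing and stub `Cpu`/`Ppu` types that only count time (both are hypothetical stand-ins for real cores). It also ignores the occasional shortened scanline, so every line is taken to be 341 pixels.

```cpp
#include <cassert>
#include <cstdint>

// Stub PPU: only tracks where the beam is. Lines are numbered 0..261,
// with 0..239 the picture, 240 the post-render line, 241 the start of vblank.
struct Ppu {
    int line = 0;
    int dot  = 0;  // 0..340
    void step_pixel() {
        if (++dot == 341) { dot = 0; line = (line + 1) % 262; }
    }
};

// Stub CPU: a real core would perform one bus cycle in step_cycle().
struct Cpu {
    std::uint64_t cycles = 0;
    void step_cycle() { ++cycles; }
};

// Run three pixels, then one CPU cycle, until the PPU crosses into line 240.
// Returns the number of CPU cycles executed, so the caller can then hand
// control to the GUI, sound output, and so on.
std::uint64_t run_until_post_render(Cpu& cpu, Ppu& ppu) {
    const std::uint64_t start = cpu.cycles;
    for (;;) {
        const int line_before = ppu.line;
        for (int i = 0; i < 3; ++i) ppu.step_pixel();
        cpu.step_cycle();
        if (line_before != 240 && ppu.line == 240) break;
    }
    assert(ppu.line == 240);
    return cpu.cycles - start;
}
```

Because a frame is not a whole number of CPU cycles, successive calls return slightly different counts and the PPU lands one or two pixels into line 240 on some frames.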
Ok so the emulator main loop can look something like this:
Code:
while(EMULATOR_RUNNING)
{
    if(cycle < 262) // number of scanline
        CPU.run(&cycle);
    PPU.render();
}
There are 262 scanlines, but 341 PPU cycles in each scanline. More like 89342 PPU cycles total (29780.66... CPU cycles).
You need to expect the game to write to the PPU during rendering time, because even Super Mario Bros changes the scrolling location part way through draw time.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Let's first adopt some terms that don't all sound the same. We don't need to use "cycle" for everything.
Cycle: CPU cycle. For example, two cycles in a NOP
Pixel: time PPU spends rendering a single pixel
Clock: the 21477272.7 Hz master timebase
Therefore:
1 clock = 1/21477272.7 second
1 pixel = 4 clocks = 1/5369318 second
1 cycle = 3 pixels = 12 clocks = 1/1789772.7 second
1 scanline = 341 pixels (in most cases) = 113.67 cycles
1 frame = 262 scanlines = 29780.67 cycles = 1/60.1 second
For PAL:
1 clock = 1/26601712.5 second
1 pixel = 5 clocks = 1/5320342.5 second
1 cycle = 3.2 pixels = 16 clocks = 1/1662607 second
1 scanline = 341 pixels = 106.5625 cycles
1 frame = 312 scanlines = 33247.5 cycles = 1/50 second
Then we can talk of these things with different one-word terms, and not get confused.
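To make the arithmetic above concrete, here is a small sketch that derives the per-line and per-frame figures from the base numbers. The `Timing` struct and its member names are invented for this example; only the constants come from the lists above.

```cpp
#include <cassert>

// Hypothetical container for the base numbers listed above.
struct Timing {
    double master_hz;      // master clock rate
    int clocks_per_pixel;
    int clocks_per_cycle;
    int pixels_per_line;   // 341 "in most cases"
    int lines_per_frame;

    constexpr long pixels_per_frame() const {
        return static_cast<long>(pixels_per_line) * lines_per_frame;
    }
    constexpr double cycles_per_line() const {
        return static_cast<double>(pixels_per_line) * clocks_per_pixel / clocks_per_cycle;
    }
    constexpr double cycles_per_frame() const {
        return cycles_per_line() * lines_per_frame;
    }
    constexpr double frames_per_second() const {
        return master_hz / (static_cast<double>(pixels_per_frame()) * clocks_per_pixel);
    }
};

constexpr Timing kNtsc{21477272.7, 4, 12, 341, 262};
constexpr Timing kPal {26601712.5, 5, 16, 341, 312};

// Whole pixels the PPU has produced after `cycles` CPU cycles (rounded down).
// On PAL this is where the 3.2 pixels per cycle shows up.
long pixels_for_cycles(const Timing& t, long cycles) {
    assert(cycles >= 0);
    return cycles * t.clocks_per_cycle / t.clocks_per_pixel;
}
```

Keeping the emulator's timestamps in pixels or master clocks rather than CPU cycles avoids the fractional cycle counts entirely.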
Last edited by blargg on Fri Oct 22, 2010 11:24 am, edited 1 time in total.
tepples wrote: The most accurate method is to render three pixels, then perform one CPU cycle, and repeat.
Bregalad wrote: Are you racist against people living in PAL territories?
Russia is a PAL territory. The Dendy famiclone uses a /15 CPU instead of a /16 one like the official PAL NES, resulting in 3 pixels per cycle, and a PPU that makes NMI at scanline 291 instead of 241 like the official PAL NES. It appears the newbie hasn't yet appreciated the concept of two processors running in parallel. If I listed all variants of the NES architecture immediately, it would confuse the newbie even more.
For Dendy:
1 clock = 1/26601712.5 second
1 pixel = 5 clocks
1 cycle = 3 pixels = 15 clocks
1 scanline = 341 pixels = 113.67 cycles
1 frame = 312 scanlines = 35464 cycles
Vash wrote: What do you mean by : the game write to the PPU during rendering time?
Here are the most common changes made to the PPU's state during rendering...
* Change the scrolling location for a status bar.
* Change the scrolling location many times because we want wave backgrounds or it's a racing game.
* Bankswitch the CHR so that different graphics are drawn after a certain scanline.
* Change which pattern table backgrounds and sprites use.
Then some more tricky stuff that games can do...
* Bankswitch the CHR more than once within the same scanline (Punch Out, Marble Madness, Fire Emblem, etc...)
* Disable rendering so the game can write to video ram, then re-enable rendering later within the same frame. (Wizards and Warriors 3)
* Disable rendering, then write a second sprite table, then re-enable rendering (Day Dreamin Davey, RC Pro am, Stunt Kids, some other games)
Any renderer that looks only at the PPU's state at the start of the frame (scroll position, CHR banks mapped in, which pattern tables to use, size of sprites) and draws the entire screen from that initial state won't do a very good job; even Super Mario Bros won't scroll correctly.
You need at least scanline-level accuracy of PPU state changes, and even that isn't good enough for Punch Out, which needs pixel-level accuracy.
But you don't need to keep switching between CPU code and PPU code every instruction. You can instead use a catch-up method: wait until the emulated game makes a PPU write or the frame ends, then draw the pixels that have elapsed since the last catch-up.
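Here is a rough sketch of the catch-up idea, assuming NTSC timing and timestamps measured in pixels since the start of the frame. `CatchUpPpu` and its members are hypothetical, and "rendering" is reduced to recording which scroll value each scanline would be drawn with. Latching the scroll at the first pixel of each line is a simplification of this example, not a description of the real PPU.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::uint64_t kPixelsPerLine  = 341;
constexpr std::size_t   kLinesPerFrame  = 262;
constexpr std::uint64_t kPixelsPerFrame = kPixelsPerLine * kLinesPerFrame;  // 89342

struct CatchUpPpu {
    std::uint64_t synced_to = 0;  // every pixel before this one is already drawn
    int scroll_x = 0;             // current value of the scroll register
    std::array<int, kLinesPerFrame> line_scroll{};  // scroll used for each line

    // Draw the pixels in [synced_to, now) with the state currently in effect.
    void catch_up(std::uint64_t now) {
        assert(now >= synced_to && now <= kPixelsPerFrame);
        for (std::uint64_t t = synced_to; t < now; ++t) {
            if (t % kPixelsPerLine == 0) line_scroll[t / kPixelsPerLine] = scroll_x;
        }
        synced_to = now;
    }

    // The emulated CPU writes the scroll register at time `now`.
    void write_scroll(std::uint64_t now, int value) {
        catch_up(now);     // draw the elapsed pixels with the old value first
        scroll_x = value;  // only then apply the write
    }

    // The frame ended: draw whatever is left and start over.
    void end_frame() {
        catch_up(kPixelsPerFrame);
        synced_to = 0;
    }
};
```

A game that writes the scroll register once at the start of line 32 (a status bar split) ends up with the old value on lines 0 to 31 and the new one from line 32 onward, even though the PPU code only ran twice during the frame.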
It took a while, but I found an overview of "catch-up" and "timestamp" related techniques in this article on our wiki. Let me know about anything that you don't understand in this article so that I can go fix it.
Running one cycle at a time is slow because it needs to keep the state of both the emulated CPU and PPU in the host CPU's L1 cache, and not all host CPUs are big enough for that. Efficient emulators use catch-up techniques to keep the host CPU's attention on only one emulated part at once yet still act as if the components run at the same time. Drop the catch-up, as you suggest, and you have an emulator like Nintendulator or bsnes, which last time I checked didn't run too well on netbooks.
tepples wrote: The most accurate method is to render three pixels, then perform one CPU cycle, and repeat. That's slow, so various catch-up schemes are used. As for when to give the GUI control, the common pattern that I've seen is to stop the emulator at the start of line 240 (the post-render line); vertical blanking starts at line 241.
- Absolutely true. I had to (additionally) create a queue system, right after an instruction, for things like switching to GUI or sound output updates/poll.
Vash wrote: Ok so the emulator main loop can look something like this:
Code:
while(EMULATOR_RUNNING)
{
    if(cycle < 262) // number of scanline
        CPU.run(&cycle);
    PPU.render();
}
The problem with this approach is that many games (and I mean LOTS of them) make use of the fact that the CPU and PPU run side by side. These games modify certain PPU parameters as the image renders in order to change the rendered image in some way. This is used for status bars, parallax scrolling, color changes, things like that. If you ignore those timed changes and only render the image based on the final state of the PPU, almost every game will look wrong, and many might even hang (the ones that rely on sprite 0 hits).
A common solution to this problem is the "catch up" method. You basically run the CPU until the program tries to make any changes to the PPU or the frame ends, at which point you make the PPU catch up to the CPU by rendering the necessary number of pixels.
Of course you still have to consider events external to the CPU that might affect the program flow, such as sprite 0 hits or IRQs. You have to predict when those will happen so that you can update the system's state accordingly at the correct times.
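One way to picture the prediction part, as a sketch only: keep a predicted timestamp for each external event and never let the CPU run past the earliest one without stopping to update the flags. Everything here (`Scheduler`, the field names, timestamps in pixels, the `run_cpu` callback) is an assumption made for the example. How the sprite 0 hit time or the IRQ time is actually predicted is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kNever = UINT64_MAX;  // "no such event this frame"

struct Scheduler {
    std::uint64_t now = 0;                  // CPU time, in pixels since frame start
    std::uint64_t sprite0_hit_at = kNever;  // predicted time of the sprite 0 hit
    std::uint64_t irq_at = kNever;          // predicted time of a mapper IRQ
    bool sprite0_flag = false;
    bool irq_line = false;

    // The CPU may run freely until the earliest predicted event or frame end.
    std::uint64_t next_stop(std::uint64_t frame_end) const {
        return std::min({sprite0_hit_at, irq_at, frame_end});
    }

    // run_cpu(now, stop) runs whole instructions until the time reaches `stop`
    // and returns the new time (it may overshoot by part of an instruction).
    template <class RunCpu>
    void run_frame(std::uint64_t frame_end, RunCpu run_cpu) {
        while (now < frame_end) {
            const std::uint64_t next = run_cpu(now, next_stop(frame_end));
            assert(next >= now);  // emulated time never goes backwards
            now = next;
            if (now >= sprite0_hit_at) { sprite0_flag = true; sprite0_hit_at = kNever; }
            if (now >= irq_at)         { irq_line = true;     irq_at = kNever; }
        }
    }
};
```

In this sketch the flags change only between CPU slices, so a game polling for the sprite 0 hit sees it up to one instruction late. A more careful version would check the predicted times on every status read.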