Problem with nestest

tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Vash wrote:When is frame rendered?
The PPU is constantly working alongside the CPU. It keeps repeating this cycle: 20 Vblank scanlines, 1 dummy scanline, 240 picture scanlines, 1 dummy scanline. It never stops. It's the responsibility of the game program to sync itself to this cycle.
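To make that repeating cycle concrete, here is a minimal C++ sketch that maps a free-running scanline counter to the phase the PPU is in. The names (`Phase`, `phase_of`) are made up for illustration. The counter starts at the first vblank line, to follow the order listed above, which is not the usual "line 0 is the first picture line" numbering.

```cpp
#include <cassert>

// Phases of the NTSC frame, in the order listed above.
enum class Phase { VBlank, PreRender, Picture, PostRender };

constexpr int kVBlankLines = 20;
constexpr int kPictureLines = 240;
constexpr int kLinesPerFrame = kVBlankLines + 1 + kPictureLines + 1;
static_assert(kLinesPerFrame == 262, "NTSC frame is 262 scanlines");

// Hypothetical helper: the counter never resets, so wrap it with a modulo
// to find the position inside the current frame.
Phase phase_of(long lines_since_vblank_start) {
    assert(lines_since_vblank_start >= 0);
    const long line = lines_since_vblank_start % kLinesPerFrame;
    if (line < kVBlankLines) return Phase::VBlank;       // lines 0..19
    if (line == kVBlankLines) return Phase::PreRender;   // line 20 (dummy)
    if (line <= kVBlankLines + kPictureLines) return Phase::Picture;  // 21..260
    return Phase::PostRender;                            // line 261 (dummy)
}
```

The game only ever observes this cycle through PPU registers and NMI, so the emulator has to keep the counter running no matter what the CPU is doing.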
Vash
Posts: 21
Joined: Sat Oct 16, 2010 2:51 pm

Post by Vash »

OK, so the CPU and PPU work in parallel, but in my emulator I wanted something that looks like a game loop, like this:

while(GAME_RUNNING)
{
   if(timeElapsed >= tick)
      Game.update();

   Game.render();
}

So basically my question is when do I stop the CPU emulator to render a frame?
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

The most accurate method is to render three pixels, then perform one CPU cycle, and repeat. That's slow, so various catch-up schemes are used. As for when you give the GUI control, the common pattern that I've seen is to stop the emulator at the start of line 240 (the post-render line), where 241 is the start of vertical blanking.
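As a rough sketch of the lockstep loop described above (NTSC only, three pixels per CPU cycle), assuming hypothetical `Cpu` and `Ppu` stand-ins that only count time instead of emulating anything:

```cpp
#include <cassert>
#include <cstdint>

// Stand-in CPU core (hypothetical): a real one would execute one bus cycle.
struct Cpu {
    std::uint64_t cycles = 0;
    void step_cycle() { ++cycles; }
};

// Stand-in PPU (hypothetical): a real one would also output a pixel.
struct Ppu {
    int dot = 0;       // 0..340 within the scanline
    int scanline = 0;  // 0..261, where 240 is the post-render line

    // Advances one pixel; returns true when it lands on the start of line 240.
    bool step_pixel() {
        if (++dot == 341) {
            dot = 0;
            if (++scanline == 262) scanline = 0;
        }
        return scanline == 240 && dot == 0;
    }
};

// Runs three pixels per CPU cycle until the PPU crosses the start of
// line 240, then returns so the host can present the frame and run its GUI.
void run_until_frame_boundary(Cpu& cpu, Ppu& ppu) {
    bool frame_done = false;
    while (!frame_done) {
        for (int i = 0; i < 3; ++i) {
            if (ppu.step_pixel()) frame_done = true;
        }
        cpu.step_cycle();
    }
    assert(ppu.scanline == 240);
}
```

Starting from line 0, the first call returns after 27280 CPU cycles (240 lines of 341 pixels, divided by 3). Every later call covers one full frame, which takes 29780 or 29781 cycles because 89342 pixels is not a multiple of 3.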
Vash
Posts: 21
Joined: Sat Oct 16, 2010 2:51 pm

Post by Vash »

OK, so the emulator main loop can look something like this:

while(EMULATOR_RUNNING)
{
   if(cycle < 262) // number of scanline
      CPU.run(&cycle);

   PPU.render();
}
Dwedit
Posts: 4922
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

There are 262 scanlines, but 341 PPU cycles in each scanline. More like 89342 PPU cycles total (29780.66... CPU cycles).

You need to expect the game to write to the PPU during rendering time, because even Super Mario Bros changes the scrolling location part way through draw time.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Vash
Posts: 21
Joined: Sat Oct 16, 2010 2:51 pm

Post by Vash »

The more I read stuff, the less I understand :D. I'm completely lost.

I'm OK with the cycle counts: 341 PPU cycles per scanline times 262 scanlines gives 89342 PPU cycles. Since 1 CPU cycle = 3 PPU cycles, we end up with about 29780 CPU cycles.

What do you mean by "the game writes to the PPU during rendering time"?
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Post by Bregalad »

tepples wrote:The most accurate method is to render three pixels, then perform one CPU cycle, and repeat.
Are you racist against people living in PAL territories ?
Useless, lumbering half-wits don't scare us.
blargg
Posts: 3715
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA

Post by blargg »

Let's first adopt some terms that don't all sound the same. We don't need to use "cycle" for everything.

Cycle: CPU cycle. For example, two cycles in a NOP
Pixel: time PPU spends rendering a single pixel
Clock: the 21477272.7 Hz master timebase

Therefore:
1 clock = 1/21477272.7 second
1 pixel = 4 clocks = 1/5369318 second
1 cycle = 3 pixels = 12 clocks = 1/1789772.7 second
1 scanline = 341 pixels (in most cases) = 113.67 cycles
1 frame = 262 scanlines = 29780.67 cycles = 1/60.1 second

For PAL:
1 clock = 1/26601712.5 second
1 pixel = 5 clocks = 1/5320342.5 second
1 cycle = 3.2 pixels = 16 clocks = 1/1662607 second
1 scanline = 341 pixels = 106.5625 cycles
1 frame = 312 scanlines = 33247.5 cycles = 1/50 second

Then we can talk of these things with different one-word terms, and not get confused.
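For anyone who prefers to see the two tables as code, here is a small sketch that derives the per-line and per-frame figures from the master clock and the two dividers. The `Timing` struct and its member names are my own, not from any emulator, and the sketch ignores the "in most cases" caveat (it treats every NTSC scanline as 341 pixels).

```cpp
#include <cassert>

// One entry per TV system, using the terms above: clock, pixel, cycle.
struct Timing {
    double master_hz;      // master clock frequency in Hz
    int clocks_per_pixel;  // master clocks per PPU pixel
    int clocks_per_cycle;  // master clocks per CPU cycle
    int pixels_per_line;
    int lines_per_frame;

    double pixels_per_cycle() const {
        assert(clocks_per_pixel > 0);
        return static_cast<double>(clocks_per_cycle) / clocks_per_pixel;
    }
    double cycles_per_line() const {
        return pixels_per_line / pixels_per_cycle();
    }
    double cycles_per_frame() const {
        return cycles_per_line() * lines_per_frame;
    }
    double frames_per_second() const {
        return master_hz / clocks_per_cycle / cycles_per_frame();
    }
};

constexpr Timing kNtsc{21477272.7, 4, 12, 341, 262};
constexpr Timing kPal{26601712.5, 5, 16, 341, 312};
```

Keeping only the master clock and the dividers as constants means the derived numbers (113.67 cycles per NTSC line, 33247.5 cycles per PAL frame, and so on) cannot drift out of sync with each other.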
Last edited by blargg on Fri Oct 22, 2010 11:24 am, edited 1 time in total.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Bregalad wrote:
tepples wrote:The most accurate method is to render three pixels, then perform one CPU cycle, and repeat.
Are you racist against people living in PAL territories ?
Russia is a PAL territory. The Dendy famiclone uses a /15 CPU instead of a /16 one like the official PAL NES, resulting in 3 pixels per cycle, and a PPU that makes NMI at scanline 291 instead of 241 like the official PAL NES. It appears the newbie hasn't yet appreciated the concept of two processors running in parallel. If I listed all variants of the NES architecture immediately, it would confuse the newbie even more.

For Dendy:
1 clock = 1/26601712.5 second
1 pixel = 5 clocks
1 cycle = 3 pixels = 15 clocks
1 scanline = 341 pixels = 113.67 cycles
1 frame = 312 scanlines = 35464 cycles
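One way to support all three variants without fractional pixels is to count time in master clocks and divide by the relevant divider when needed. This is only a sketch under that assumption; `Dividers`, `pixels_elapsed` and `cycles_to_cover_frame` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Master-clock dividers for the CPU and the PPU.
struct Dividers {
    int clocks_per_cycle;  // master clocks per CPU cycle
    int clocks_per_pixel;  // master clocks per PPU pixel
};

constexpr Dividers kNtsc{12, 4};
constexpr Dividers kPal{16, 5};
constexpr Dividers kDendy{15, 5};

// Whole pixels the PPU has finished after `cycles` CPU cycles, with both
// chips counted from the same instant. Integer math, so no rounding drift.
std::uint64_t pixels_elapsed(const Dividers& d, std::uint64_t cycles) {
    assert(d.clocks_per_pixel > 0);
    return cycles * d.clocks_per_cycle / d.clocks_per_pixel;
}

// CPU cycles needed to cover a frame of `lines` scanlines of 341 pixels,
// rounded up to a whole cycle.
std::uint64_t cycles_to_cover_frame(const Dividers& d, int lines) {
    assert(lines > 0 && d.clocks_per_cycle > 0);
    const std::uint64_t clocks = 341ull * lines * d.clocks_per_pixel;
    return (clocks + d.clocks_per_cycle - 1) / d.clocks_per_cycle;
}
```

With these dividers, five PAL CPU cycles cover exactly 16 pixels, while NTSC and Dendy both stay at 3 pixels per cycle.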
Dwedit
Posts: 4922
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

Vash wrote:What do you mean by "the game writes to the PPU during rendering time"?
Here are the most common changes made to the PPU's state during rendering...

* Change the scrolling location for a status bar.
* Change the scrolling location many times because we want wave backgrounds or it's a racing game.
* Bankswitch the CHR so that different graphics are drawn after a certain scanline.
* Change which pattern table backgrounds and sprites use.

Then some more tricky stuff that games can do...

* Bankswitch the CHR more than once within the same scanline (Punch Out, Marble Madness, Fire Emblem, etc...)
* Disable rendering so the game can write to video ram, then re-enable rendering later within the same frame. (Wizards and Warriors 3)
* Disable rendering, then write a second sprite table, then re-enable rendering (Day Dreamin Davey, RC Pro am, Stunt Kids, some other games)

Any renderer that looks only at the PPU's state at the start of the frame (scroll position, CHR banks mapped in, which pattern tables to use, size of sprites) and draws the entire screen from that state won't do a very good job. Even Super Mario Bros won't scroll correctly.
You need at least scanline-level accuracy for PPU state changes. Even that isn't good enough for Punch Out, which needs pixel-level accuracy.

But you don't need to keep switching between CPU code and PPU code every instruction. You can instead use a catch-up method: wait until the emulated game makes a PPU write or the frame ends, then draw the number of pixels that have elapsed since the last catch-up.
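Here is a minimal sketch of that catch-up idea, assuming NTSC timing (3 pixels per CPU cycle) and a single made-up register, `scroll_x`, standing in for any PPU state a game can change mid-frame. The class and its methods are hypothetical, and the per-pixel log replaces real pixel output so the effect can be inspected.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical PPU that renders lazily instead of in lockstep with the CPU.
class CatchUpPpu {
public:
    // Renders every pixel between the last sync point and `cpu_cycle`,
    // using the register values that were in effect during that span.
    void catch_up(std::uint64_t cpu_cycle) {
        const std::uint64_t target = cpu_cycle * 3;  // 3 pixels per CPU cycle
        assert(target >= rendered_);                 // time never runs backwards
        for (; rendered_ < target; ++rendered_) {
            scroll_used_.push_back(scroll_x_);       // placeholder for drawing a pixel
        }
    }

    // Called by the CPU core when the game writes the register at `cpu_cycle`.
    void write_scroll(std::uint64_t cpu_cycle, int value) {
        catch_up(cpu_cycle);  // draw the past with the old value first
        scroll_x_ = value;    // later pixels see the new value
    }

    // Scroll value that was used for each pixel rendered so far.
    const std::vector<int>& scroll_used() const { return scroll_used_; }

private:
    std::uint64_t rendered_ = 0;  // pixels rendered so far
    int scroll_x_ = 0;
    std::vector<int> scroll_used_;
};
```

For example, a write at CPU cycle 10 followed by a catch-up at cycle 20 renders 60 pixels in total: the first 30 with the old scroll value and the next 30 with the new one. The main loop would also call `catch_up` at the end of the frame.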
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

It took a while, but I found an overview of "catch-up" and "timestamp" related techniques in this article on our wiki. Let me know about anything that you don't understand in this article so that I can go fix it.
3gengames
Formerly 65024U
Posts: 2284
Joined: Sat Mar 27, 2010 12:57 pm

Post by 3gengames »

Why don't you guys support emulating the core one cycle at a time, not just "x cycles = instruction y"? Wouldn't that help emulation a lot for things with REALLY tight timing?
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

Running one cycle at a time is slow because it needs to keep the state of both the emulated CPU and PPU in the host CPU's L1 cache, and not all host CPUs have a cache big enough for that. Efficient emulators use catch-up techniques to keep the host CPU's attention on only one emulated part at once yet still act as if the components run at the same time. Drop the catch-up, as you suggest, and you have an emulator like Nintendulator or bsnes, which last time I checked didn't run too well on netbooks.
Zepper
Formerly Fx3
Posts: 3262
Joined: Fri Nov 12, 2004 4:59 pm
Location: Brazil

Post by Zepper »

tepples wrote:The most accurate method is to render three pixels, then perform one CPU cycle, and repeat. That's slow, so various catch-up schemes are used. As for when you give the GUI control, the common pattern that I've seen is to stop the emulator at the start of line 240 (the post-render line), where 241 is the start of vertical blanking.
- Absolutely true. :) I also had to create a queue system, run right after an instruction, for things like switching to the GUI or sound output updates/polling.
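The post does not show that queue, so the following is only a guess at the general shape: emulation code posts host-side work (present a frame, poll input, push audio), and the main loop drains the queue between instructions. `HostQueue`, `post` and `drain` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical deferred-action queue. Nothing posted here runs until the
// main loop calls drain(), which it would do right after an instruction.
class HostQueue {
public:
    void post(std::function<void()> action) {
        assert(action != nullptr);  // reject empty callbacks early
        pending_.push(std::move(action));
    }

    // Runs and removes every queued action in FIFO order; returns how many ran.
    int drain() {
        int ran = 0;
        while (!pending_.empty()) {
            std::function<void()> action = std::move(pending_.front());
            pending_.pop();
            action();
            ++ran;
        }
        return ran;
    }

private:
    std::queue<std::function<void()>> pending_;
};
```

The point of the indirection is that the PPU or APU code can request a GUI switch at any moment, while the switch itself only happens at an instruction boundary where the emulated state is consistent.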
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Vash wrote:OK, so the emulator main loop can look something like this:

while(EMULATOR_RUNNING)
{
   if(cycle < 262) // number of scanline
      CPU.run(&cycle);

   PPU.render();
}
The problem with this approach is that many games (and I mean LOTS of them) make use of the fact that CPU and PPU run side by side. These games modify certain PPU parameters as the image renders in order to change the rendered image in some way. This is used for status bars, parallax scrolling, color changes, things like that. If you ignore those timed changes and only render the image based on the final state of the PPU, almost every game will look wrong, and many might even hang (the ones that rely on sprite 0 hits).

A common solution to this problem is the "catch up" method. You basically run the CPU until the program tries to make any changes to the PPU or the frame ends, at which point you make the PPU catch up to the CPU by rendering the necessary number of pixels.

Of course you still have to consider events external to the CPU that might affect the program flow, such as sprite 0 hits or IRQs. You have to predict when those will happen so that you can update the system's state accordingly at the correct times.
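A rough sketch of how predicted events can bound each CPU run, assuming the predicted times are already known and expressed in CPU cycles since the start of the frame. How to predict a sprite 0 hit or a mapper IRQ is outside this sketch, and `Schedule`, `next_stop` and `run_frame` are hypothetical names with a stand-in CPU whose instructions all cost the same.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kNever = UINT64_MAX;  // "no such event this frame"

// Predicted event times, in CPU cycles since the start of the frame.
struct Schedule {
    std::uint64_t frame_end;
    std::uint64_t sprite0_hit;  // or kNever
    std::uint64_t irq;          // or kNever
};

// Earliest time after `now` at which the CPU must pause so that the rest
// of the system can be brought up to date.
std::uint64_t next_stop(const Schedule& s, std::uint64_t now) {
    std::uint64_t stop = s.frame_end;
    if (s.sprite0_hit > now) stop = std::min(stop, s.sprite0_hit);
    if (s.irq > now) stop = std::min(stop, s.irq);
    return stop;
}

// Runs one frame with a stand-in CPU (every instruction costs `cost`
// cycles) and returns how many times it paused for an event or frame end.
int run_frame(const Schedule& s, int cost) {
    assert(cost > 0);
    std::uint64_t now = 0;
    int stops = 0;
    while (now < s.frame_end) {
        const std::uint64_t stop = next_stop(s, now);
        while (now < stop) now += cost;  // may overshoot by part of an instruction
        ++stops;  // here: catch up the PPU, set the hit flag, raise the IRQ
    }
    return stops;
}
```

A real core would also stop early on PPU register writes, as described above. This sketch only covers the externally scheduled events.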