NES games and differences in input reaction

nesrocks
Posts: 563
Joined: Thu Aug 13, 2015 4:40 pm
Location: Rio de Janeiro - Brazil

Post by nesrocks »

An edit to make things clear: there are two subjects in this post. Subject 1 is resolved. Subject 2 is a curiosity.

Subject 1:
So, I'm developing a demo using ca65. My goal is to add as many standard features as I can and then see what game I could make with them.

But one thing struck me as possibly wrong, so I went on to test it. At the start of the NMI I do the OAM DMA transfer, which means the sprites shown are in their state from the previous frame. That seemed odd to me, but I believe it is the advised order because of a timing constraint (do the DMA first). Well, for TESTING I tried this:

Code: Select all

NMI begin:
1 - read controllers
2 - game logic
3 - OAM DMA
4 - scroll
5 - game clock (just 16-bit increment to a reserved ram location)
6 - famitone update
7 - RTI
The problem is, this works in FCEUX, VirtuaNES and puNES, but it fails on real hardware with an EverDrive N8, in Nestopia and in Mesen (no sprites appear on screen). So it is a bad idea (points to Nestopia and Mesen for catching it). I then tested some games to see if any of them did this (I tested them before I tried my ROM on the EverDrive or the other emulators). Of course, none did. I think this is related to how quickly OAM decays. So that test went nowhere, but I decided to share the following research.

Subject 2:
So, I made a list of how many frames it takes for the screen to react to controller input. I only tested during actual gameplay, not title screens or the like. I also tried several different moves to find the most responsive one. Sometimes a character deliberately takes a frame or two before jumping. Sometimes the delay is inconsistent and seems to depend on how much is going on on screen at the time.

To count, I pause FCEUX, hold a button and hit frame advance. When I see the reaction, the count is the number of times I pressed frame advance. So 2 is the minimum.

Here are the results:
2: Super Mario Bros. 3, Adventure Island, Battle of Olympus, Zelda 2, Kick Master, Batman, Amazon Diet, Yo! Noid, TMNT 2, Contra, Castlevania, Chip 'n Dale, Battletoads, Donkey Kong, Ghosts 'n Goblins (very inconsistent though), Ghostbusters (tested on the car scene), Adventure Island 3, Pac-Man (Tengen), Lode Runner, Street Fighter 2010, Super Mario Bros., The Simpsons: Bart vs. the Space Mutants, Captain Comic, Rockman (inconsistent), Rockman 2, Rockman 3, Rockman 4, Rockman 5, Rockman 6, Metroid, M.C. Kids, Ninja Gaiden, Ninja Gaiden 2, Ninja Gaiden 3, Mario Bros., Balloon Fight (inconsistent)
3: Arkanoid, Yie Ar Kung-Fu, Castlevania 3, Double Dragon (very inconsistent, sometimes takes as many as 11 frames to punch), Dr. Mario, Super Pitfall (original and my hack), Kid Icarus, Gradius, Punch-Out, Elevator Action, Gauntlet, Holy Diver
4: Pinball
5: Tiger-Heli, TMNT
6: Double Dragon 2, Double Dragon 3

Now, out of the games that reacted in 2 frames, some of those I didn't test a lot, and many of them did start playing the sfx immediately. Out of the other games (3+ frames) I tested several different moves and situations to see if it would lower the response time, and I wrote the lowest I got.

I was really surprised to see the Double Dragon games perform the way they did.
Last edited by nesrocks on Wed Sep 13, 2017 9:21 am, edited 2 times in total.
https://twitter.com/bitinkstudios <- Follow me on twitter! Thanks!
https://www.patreon.com/bitinkstudios <- Support me on Patreon!
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: NES games and differences in input reaction

Post by Bregalad »

nesrocks wrote:
NMI begin:
1 - read controllers
2 - game logic
3 - OAM DMA
4 - scroll
5 - game clock (just 16-bit increment to a reserved ram location)
6 - famitone update
7 - RTI
This is an awful idea. VRAM updates and the OAM DMA must happen during the short VBlank period, which begins right when the NMI fires.
Myask
Posts: 965
Joined: Sat Jul 12, 2014 3:04 pm

Re: NES games and differences in input reaction

Post by Myask »

That you're getting a minimum of 2 suggests to me that their order is different from yours.
In my current project I do NMI: DMA, VRAM [+scroll], controller, [sound], game logic, RTI.

Where is the PC stopped when you frame-advance? If it's at NMI entry (or thereabouts) like I expect, then in the case like mine, you can never get a "one frame" reaction, despite it technically reacting to your inputs on the very next frame it sees them: NMI, you update emu's input register, game updates video, game reads input and reacts, NMI, you have to press frame advance again, the game's reaction is now displayed.
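
To make that counting argument concrete, here is a toy C model. It is not taken from any real game, and every name in it (`frame_advance`, `buffer_x`, `shown_x`, `PAD_RIGHT`) is made up for illustration. It assumes the emulator latches the held buttons at the start of each advanced frame and that the game uses the DMA-first ordering described above:

```c
#include <stdint.h>

#define PAD_RIGHT 0x01  /* hypothetical button bit for this model */

static uint8_t buffer_x = 100; /* sprite X in the RAM copy of OAM */
static uint8_t shown_x  = 100; /* sprite X in PPU OAM, i.e. what gets drawn */

/* One press of "frame advance": NMI fires, then the frame is rendered. */
static uint8_t frame_advance(uint8_t held)
{
    shown_x = buffer_x;        /* 1. OAM DMA: last frame's state goes to the PPU */
    if (held & PAD_RIGHT)      /* 2. read controller */
        buffer_x++;            /* 3. game logic touches the RAM copy only */
    return shown_x;            /* 4. rendering shows what step 1 copied */
}

/* Count presses of frame advance until the picture changes. */
static int presses_until_reaction(uint8_t held)
{
    uint8_t before = shown_x;
    int presses = 0;
    do {
        presses++;
    } while (frame_advance(held) == before);
    return presses;
}
```

Starting from the initial state, `presses_until_reaction(PAD_RIGHT)` comes out as 2 in this model: the first advance only changes the RAM copy, and the second one is the first to display it.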

(incidentally, I checked, and at least Zelda II does NMI: DMA, VRAM updates before doing anything with its controller variables, so…)

And yes, what Bregalad said.
Memblers
Site Admin
Posts: 4044
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis

Re: NES games and differences in input reaction

Post by Memblers »

A little bird told me about an official document mentioning that the DMA must begin within 286 µs of the beginning of NMI (which is about 512 CPU cycles, I think), or "there may be a problem with the DMA transfer". I always thought it was sensible to do the DMA as early as possible, but it seems that it's actually required.
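
The conversion can be checked with a few lines of host-side C. This is only a sketch: it assumes an NTSC console with a CPU clock of about 1.789773 MHz, and the constant and function names are invented here.

```c
/* NTSC CPU clock in Hz (assumption: NTSC console; PAL clocks differ). */
#define NTSC_CPU_HZ 1789773.0

/* Convert a duration in microseconds to CPU cycles. */
static double us_to_cpu_cycles(double microseconds)
{
    return microseconds * NTSC_CPU_HZ / 1e6;
}
```

Under that assumption `us_to_cpu_cycles(286.0)` is roughly 511.9, so "about 512 CPU cycles" holds for NTSC.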
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: NES games and differences in input reaction

Post by tokumaru »

Bregalad wrote:
NMI begin:
1 - read controllers
2 - game logic
3 - OAM DMA
4 - scroll
5 - game clock (just 16-bit increment to a reserved ram location)
6 - famitone update
7 - RTI
This is an awful idea. VRAM updates and OAM DMA must happen during the short VBlank time, which happens directly after NMI.
This is indeed a terrible program structure (which unfortunately appears to be spread by the Nerdy Nights tutorials), mainly because it puts the game logic, the slowest part of any game, ahead of the VRAM/OAM updates, meaning you basically have to do everything in just 20 or so scanlines, which is ridiculous. This is particularly evil because it appears to work at first, when there's little to no game logic, but as soon as anything more complex than a single moving sprite is implemented, things start to glitch, and less experienced programmers become understandably confused.
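
To put rough numbers on "20 or so scanlines", here is a small C sketch of the NTSC budget. It assumes 20 vblank scanlines of 341 PPU dots each, 3 PPU dots per CPU cycle, and about 513 cycles for the OAM DMA itself; `logic_cycles` is a hypothetical cost for whatever code runs before the DMA, and interrupt entry overhead is ignored.

```c
#include <stdbool.h>

#define VBLANK_SCANLINES   20   /* NTSC */
#define DOTS_PER_SCANLINE  341
#define DOTS_PER_CPU_CYCLE 3
#define OAM_DMA_CYCLES     513  /* 513 or 514 depending on alignment */

/* CPU cycles between the start of vblank and the end of vblank (rounded down). */
static int vblank_cpu_cycles(void)
{
    return VBLANK_SCANLINES * DOTS_PER_SCANLINE / DOTS_PER_CPU_CYCLE;
}

/* True if code costing logic_cycles can run first and the DMA
   still finishes inside vblank. */
static bool dma_fits_after(int logic_cycles)
{
    return logic_cycles + OAM_DMA_CYCLES <= vblank_cpu_cycles();
}
```

The whole window is about 2273 cycles under these assumptions, so `dma_fits_after(1000)` is true while `dma_fits_after(2000)` is already false, and that is before any VRAM updates are counted.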

But even when done right (PPU updates first, everything else later), the "all in NMI" approach has its pitfalls, the main one being the handling of lag frames, which is possible but not very intuitive. Game states (different game loops) can also be slightly more annoying to implement without a main thread.
dougeff
Posts: 3079
Joined: Fri May 08, 2015 7:17 pm

Re: NES games and differences in input reaction

Post by dougeff »

When I first started, I greatly underestimated how sensible Shiru's neslib was.

Nmi:
Oam update
Palette update (if needed) (from a buffer)
VRAM update (if needed) (from a buffer)
Set scroll
(Increase frame counter)
Music

Main loop:
(wait for frame counter to tick up)
Controller read
Game logic
Clear oam buffer
Fill oam buffer
Fill vram buffer, or just change a pointer to a new array
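
A rough host-side C sketch of that split, meant only to show the handshake between the two threads. Everything in it is a stand-in: `ppu_oam` plays the role of the PPU's OAM, `nmi_handler` is called by hand instead of by hardware, and none of the names are neslib's.

```c
#include <stdint.h>
#include <string.h>

static uint8_t oam_buffer[256];        /* RAM copy, written by the main loop */
static uint8_t ppu_oam[256];           /* stand-in for the PPU's OAM */
static volatile uint8_t frame_counter; /* bumped once per NMI */

/* NMI: only time-critical PPU work, using data prepared earlier. */
static void nmi_handler(void)
{
    memcpy(ppu_oam, oam_buffer, sizeof ppu_oam); /* plays the role of OAM DMA */
    /* palette/VRAM buffer flush and scroll writes would go here */
    frame_counter++;
    /* music update would go here */
}

/* Main loop body: one pass of logic that fills the buffers for the next NMI. */
static void run_logic(uint8_t pad)
{
    memset(oam_buffer, 0xFF, sizeof oam_buffer); /* Y = $FF hides a sprite */
    oam_buffer[0] = 120;                         /* sprite 0 Y */
    oam_buffer[3] = (uint8_t)(100 + (pad & 1));  /* sprite 0 X reacts to input */
}

/* On the NES this would spin until the NMI changes frame_counter.
   Here the "interrupt" is invoked by hand so the sketch runs on a PC. */
static void wait_for_nmi(void)
{
    uint8_t last = frame_counter;
    while (frame_counter == last)
        nmi_handler();
}
```

The point of the design is that `run_logic` can take as long as it needs; the picture only changes when the next NMI copies a finished buffer.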
nesdoug.com -- blog/tutorial on programming for the NES

Re: NES games and differences in input reaction

Post by Bregalad »

This is not just Shiru's neslib, but the standard way of doing things. My game engine does it precisely like that (it also ticks the random number generator in NMI).

Re: NES games and differences in input reaction

Post by tokumaru »

Bregalad wrote:This is not just Shiru's neslib, but the standard way of doing things. My game engine does it precisely like that (it also ticks the random number generator in NMI).
I personally like the idea of ticking the RNG repeatedly in the main thread while waiting for vblank. Since each frame takes a variable amount of time to compute depending on a number of factors, the RNG is clocked a different number of times each frame, improving randomness. And it's not like you'd be using that time for anything else, so why not?
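
A minimal C sketch of the idea. The 16-bit Galois LFSR is an assumption chosen for illustration (any generator with a cheap step would do), and `idle_iterations` is a hypothetical stand-in for however many times the wait loop happens to spin in a given frame.

```c
#include <stdint.h>

static uint16_t rng_state = 0xACE1; /* any non-zero seed */

/* One step of a 16-bit Galois LFSR (toggle mask 0xB400, period 65535). */
static void rng_tick(void)
{
    uint16_t lsb = rng_state & 1u;
    rng_state >>= 1;
    if (lsb)
        rng_state ^= 0xB400u;
}

/* Stand-in for the vblank wait: the loop count models the time left over
   in the frame, which depends on how long the logic took. */
static void wait_for_vblank(unsigned idle_iterations)
{
    while (idle_iterations--)
        rng_tick();
}
```

Because the leftover time differs from frame to frame, the generator lands on a different state than it would with a fixed one tick per frame, even if the game's inputs are identical.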

Re: NES games and differences in input reaction

Post by nesrocks »

I posted the whole thing because I only realized later on that it was FCEUX's fault, but there's still the interesting research on input lag in other games. I'd like to know why there is so much input lag in well-programmed games. Is it some sort of input buffer? Take Pinball, for example: I can't imagine they decided "hmm... this paddle is reacting too quickly, let's make it wait a few frames before it moves". How can Double Dragon sometimes take 11 frames to react? Keeping the input around for so long seems like a waste of memory?

Anyway, I knew this would happen, such a big post starting with a bad idea. I know it is a bad idea! I even said so in the post
nesrocks wrote:So it is a bad idea
The thing is, FCEUX doesn't seem to emulate OAM decay? So it doesn't tell me it is a bad idea. I had to test it on other things.

It shouldn't matter, because the docs are pretty specific about doing the OAM DMA first. But I wanted to see what happened when I did otherwise anyway.

About my game loop, it has always been an all-in-NMI approach with the OAM DMA in the right place. I switched for this test only. I do have a main loop, it just doesn't do anything. When I try to do as dougeff said (which I had tried before), the game either runs too fast (if I put my game logic in the forever loop), or it doesn't show anything on screen (if I put it in the vblank wait loop). I haven't debugged this yet; I will, don't worry. I was just satisfied with the all-in-NMI approach, but if it is not ideal I'll change it. edit: fixed it, now the game runs in the main loop. I just need to create the palette buffer.
Last edited by nesrocks on Wed Sep 13, 2017 7:05 am, edited 1 time in total.

Re: NES games and differences in input reaction

Post by Bregalad »

tokumaru wrote: I personally like the idea of ticking the RNG repeatedly in the main thread while waiting for vblank. Since each frame takes a variable amount of time to compute depending on a number of factors, the RNG is clocked a different number of times each frame, improving randomness. And it's not like you'd be using that time for anything else, so why not?
Sounds like a very Konami-ish way to go ;)
nesrocks wrote:About my game loop, it has always been an all-in-NMI approach with the OAM DMA in the right place.
There's nothing inherently wrong with an all-in-NMI approach, I just don't find it very intuitive to program that way myself. For example, if you want to fade the palette out, typically you want to darken the palette, wait a couple of frames, darken it again, and so on, in a loop. You need to wait for the next NMI every time. Doing that with an all-in-NMI approach is hard: you need to detect that you are in a "fading palette" state, and then detect whether you're still waiting, or the wait is over and you should darken the palette, or everything is done and you can continue. In main code the program counter keeps track of all that automatically, whereas inside the NMI you have to do it manually. But if this is more intuitive or easier for you, by all means go for it.
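
For illustration, the main-thread version can be sketched in C as plain straight-line code. This is a host-side model under stated assumptions: `wait_frame` stands in for "sleep until the next NMI has run", the example palette is arbitrary, and darkening by subtracting $10 per step (clamping to black, $0F) is one common convention rather than the method of any particular engine.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t palette[4] = { 0x30, 0x21, 0x16, 0x0F }; /* example colors */
static unsigned frames_waited;                          /* for the model only */

/* Stand-in for "sleep until the next NMI has run". */
static void wait_frame(void) { frames_waited++; }

/* One darkening step: drop one brightness row ($10), clamp to black ($0F). */
static uint8_t darken(uint8_t color)
{
    if (color < 0x10)
        return 0x0F;
    color -= 0x10;
    return (color == 0x0D) ? 0x0F : color; /* $0D is best avoided */
}

/* Straight-line fade-out: the position in this loop is the fade state. */
static void fade_out(void)
{
    for (int step = 0; step < 4; step++) {
        for (size_t i = 0; i < sizeof palette; i++)
            palette[i] = darken(palette[i]);
        wait_frame();
        wait_frame();
    }
}
```

An all-in-NMI engine would need explicit variables for `step` and for the two-frame wait, plus a check on every NMI to decide which branch applies; here the loop counters and the call stack carry that information.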

Re: NES games and differences in input reaction

Post by nesrocks »

Well... I already programmed a fade system (fade in and fade out), and it steps every 2 frames (adjustable). I even use the emphasis bits to add in-between tones. It worked right off the bat when I moved the logic to the main loop just now. edit: I guess this only works because I run the code once and then wait for vblank again, so it still behaves like the NMI approach. I'll study this further.

Code: Select all

lda fadestate
beq loop     ; fadestate = 0: no fade in progress, branch to the logic loop
bne fadepal  ; otherwise (always taken here) run the fade manager instead of the game loop;
             ; it picks the fade from fadestate and sets a new fadestate when that fade is done
And I don't clear the whole OAM buffer before writing to it like dougeff said, I only clear the trailing bytes. So I don't write twice. I take the oam position and fill the rest with #$FF after I'm done filling it with actual sprites. And I am filling the buffer in a different order every frame so I already have sprite flicker covered (for 8 sprites on the scanline). I even have the main character use a reserved section of the buffer so he never flickers.
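
In C terms, the trailing clear could look roughly like this. It is a sketch rather than the ca65 code described above, and `hide_unused_sprites` and `oam_used` are made-up names.

```c
#include <stdint.h>
#include <string.h>

#define OAM_SIZE 256

/* Hide every sprite slot after the last one written this frame.
   oam_used is the byte offset just past the last real sprite
   (a multiple of 4, since each sprite takes 4 bytes). */
static void hide_unused_sprites(uint8_t *oam_buffer, size_t oam_used)
{
    if (oam_used < OAM_SIZE)
        memset(oam_buffer + oam_used, 0xFF, OAM_SIZE - oam_used);
}
```

With two sprites in use, `hide_unused_sprites(buf, 8)` leaves bytes 0 to 7 alone and writes $FF to the remaining 248, so no byte is written twice in a frame.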

Re: NES games and differences in input reaction

Post by Bregalad »

Oh by the way I forgot to address that:
nesrocks wrote:How can double dragon 1 sometimes take 11 frames to react?
The Double Dragon games use the button combination A+B to jump. It is unrealistic to require the player to press both buttons on exactly the same frame, so when the game detects that A or B is pressed, it very likely waits a few frames to see whether the player presses the other button; if so, the main character jumps, and if not, it punches or kicks. 11 frames still sounds like quite a lot, are you sure it's not less? I'd say it's probably about 5 frames, although this must be verified. Also, one frame of delay is unavoidable, because there's one frame between when the game runs the logic and when the result is actually shown on screen.
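
A C sketch of what such a wait window might look like. This is a guess at the mechanism, not something disassembled from the game: the window length, the button bit values and all names are assumptions, and the model ignores edge detection and button release for brevity.

```c
#include <stdint.h>

#define BTN_A 0x80        /* bit layout assumed for this model */
#define BTN_B 0x40
#define CHORD_WINDOW 4    /* frames to wait for the second button (a guess) */

enum action { ACT_NONE, ACT_ATTACK, ACT_JUMP };

static uint8_t chord_timer; /* frames left in the window, 0 = idle */

/* Call once per frame with the buttons currently held. */
static enum action poll_action(uint8_t held)
{
    uint8_t ab = held & (BTN_A | BTN_B);

    if (ab == (BTN_A | BTN_B)) {   /* both down: jump at once */
        chord_timer = 0;
        return ACT_JUMP;
    }
    if (chord_timer == 0) {
        if (ab)                    /* first of the two buttons seen */
            chord_timer = CHORD_WINDOW;
        return ACT_NONE;
    }
    if (--chord_timer == 0)        /* window expired with one button held */
        return ab ? ACT_ATTACK : ACT_NONE;
    return ACT_NONE;
}
```

In this model a lone A press produces nothing for four polls and an attack on the fifth, which is the kind of built-in delay that would show up as extra frames in a frame-advance count.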

Re: NES games and differences in input reaction

Post by nesrocks »

Bregalad wrote:The Double Dragon games use the button combination A+B to jump. It is unrealistic to require the player to press both buttons on exactly the same frame, so when the game detects that A or B is pressed, it very likely waits a few frames to see whether the player presses the other button
Very interesting, I hadn't thought of that. The thing is, it doesn't just wait that long for A or B, it waits that long for any d-pad button too. That would have been unnecessary, imo.
Bregalad wrote:11 frames still sounds like quite a lot, are you sure it's not less?
As I said in the original post, it is 3 at a minimum, and sometimes more, up to 11.
Bregalad wrote:Also, one frame of delay is unavoidable, because there's one frame between when the game runs the logic and when the result is actually shown on screen.
Yes, I had already accounted for this in the original post. See the list of games that take 2 frame advances to react.

Re: NES games and differences in input reaction

Post by dougeff »

Re:input lag

Consider that some games might be doing acceleration at the sub-pixel level. So, it might only be moving the main character 0.3 pixels for the first few frames...making it appear like no movement has taken place.
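
A small C model of that effect, assuming an 8.8 fixed-point X position and a constant speed of 77/256 (about 0.3) pixels per frame; the numbers and names are illustrative only.

```c
#include <stdint.h>

/* Position in 8.8 fixed point: high byte = pixel, low byte = 1/256 pixel. */
static uint16_t pos_x = 100u << 8;

#define SPEED 77 /* about 0.3 pixel per frame (77/256) */

/* Advance one frame with Right held; returns the on-screen pixel. */
static uint8_t step_right(void)
{
    pos_x += SPEED;
    return (uint8_t)(pos_x >> 8);
}

/* Frames of logic needed before the sprite's pixel position changes. */
static int frames_until_visible_move(void)
{
    uint8_t start = (uint8_t)(pos_x >> 8);
    int frames = 0;
    do {
        frames++;
    } while (step_right() == start);
    return frames;
}
```

With these numbers the sprite's pixel position first changes on the fourth logic frame, even though the input was acted on from the very first one.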

Re: NES games and differences in input reaction

Post by tokumaru »

I once came to the conclusion that the game logic could be simplified a bit if video rendering lagged behind by one frame. There are certain inconsistencies involving object interactions and the camera (e.g. the camera follows the player, whose position can be modified by obstacles, which in turn need the final position of the camera to be drawn) that can be avoided if you render everything one frame late, when all objects are guaranteed to have their final positions and states. There are of course solutions that don't require any lag, but they often require more CPU time (e.g. visiting objects more than once per frame).