Page 1 of 2

DMC-fortified controller read routine

Posted: Thu May 22, 2008 5:02 pm
by blargg
Here's a controller read routine that is reliable even when the DMC is running. There is a safe version that takes 457 clocks, and a faster unsafe one that takes 137 clocks. Test ROMs + ca65 source:

read_joy.zip

The safe version reads three times in a row using the fast version, then compares the last two. If equal, that result is returned, otherwise the first read is returned. It doesn't need to check it in that case because the DMC can't clash with more than one of the reads, due to them being so closely spaced. The routine is timed so that the same number of clocks is used in each case.

The tests run the DMC at maximum rate and repeatedly read from the controller, printing an X when the DMC clashed with that read. A clash causes the fast version to give an erroneous result (left screenshot), but doesn't affect the safe version (right screenshot). Since the safe version compares two of the three reads it does, as expected it has twice the number of clashes (3.0% versus 1.3% for fast).

Image Image

Posted: Thu May 22, 2008 7:58 pm
by tepples
Interesting. Can this be adapted to read both controllers, or would that take so long as to make a glitch that hits two reads?

Posted: Thu May 22, 2008 8:52 pm
by blargg
Just make a second version that reads the other controller. Do games which use both controllers use some kind of optimized read-both-at-once routine or something? I imagine such a thing would suffer from the possibility of two corrupt reads, one during the first, and the other during the third. Even if this would work, it seems it'd only save around 120 clocks total.

Posted: Fri May 23, 2008 11:58 am
by dvdmth
The games I looked at read each controller separately. The controller is read multiple times until two identical results are returned for that controller.

I might point out, though it is not related to this discussion, that it is "proper" to read both bit 0 and bit 1 from each $4016/4017 access, combining the two to produce the final controller data. This is for the benefit of Famicom users who are using controllers that plug in to the expansion port instead of the normal ports. Every commercial game I've looked at acknowledges inputs on both bit 0 and bit 1 of $4016/4017.

Posted: Fri May 23, 2008 12:48 pm
by tepples
dvdmth wrote:Every commercial game I've looked at acknowledges inputs on both bit 0 and bit 1 of $4016/4017.
Even PAL games that were never released in NTSC-land?

Posted: Sat May 24, 2008 6:55 am
by ccovell
No, basically all JP-made games, and perhaps some (but not many) US/UK-developed games acknowledge the extra FC controller. This has made PowerPak playing on the Famicom a little bit annoying with a rapid-fire controller.

Posted: Sat May 24, 2008 9:52 am
by Jeroen
Remember kids, Pal =/= japanese. They use ntsc also known as crappyness or never the same color.

Posted: Sat May 24, 2008 10:13 am
by blargg
How can both bits be efficiently merged without taking more than 432 clocks to read the controllers three times? Or perhaps it can go over and avoid conflicts on more than one read by having the $4016 reads at just the right timing? That'd be more tricky to code and test thoroughly.

EDIT: Figured out how to OR both bits and only add 2 clocks per iteration! Just change the LSR A to AND #$03, CMP #$01. This general technique can OR any number of bits from A into carry. Just mask off the bits, then CMP #1 (to have the opposite carry - set if any of the masked bits are zero - CMP #mask instead).

Code: Select all

loop:
lda $4016   ; 4 bits 0 and 1 contain relevant data
and #$03    ; 2
cmp #$01    ; 2 carry = bit 0 OR bit 1
ror <temp   ; 5
bcc loop    ; 3

Posted: Sat May 24, 2008 12:47 pm
by dvdmth
I've seen different methods for combining bits 0 and 1, the best of which is the AND #3 : CMP #1 method already posted. I've also seen a pair of LSR instructions with a BCS in between (to skip the second LSR if bit 0 was set). Nintendo's games use a much less efficient way, something like:

Code: Select all

lda $4016
lsr
rol <$00
lsr
rol <$01
Then they OR bytes $00 and $01 after the read is complete. Interestingly, when they do their DPCM interference check, they only look at $00, not $01, so if nothing is plugged in the normal Famicom port (causing nothing but zeroes to be sent to $4016 bit 0), the interference check will never catch any errors.

Posted: Sat May 24, 2008 2:51 pm
by tepples
In blargg's algorithm, what happens if the player presses or releases a button sometime between read 1 and read 3?

Posted: Sat May 24, 2008 4:51 pm
by Anders_A
If the player pressed a key between pass 2 and 3 it would register as not pressed.

But really, this is probably going to be run every frame and a 1/60th of a second delay for input to register isn't going to be noticable.

Posted: Sat May 24, 2008 6:40 pm
by tepples
So what if there were a DPCM fetch in read 1 and a press before read 3? (I'm trying to determine why Nintendo didn't just use this algorithm.)

Posted: Sat May 24, 2008 8:31 pm
by dvdmth
Yeah, that is an issue. If all three read attempts give different results, there's no way to tell with certainty which was caused by the player pressing/releasing a button and which was caused by DMA interference, so you don't know which of the three to use. Probability of this happening is extremely rare, though.

Posted: Sat May 24, 2008 8:37 pm
by tepples
If A != B and B != C and A != C, just use the previous frame's presses. Would that work?

Posted: Sat May 24, 2008 8:42 pm
by blargg
Here's an updated version that should work correctly for Famicom external controllers. I thoroughly tested it at maximum DMC rate and it passes. The three reads now take longer than 432 clocks, but the reads are spaced so that the DMC cannot corrupt more than one. Inserting an 8+16*n (n >= 0) clock delay after the first read causes it to fail, confirming that they are spaced properly. I also added a controller button test.

read_joy2.zip

I also finally got around to doing the analysis of the case where the DMC corrupts the first read, and the controller input changes during the the second two reads. In this case, the corrupted first read will be returned, rather than one of the two correct reads from the controller.

There are 8 opportunities for DMC corruption of the first read, and the window for the controller change during the second two is 162 clocks. That means that the DMC corruption at maximum rate has an 8 in 29780 = 1 in ~3722 chance in a given frame, and the controller change a 162 in 29780 = 1 in ~184 chance in a given frame.

Assuming the controller were changing at a random time each frame, and the DMC were running, that makes the chance of both occurring in a given frame 1 in ~684848. So there would be an average of one of these errors every 3.2 hours at this worst-case setup. If the controller input were changing on average 10 times per frame, that would put one error every 19 hours.

I guess I'll try writing and analyzing a version that reads until two consecutive reads give the same value. The main problem is that this takes a varying amount of time, and would hang if someone fed the controller a very rapid turbo signal.