DMC-fortified controller read routine
Here's a controller read routine that is reliable even when the DMC is running. There is a safe version that takes 457 clocks, and a faster unsafe one that takes 137 clocks. Test ROMs + ca65 source:
read_joy.zip
The safe version reads three times in a row using the fast version, then compares the last two reads. If they are equal, that result is returned; otherwise the first read is returned. The first read needs no check in that case, because the three reads are spaced so closely that the DMC can't clash with more than one of them. The routine is timed so that it uses the same number of clocks in either case.
The tests run the DMC at maximum rate and repeatedly read from the controller, printing an X when the DMC clashed with that read. A clash causes the fast version to give an erroneous result (left screenshot), but doesn't affect the safe version (right screenshot). Since the safe version compares two of the three reads it does, as expected it has roughly twice the number of clashes (3.0% versus 1.3% for fast).
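The selection rule described above can be modelled in a few lines. This is only a sketch in Python, not the timed 6502 code from read_joy.zip: `safe_read`, `fast_read` and `simulate` are hypothetical names, and timing is ignored entirely. It assumes, as the post does, that at most one of the three reads can be corrupted.

```python
def safe_read(fast_read):
    """Model of the safe routine's selection rule (not its timing).

    fast_read is a hypothetical callable standing in for the fast,
    unprotected 8-bit controller read.
    """
    first = fast_read()
    second = fast_read()
    third = fast_read()
    # If the last two reads agree, return that value.
    if second == third:
        return third
    # Otherwise one of the last two was corrupted. Assuming at most one
    # of the three reads can be hit, the first read must be clean.
    return first


def simulate(true_value, corrupt_index, corrupt_value):
    """Run safe_read with at most one of the three reads corrupted."""
    reads = [true_value] * 3
    if corrupt_index is not None:
        reads[corrupt_index] = corrupt_value
    it = iter(reads)
    return safe_read(lambda: next(it))
```

Whichever single read is corrupted (or none), `simulate` returns the true value, which is why the first read never needs a check of its own.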
Just make a second version that reads the other controller. Do games which use both controllers use some kind of optimized read-both-at-once routine or something? I imagine such a thing would suffer from the possibility of two corrupt reads, one during the first, and the other during the third. Even if this would work, it seems it'd only save around 120 clocks total.
The games I looked at read each controller separately. The controller is read multiple times until two identical results are returned for that controller.
I might point out, though it is not related to this discussion, that it is "proper" to read both bit 0 and bit 1 from each $4016/4017 access, combining the two to produce the final controller data. This is for the benefit of Famicom users who are using controllers that plug in to the expansion port instead of the normal ports. Every commercial game I've looked at acknowledges inputs on both bit 0 and bit 1 of $4016/4017.
How can both bits be efficiently merged without taking more than 432 clocks to read the controllers three times? Or perhaps it can go over and avoid conflicts on more than one read by having the $4016 reads at just the right timing? That'd be more tricky to code and test thoroughly.
EDIT: Figured out how to OR both bits and only add 2 clocks per iteration! Just change the LSR A to AND #$03, CMP #$01. This general technique can OR any number of bits from A into carry: mask off the bits, then CMP #1, which sets carry if any of the masked bits is set. To AND them instead, use CMP #mask: carry is then set only if all of the masked bits are set, and clear if any of them is zero.
Code:
loop:
        lda $4016       ; 4  bits 0 and 1 contain relevant data
        and #$03        ; 2
        cmp #$01        ; 2  carry = bit 0 OR bit 1
        ror <temp       ; 5
        bcc loop        ; 3
Last edited by blargg on Sat May 24, 2008 6:33 pm, edited 1 time in total.
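For anyone who wants to check the carry trick without an assembler, here is a small Python model of it. It is a sketch under assumptions: the function names are made up, and `temp` is assumed to start at $80 (the post doesn't show the initialization) so that the sentinel bit drops into carry after exactly eight rotations and ends the loop.

```python
def carry_after_and_cmp(a, mask, operand):
    """Carry flag after AND #mask followed by CMP #operand.

    On the 6502, CMP sets carry when the accumulator is >= the operand.
    """
    return int((a & mask) >= operand)


def read_byte(port_values, mask=0x03):
    """Model of the loop: OR the masked bits into carry, then ROR temp.

    Assumes temp starts at $80 (not shown in the post) so the sentinel
    bit falls out into carry after exactly eight rotations.
    """
    temp = 0x80
    values = iter(port_values)
    while True:
        carry = carry_after_and_cmp(next(values), mask, 0x01)
        carry_out = temp & 1               # ROR: bit 0 goes to carry
        temp = (carry << 7) | (temp >> 1)  # old carry enters bit 7
        if carry_out:                      # BCC loop falls through
            return temp
```

In this model the first bit read ends up in bit 0 of the result and the eighth in bit 7. With `operand` equal to 1 the carry is the OR of the masked bits; with `operand` equal to the mask it is their AND.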
I've seen different methods for combining bits 0 and 1, the best of which is the AND #3 : CMP #1 method already posted. I've also seen a pair of LSR instructions with a BCS in between (to skip the second LSR if bit 0 was set). Nintendo's games use a much less efficient way, something like:
Code:
        lda $4016
        lsr
        rol <$00
        lsr
        rol <$01
Then they OR bytes $00 and $01 after the read is complete. Interestingly, when they do their DPCM interference check, they only look at $00, not $01, so if nothing is plugged in the normal Famicom port (causing nothing but zeroes to be sent to $4016 bit 0), the interference check will never catch any errors.
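To make the blind spot in that interference check concrete, here is a rough Python model of the two-byte method. The function names are hypothetical, and the check is modelled only as far as the post describes it (comparing byte $00 between two reads and ignoring byte $01).

```python
def read_two_streams(port_values):
    """Model of the LSR/ROL pairs: bit 0 -> byte $00, bit 1 -> byte $01.

    Eight ROLs shift out whatever the bytes held before, so their
    initial contents do not matter.
    """
    byte0 = byte1 = 0
    for value in port_values[:8]:
        byte0 = ((byte0 << 1) | (value & 1)) & 0xFF         # lsr / rol <$00
        byte1 = ((byte1 << 1) | ((value >> 1) & 1)) & 0xFF  # lsr / rol <$01
    return byte0, byte1


def combined(byte0, byte1):
    """Final controller byte: the two streams ORed together."""
    return byte0 | byte1


def check_sees_mismatch(read_a, read_b):
    """Interference check as described: compares byte $00 only."""
    return read_a[0] != read_b[0]
```

With only an expansion-port controller attached, every read has bit 0 clear, so byte $00 is always zero and `check_sees_mismatch` returns False even when the two combined results differ. Note that with ROL the first bit read ends up in bit 7, the reverse of the ROR loop.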
Yeah, that is an issue. If all three read attempts give different results, there's no way to tell with certainty which was caused by the player pressing/releasing a button and which was caused by DMA interference, so you don't know which of the three to use. This is extremely unlikely to happen, though.
Here's an updated version that should work correctly for Famicom external controllers. I thoroughly tested it at maximum DMC rate and it passes. The three reads now take longer than 432 clocks, but the reads are spaced so that the DMC cannot corrupt more than one. Inserting an 8+16*n (n >= 0) clock delay after the first read causes it to fail, confirming that they are spaced properly. I also added a controller button test.
read_joy2.zip
I also finally got around to analyzing the case where the DMC corrupts the first read and the controller input changes during the last two reads. In this case, the corrupted first read will be returned, rather than one of the two correct reads from the controller.
There are 8 opportunities for DMC corruption of the first read, and the window for the controller change during the second two is 162 clocks. That means that the DMC corruption at maximum rate has an 8 in 29780 = 1 in ~3722 chance in a given frame, and the controller change a 162 in 29780 = 1 in ~184 chance in a given frame.
Assuming the controller were changing at a random time each frame, and the DMC were running, that makes the chance of both occurring in a given frame 1 in ~684848. At about 60 frames per second, that is an average of one of these errors every 3.2 hours in this worst-case setup. If the controller input were changing on average 10 times per second instead, that would put one error every 19 hours.
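The arithmetic behind these figures can be reproduced with a short Python sketch. The clock counts are the ones given in the post; the 60 frames per second rate and the treatment of the two events as independent are assumptions of this sketch. The 19-hour figure corresponds to ten controller changes per second, i.e. one change every six frames.

```python
CLOCKS_PER_FRAME = 29780    # from the post
DMC_SLOTS = 8               # clocks at which the DMC can corrupt read 1
CHANGE_WINDOW = 162         # clocks spanned by the last two reads
FRAMES_PER_SECOND = 60      # assumption: approximate NTSC frame rate


def hours_between_errors(changes_per_frame=1.0):
    """Expected hours between wrong results, treating both events as independent."""
    p_dmc = DMC_SLOTS / CLOCKS_PER_FRAME
    p_change = changes_per_frame * CHANGE_WINDOW / CLOCKS_PER_FRAME
    frames = 1 / (p_dmc * p_change)
    return frames / FRAMES_PER_SECOND / 3600


worst_case = hours_between_errors()                            # one change every frame
ten_per_second = hours_between_errors(10 / FRAMES_PER_SECOND)  # one change every six frames
```

The unrounded product of the two probabilities is about 1 in 684,300 per frame; the slightly larger figure in the post comes from multiplying the rounded values 3722 and 184.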
I guess I'll try writing and analyzing a version that reads until two consecutive reads give the same value. The main problem is that this takes a varying amount of time, and would hang if someone fed the controller a very rapid turbo signal.
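As a rough illustration of that read-until-stable idea and its two drawbacks, here is a Python model. It is not the planned routine: `read_until_stable` and `fast_read` are hypothetical names, and the `max_reads` cap is an assumed safeguard added for this sketch, not something proposed in the thread.

```python
def read_until_stable(fast_read, max_reads=None):
    """Read until two consecutive reads agree.

    fast_read is a hypothetical stand-in for the unprotected read.
    max_reads is an assumed safeguard, not part of the post: without it
    (None) the loop never ends if the input never repeats.
    Returns (value, number_of_reads).
    """
    previous = fast_read()
    count = 1
    while max_reads is None or count < max_reads:
        current = fast_read()
        count += 1
        if current == previous:
            return current, count
        previous = current
    return previous, count   # gave up: last read, unverified
```

A steady input finishes in two reads and a single corrupted read costs one extra, so the running time varies. An input that alternates on every read never produces a match, and only the cap makes the loop return.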