As doppelganger's excellent disassembly of Super Mario Bros. 1 is based on the original Japanese/US version, I have been pondering how to disassemble the European version, mainly to find all the differences compared to the Japanese version (which I suspect are a bunch of timing value differences, since it has to compensate for running at 50 fps instead of 60 fps) but also to see if some other portions of code have changed.
The question is how to do this effectively (as we're talking about 32k of code/data tables, which is a lot!)?
My disassembly experience is somewhat limited: I don't really know how capable disassemblers such as IDA are at separating data from code automatically, labeling things, etc. I do have a few years of 6502 coding experience, though. So far I've just played around with tools such as Y0shi's tracer.exe to see what source material doppelganger started out with.
Doing a hex diff of both versions (between $8000-$ffff) shows that much of the data is the same. The parts that really differ are probably a lot fewer than a simple hex diff shows, though, since all absolute jumps (or the values that the JumpEngine uses) or even whole subroutines will differ even if the actual code in the routines is the same. This is because they are assembled at a slight offset compared to the Japanese version. Not strange really, but it makes things more complicated.
Ideally, having an editor with three open tabs (doppelganger's source, a traced version of the Japanese version, and a traced version of the European version) could be a starting point. As doppelganger's version doesn't contain the absolute memory positions anymore (but will still assemble to the same binary), it's hard to follow the link between these files.
So yeah, basically, how would you guys go about it? What ideas spring to mind, and what tools would be worth looking into?
Beyond Compare does try to find similarities, even if assembled at different offsets. But... as I discovered with this, it doesn't try very hard by default. To fix this, click the little referee (rules) and change Comparison/Alignment to "Complete". After that it seems to do pretty well finding things that are off by a few bytes. It's very visual and will show areas with differences on a scroll bar. You can also show only differences, and go to the next different bytes with a hot key.
This is a report from the program about how different the files are:
39722 same byte(s)
120 left orphan byte(s)
120 right orphan byte(s)
1134 difference byte(s)
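For anyone without Beyond Compare, a similar kind of report can be sketched in Python with the standard library's `difflib.SequenceMatcher`. This is only an approximation under my own assumptions: it is not Beyond Compare's algorithm, the function name `diff_report` and the category keys are made up for illustration, and the counts it produces will not necessarily match the numbers above.

```python
from difflib import SequenceMatcher

def diff_report(left: bytes, right: bytes) -> dict:
    """Count same / orphan / differing bytes between two binaries.

    Hypothetical helper: the categories mimic the report above, but the
    alignment is whatever difflib finds, not Beyond Compare's.
    """
    # autojunk=False so that frequent byte values are not ignored as "junk"
    sm = SequenceMatcher(None, left, right, autojunk=False)
    report = {"same": 0, "left_orphan": 0, "right_orphan": 0, "difference": 0}
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            report["same"] += i2 - i1
        elif tag == "delete":          # bytes only present in the left file
            report["left_orphan"] += i2 - i1
        elif tag == "insert":          # bytes only present in the right file
            report["right_orphan"] += j2 - j1
        else:                          # "replace": pair bytes up, excess is orphaned
            paired = min(i2 - i1, j2 - j1)
            report["difference"] += paired
            report["left_orphan"] += (i2 - i1) - paired
            report["right_orphan"] += (j2 - j1) - paired
    return report

if __name__ == "__main__":
    a = b"ABCDEFGH" + b"12345678"
    b = b"ABCDEFGH" + b"xx" + b"1234X678"   # two inserted bytes, one changed byte
    print(diff_report(a, b))
```

On two full 32 KiB PRG images this may take a while, since `SequenceMatcher` is not designed for large binaries.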
If you can manage to get doppelganger's disassembly to assemble, most assemblers create a listing file that shows the address where each label was assembled to.
If not, you can try NESRevPlus to create a quick disassembly, and then compare it with doppelganger's.
But I've never done anything like this before, so who knows how helpful this was. Good luck, in any case.
For anyone interested in the workflow: I assembled doppelganger's source with x816 and use the LST file that was generated (which contains memory addresses, hex values and comments). The LST file helps me keep track of where the differences should be noted in the actual source. I have this file and the original doppelganger smbdis.asm open in my editor.
I also prepared the European and Japanese binaries: I stripped the headers and the gfx/CHR data and prepended $8000 bytes of empty padding, just so the actual code starts at offset $8000 and is easy to follow. Using Beyond Compare (thanks for the tip) I'm manually working through all the difference marks. Some of them I can easily discard by looking at the preceding opcode (often jmp and jsr instructions whose target addresses are offset by a small amount), but whenever I find something out of the ordinary, I switch to my editor, look up that address in the LST file, examine the code or data and note this in the doppelganger smbdis.asm.
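The preparation and the "discard jmp/jsr differences" step could be scripted roughly as follows. This is a minimal Python sketch, not the procedure actually used above: the function names and the `max_shift` threshold are hypothetical, the header handling assumes a standard iNES image (16-byte header, PRG size in 16 KiB units at byte 4, optional 512-byte trainer), and the relocation check is only a heuristic that assumes both files are being compared at the same offset.

```python
JMP_ABS, JSR_ABS = 0x4C, 0x20   # 6502 opcodes for JMP absolute and JSR absolute

def extract_prg(rom: bytes) -> bytes:
    """Return only the PRG ROM of an iNES image, dropping header and CHR."""
    if rom[:4] != b"NES\x1a":
        raise ValueError("not an iNES image")
    prg_size = rom[4] * 16 * 1024
    start = 16 + (512 if rom[6] & 0x04 else 0)   # skip trainer if present
    return rom[start:start + prg_size]

def pad_to_cpu_address(prg: bytes, base: int = 0x8000) -> bytes:
    """Prepend empty bytes so that file offset == CPU address."""
    return bytes(base) + prg

def looks_like_relocation(a: bytes, b: bytes, i: int, max_shift: int = 0x100) -> bool:
    """Heuristic: is the differing byte at offset i an operand byte of a
    JMP/JSR whose target only moved by a small amount?  A data byte that
    happens to equal $4C or $20 will fool this, so treat hits as hints."""
    for op in (i - 1, i - 2):          # i may be the low or the high operand byte
        if op < 0 or op + 2 >= len(a) or op + 2 >= len(b):
            continue
        if a[op] == b[op] and a[op] in (JMP_ABS, JSR_ABS):
            target_a = a[op + 1] | (a[op + 2] << 8)   # little-endian address
            target_b = b[op + 1] | (b[op + 2] << 8)
            if target_a != target_b and abs(target_a - target_b) <= max_shift:
                return True
    return False

if __name__ == "__main__":
    jp = bytes([0x20, 0x34, 0x90, 0xEA])   # JSR $9034 / NOP
    eu = bytes([0x20, 0x36, 0x90, 0xEA])   # JSR $9036 / NOP
    print(looks_like_relocation(jp, eu, 1))
```

Differences that this check does not flag are the ones worth looking up in the LST file.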
Let's see how it goes.
- doppelganger SMB1 disasm. with EU-version added.
A behavior related to springboards in areas with large numbers of enemies was changed, possibly to fix a buffer overflow. This ate up two unused bytes before a table used by the looping parts of 4-4, 7-4, and 8-4, but it might cause springboards to fail to spawn. Might this have anything to do with certain springboards in C-3 and D-2 of SMB2 (J)? See List of glitches in SMB2 (J).
There is a change to how VerticalForce is calculated at the end of a spring. This fix ate up five bytes, one after the gravity code, and a bunch in the flying fish overhaul.
ChkInj has a change that I thought was for a collision fix related to Blooper but is instead for when enemies get reloaded from their starting positions.*
At ChkNearPlayer:* Bounding box for Blooper is adjusted.
There are changes to level data to keep the player from getting stuck at the exit pipe* of all three water areas (2-2 and 7-2, the bonus in 5-2 and 6-2, and 8-4).
One Paratroopa's starting position is changed in 8-2.*
Bashing blocks changes the player's vertical speed differently in water levels. Super Mario All-Stars keeps this change.*
At ContChk: A 1-pixel change to detecting collision with floors. (Not sure if this is related to game speed or a bug fix; Super Mario All-Stars has the same difference between NTSC and PAL versions.*)
An unused song header was affected by movement of code.
Otherwise, most of the changes appear to be related to the systems' vertical scan rates or possibly to subjective retuning. Some changes may affect the angles at which things move, which might change the most appropriate path through a level.
- NTSC has 21 frames per IntervalTimerControl cycle; PAL: 18 frames. This is important to tool-assisted speedrunners because level transitions happen on ITC cycles, meaning an improved speedrun must improve the old record on a level by a whole ITC cycle. Incidentally, Thwaite has a similar quirk in its engine, arising from the "tenths" timer that ticks every 6 frames on NTSC and 5 frames on PAL.
- The initial VerticalForceDown is greater in PAL.
- Force tables used by movement are different, but climbing speed is unchanged for some reason.
- Player animation timers are different, as is the run/walk threshold used by the skidding code.
- Fireball speed is faster. It looks like the angles might be slightly different too, especially the maximum downward speed.
- NTSC has 25 frames per clock tick; PAL has 21.
- Bullet Bill is faster, but not quite 20 percent faster.
- Enemies animate faster.
- Hammers are faster, and they might be falling a lot faster.
- Block bounce (after a ? block is hit) is faster, and possibly slightly faster than it needs to be.
- The base speed for normal enemies is faster.
- A running speed threshold related to Spiny is faster, and possibly slightly faster than it needs to be.
- Firebars spin faster.
- Fish are faster.
- The threshold at which the flying fish aim themselves differently is faster, but not quite 20 percent faster.
- Green Paratroopas are faster, as are Hammer Bros.
- Squashed Goombas disappear sooner.
- Bloopers swim less often, and it appears they wait for Mario to get closer.
- Flying fish vertical movement was simplified dramatically.
- A running speed threshold related to Lakitu is faster.
- Time between Bowser's fiery belches was reduced significantly more than the expected 17%: usually 25% to 33%, and their speed appears to have been increased as well.
- The platform in Coin Heaven where the player lands on it and it moves to the right is faster.
- Kicked shells move faster.
- It looks like they intended to change the injury invincibility timer from eight units, as a 1-byte optimization (injury sound ID = 2 * old injury timer value) was removed, but I guess it was changed back because the programmer realized that this timer was in a block of timers measured in ITCs, not frames, and ITCs are already shorter on PAL. While investigating this, I noticed a further possible optimization that Nintendo missed: If GameTimerExpiredFlag is turned on before calling ForceInjury, a byte is saved.
- Koopa Troopas come out of their shells faster.
- The time before normal area changes is shorter.
- Change to springboard force.
- The time after entering a pipe is shorter.
- Enemy $09's bounding box is taller.
- Threshold for skid is faster.
- Coin grab tones are different.
- Note lengths and tone periods are different.
- Envelopes are NOT changed.
* Corrections per ShaneM on 2014-09-09
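As a rough check on the frame counts in the list above, the NTSC values can be scaled by 50/60 and compared with what the PAL version uses. This is a small Python sketch assuming nominal rates of exactly 60 and 50 fps (the real consoles deviate slightly); the function names and the table layout are my own, and only the frame counts come from the list.

```python
NTSC_FPS = 60   # nominal rate, assumed for illustration
PAL_FPS = 50    # nominal rate, assumed for illustration

def scale_frames(ntsc_frames):
    """Frame count that would take the same real time at the PAL rate."""
    return ntsc_frames * PAL_FPS / NTSC_FPS

def seconds(frames, fps):
    """Real-time duration of a frame-counted interval."""
    return frames / fps

# (name, NTSC frames, PAL frames) as stated in the list above
TIMERS = [
    ("IntervalTimerControl cycle", 21, 18),
    ("clock tick", 25, 21),
    ("Thwaite tenths timer", 6, 5),
]

def timer_table():
    """Return (name, ideal PAL frames, NTSC seconds, PAL seconds) rows."""
    return [
        (name, scale_frames(ntsc), seconds(ntsc, NTSC_FPS), seconds(pal, PAL_FPS))
        for name, ntsc, pal in TIMERS
    ]

if __name__ == "__main__":
    for name, ideal, t_ntsc, t_pal in timer_table():
        print(f"{name}: ideal PAL frames {ideal:.2f}, "
              f"NTSC {t_ntsc:.3f} s, PAL {t_pal:.3f} s")
```

The same ratio explains the percentages used in the list: a per-frame speed has to be 60/50 = 1.2 times larger to cover the same distance per second (the "20 percent faster"), while a delay counted in frames has to shrink by about 17% to last the same real time.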
tepples wrote: If GameTimerExpiredFlag is turned on before calling ForceInjury, a byte is saved.
Yes, looks like it could be turned into a tail call. (Actually I'm not sure that doing that would save any ROM.) There are many, many optimizations like that which could be made. I suspect that they were just slightly over the size of NROM and optimized enough to make it fit (after storing some data in CHR).
Most other games just fixed the music and raster effects, or didn't bother to fix anything at all.
Movax12 wrote: I would speculate they did have to optimise just enough to get it fit. I say that because the PRG ROM uses every byte.
Ouch. How much of that is program and how much is data, by the way? (curious)