Nintendo Jump tables

Post by Movax12 » Tue May 29, 2012 10:26 pm

This is from SMB and is pretty smart:

Code:

OperModeExecutionTree:
      lda OperMode     ;this is the heart of the entire program,
      jsr JumpEngine   ;most of what goes on starts here

      .dw TitleScreenMode  ; <-- *
      .dw GameMode
      .dw VictoryMode
      .dw GameOverMode

;....
;later somewhere else in ROM:

;$04 - low byte of the return address (points at the table)
;$05 - high byte of the return address
;$06 - low byte of the jump target
;$07 - high byte of the jump target

JumpEngine:
       asl          ;A * 2: each table entry is a 2-byte address
       tay
       pla          ;pull the return address pushed by JSR (low byte first)
       sta $04      ;and keep it as an indirect pointer
       pla
       sta $05
       iny          ;JSR pushes the address of its own last byte, so skip 1
       lda ($04),y  ;load the low byte of the table entry
       sta $06
       iny
       lda ($04),y  ;load the high byte of the table entry
       sta $07
       jmp ($06)    ;jump to the entry; an RTS there returns to whoever
                    ;called the routine that did JSR JumpEngine

This is an interesting solution, and it was used in other games too. But is using the stack really better than putting a label on the pointer table (where I marked the '*') and referencing that? For example, you could load X and Y with the high and low bytes of the label's address and then call the jump engine. Or is there a big advantage I am missing?

(Note that there are multiple blocks of code that call JumpEngine, not just OperModeExecutionTree.)

Post by Memblers » Tue May 29, 2012 11:55 pm

You could replace the PLA instructions with "LDA table,y" and it would be the same for that one example by itself. But using the stack like that allows a unique table at every place JumpEngine is called from. That suits SMB because it reuses the same routine everywhere and SMB has no PRG-ROM left. Even some PRG data is stored in CHR and read out manually.
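
To make the one-table-per-call-site point concrete, here is a rough Python model of the trick (not NES code). The addresses, the `mem` dictionary and the helper names are invented for the illustration; only the indexing follows JumpEngine as posted above.

```python
# Toy model of the JSR + JumpEngine trick (Python, not NES code).
# All addresses and helper names here are made up for illustration.

def jsr_return_address(jsr_addr):
    """JSR is 3 bytes long; the 6502 pushes the address of its last byte."""
    return jsr_addr + 2

def jump_engine(mem, return_addr, a):
    """Mimic JumpEngine: index the table that sits right after the JSR."""
    y = (a * 2 + 1) & 0xFF          # asl / tay / iny
    lo = mem[return_addr + y]       # lda ($04),y
    hi = mem[return_addr + y + 1]   # iny / lda ($04),y
    return (hi << 8) | lo           # the address jmp ($06) would go to

def place_call_site(mem, jsr_addr, targets):
    """Store 16-bit little-endian targets directly after a 3-byte JSR."""
    table = jsr_addr + 3
    for i, target in enumerate(targets):
        mem[table + 2 * i] = target & 0xFF
        mem[table + 2 * i + 1] = target >> 8

mem = {}
place_call_site(mem, 0x8000, [0x9000, 0x9100, 0x9200, 0x9300])  # one caller
place_call_site(mem, 0x8100, [0xA000, 0xA100])                   # another caller

# Same routine, same index, different table: the JSR itself selects the table.
first = jump_engine(mem, jsr_return_address(0x8000), 1)
second = jump_engine(mem, jsr_return_address(0x8100), 1)
print(hex(first), hex(second))
```

The same `jump_engine` call picks a different target depending only on where the JSR was, which is what lets every caller carry its own table without passing a pointer.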

As for having X and Y contain the high/low bytes of the label, it seems like it'd be better not to put it there but into zero page, which leaves nothing for the jump engine to do except jump.

Post by Movax12 » Wed May 30, 2012 5:03 am

Actually, I thought about it more. Keeping the same structure and using X and Y (or zero page) for the table pointer is okay, but then you would need two RTS instructions, or still need PLA, PLA, RTS, to return to the code that first called into the jump table, so it might as well be done this way.
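
The stack bookkeeping behind that conclusion can be sketched with a toy model. This is Python, not 6502: strings stand in for return addresses, and both the "main loop" caller and the X/Y variant are assumptions made up for the comparison.

```python
# Toy model of the 6502 stack (Python, not NES code). Strings stand in for
# the 16-bit return addresses that JSR pushes and RTS pulls.

def stack_dispatch():
    """SMB style: JumpEngine pulls its own return address, then JMPs."""
    stack = ["main loop"]                 # someone did JSR OperModeExecutionTree
    stack.append("table after the JSR")   # JSR JumpEngine
    table = stack.pop()                   # PLA / PLA: consumed as the table pointer
    # JMP (target) pushes nothing, so the handler's RTS pulls the outer address.
    rts_targets = [stack.pop()]
    return table, rts_targets

def register_dispatch():
    """Hypothetical variant: table address in X/Y, stack left untouched."""
    stack = ["main loop"]                 # someone did JSR OperModeExecutionTree
    stack.append("after JSR JumpEngine")  # JSR JumpEngine
    # The handler's RTS returns into OperModeExecutionTree, which then needs
    # an RTS of its own (or the handler does PLA, PLA, RTS instead).
    rts_targets = [stack.pop(), stack.pop()]
    return rts_targets

print(stack_dispatch())
print(register_dispatch())
```

In the first case a single RTS gets back to the outer caller; in the second it takes two, which is the extra cost described above.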

Post by mcmartin » Wed May 30, 2012 7:08 pm

This is a fun trick - I associate it with OO-style method calls, myself. I first encountered it in Gradius, which has longer code but preserves the X and Y registers at the same cost of 4 bytes of RAM:

Code:

        asl                     ; A = A * 2 (two bytes per table entry)
        stx     $9B             ; Cache X and Y
        sty     $9A
        tay
        iny                     ; Y = 2 * index + 1
        pla                     ; Put the return address pushed by JSR
        sta     $98             ; in $98-$99
        pla
        sta     $99
        lda     ($98), y        ; Y is the offset of the selected address
        tax                     ; after the caller's JSR.  Read that
        iny                     ; address...
        lda     ($98), y
        sta     $99             ; And put it in $98-$99.
        stx     $98
        ldy     $9A             ; Restore arguments
        ldx     $9B
        jmp     ($0098)         ; Then jump there.

The calling convention is otherwise identical. I didn't get much further in my disassembly of Gradius, but that trick alone was worth the price of admission.

Post by lidnariq » Wed May 30, 2012 8:49 pm

The exact same instructions from the SMB disassembly are also in Galaxian (with different addresses).

Post by Movax12 » Wed May 30, 2012 9:41 pm

Apparently this style of code is in many NES games. Metroid has the code that preserves X and Y. My question was basically whether this is really a good solution, or whether whoever coded it was outsmarting themselves with cleverness, but I suppose it is a decent way to solve that problem.

Post by strat » Thu May 31, 2012 12:07 am

The Super Mario Bros. style of dynamic jumping is repeated in Game Boy games, only the Z80 allows the actual jumping to be done with registers.

From Balloon Kid:

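Code:

048A:
add a        ; A * 2: each table entry is 2 bytes
pop hl       ; HL = return address = start of the table after the CALL
ld e,a
ld d,00h     ; DE = offset into the table
add hl,de    ; HL -> selected entry
ld e,(hl)    ; low byte of the target
inc hl
ld d,(hl)    ; high byte of the target
push de
pop hl       ; HL = DE, copied through the stack
ld pc,hl     ; jump to the target (usually written jp (hl))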

Post by Shiru » Thu May 31, 2012 4:47 am

Strange. Why did they use that slow push/pop sequence instead of this:

Code:

 ...
 ld a,(hl)
 inc hl
 ld h,(hl)
 ld l,a
 jp (hl)

Last edited by Shiru on Thu May 31, 2012 5:18 am, edited 1 time in total.
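
To put a rough number on "slow", the two endings can be tallied in Python. The cycle counts are the T-cycle figures commonly listed in Game Boy opcode tables; treat them as an assumption, since a stock Z80 has different timings.

```python
# T-cycles per instruction on the Game Boy CPU, as given in common opcode
# tables (an assumption; a stock Z80 uses different counts).
CYCLES = {
    "ld e,(hl)": 8, "ld d,(hl)": 8, "ld a,(hl)": 8, "ld h,(hl)": 8,
    "inc hl": 8, "push de": 16, "pop hl": 12, "ld l,a": 4, "jp (hl)": 4,
}

# Tail of the Balloon Kid routine, from the table read to the jump.
balloon_kid_tail = ["ld e,(hl)", "inc hl", "ld d,(hl)", "push de", "pop hl", "jp (hl)"]
# The replacement sequence suggested in this post.
shiru_tail = ["ld a,(hl)", "inc hl", "ld h,(hl)", "ld l,a", "jp (hl)"]

def total_cycles(sequence):
    """Sum the cycle cost of a list of instruction mnemonics."""
    return sum(CYCLES[op] for op in sequence)

print(total_cycles(balloon_kid_tail), total_cycles(shiru_tail))
```

Every instruction involved is one byte, so the shorter sequence also saves a byte.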

Post by smkd » Thu May 31, 2012 5:17 am

Movax12 wrote:Apparently this style of code is in many NES games. Metroid has the code that preserves X and Y. My question was basically whether this is really a good solution, or whether whoever coded it was outsmarting themselves with cleverness, but I suppose it is a decent way to solve that problem.

It's not the fastest way to do it, but it's really compact. Passing a pointer while also jumping to the dispatch code in only 3 bytes is pretty good. Memblers makes a good point about PRG space being starved. They would have been doing everything they could think of to save as much space as possible.

This appears in SNES games too. SMW uses the same trick, although it's 24-bit instead. You just have JSL instead of JSR, LDA [$xx],y instead of LDA ($xx),y, etc.
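
For a rough sense of the size difference, the per-call-site cost can be tallied from the standard 6502 instruction lengths. The two alternative call sequences below are assumptions about how the X/Y and zero-page variants discussed earlier in the thread might be written; nobody posted them.

```python
# Instruction lengths in bytes for the 6502 addressing modes used here.
SIZE = {"jsr abs": 3, "ldx #imm": 2, "ldy #imm": 2, "lda #imm": 2, "sta zp": 2}

call_sites = {
    # SMB style: the table address is implied by the JSR's return address.
    "stack trick": ["jsr abs"],
    # Hypothetical: pass the table address in X and Y.
    "pointer in X/Y": ["ldx #imm", "ldy #imm", "jsr abs"],
    # Hypothetical: store the table address in zero page first.
    "pointer in zero page": ["lda #imm", "sta zp", "lda #imm", "sta zp", "jsr abs"],
}

def bytes_per_call(name):
    """Bytes spent at one call site, not counting the table itself."""
    return sum(SIZE[op] for op in call_sites[name])

for name in call_sites:
    print(name, bytes_per_call(name))
```

The saving repeats at every call site, while the dispatch routine itself is paid for once.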

Post by strat » Thu May 31, 2012 11:34 pm

Shiru: That's nothing. All graphics tiles in Balloon Kid are visible in a tile editor - in contrast, very few SNES games, not even Super Mario World, can be seen that way - and apparently the first screen of each stage (the first one at least) is also stored uncompressed. I plan on going back to disassembling it, and it's going to be a letdown if they used that 128k space to just store uncompressed level data.

Post by strat » Thu May 31, 2012 11:49 pm

Shiru wrote:Strange. Why did they use that slow push/pop sequence instead of this:

Code:

 ...
 ld a,(hl)
 inc hl
 ld h,(hl)
 ld l,a
 jp (hl)

Just for grins I swapped in that instruction sequence.

Maybe the programmer didn't think an indirect load from HL into H would work. It looks like, at least with these early games, the programmers didn't know everything about the chips they coded for. The sprite 0 hit check in SMB doesn't know that 'bit' sets the V flag from bit 6 of its operand. And whoever reverse-engineered Metroid made fun of the NMI handler for saving the processor status.
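
For anyone unfamiliar with the BIT behavior mentioned here, a small Python model may help. It is a sketch of the documented flag rules, not emulator code, and `bit_flags` is a made-up name.

```python
def bit_flags(a, m):
    """Flags after 6502 BIT: Z from A AND M, N from bit 7 of M, V from bit 6 of M."""
    return {"Z": (a & m) == 0, "N": bool(m & 0x80), "V": bool(m & 0x40)}

# PPU status ($2002) reports sprite 0 hit in bit 6 and vblank in bit 7,
# so after BIT $2002 the V flag mirrors the sprite 0 hit bit regardless of A.
hit = bit_flags(0x00, 0x40)
no_hit = bit_flags(0x00, 0x80)
print(hit["V"], no_hit["V"])
```

So a BIT on the status register followed by a branch on V is enough to test for the hit, with no need to load and mask the value in A.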

Post by tepples » Fri Jun 01, 2012 3:51 am

strat wrote:Shiru: That's nothing. All graphics tiles in Balloon Kid are visible in a tile editor - in contrast, very few SNES games, not even Super Mario World, can be seen that way

Super Mario Kart object graphics are uncompressed, but then it uses Battletoads-style sprite cel copying. Super Mario All-Stars tiles are uncompressed, but I guess it too needs them uncompressed to simulate an MMC3's CHR bankswitching with DMA to VRAM.

strat wrote:- and apparently the first screen of each stage (the first one at least) is also stored uncompressed. I plan on going back to disassembling it and it's going to be a let down if they used that 128k space to just store uncompressed level data.

(Excuse me for the apologetics; I was a big fan of Balloon Kid at one time.)

Mask ROM fabrication rounds up the ROM size to a power of two. If a game is 128 KiB uncompressed or 68 KiB compressed, and you lack ideas for bonus minigames to fill the extra space, why waste effort on compression? That's why I didn't compress the 3.5 KiB of scripts in the cut scenes of Thwaite: it wouldn't have saved enough to let me add the things I wanted to add while keeping it NROM-128 should I ever get around to making version 0.04.

strat wrote:The Sprite 0 hit in SMB doesn't know 'bit' changes the V flag.

Apart from the 6502's famous die-space efficiency, one reason Nintendo chose it is that it was an unfamiliar chip (or "stone"), as 8080 family CPUs were more popular in Japan at the time than the 6502 used in Apple, Commodore, and Atari products. See page 2 of this interview.
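
The mask ROM rounding mentioned above is easy to check with a few lines of Python. This is only an illustration of the power-of-two rule; `mask_rom_size_kib` is a made-up helper, and the sizes are the 68 and 128 KiB figures from the example.

```python
def mask_rom_size_kib(used_kib):
    """Smallest power-of-two ROM size, in KiB, that holds used_kib."""
    size = 1
    while size < used_kib:
        size *= 2
    return size

# Compressed or not, the example game ends up on the same size of chip.
print(mask_rom_size_kib(68), mask_rom_size_kib(128))
```

Compressing from 128 KiB down to 68 KiB buys no smaller ROM; the data would have to drop to 64 KiB or less before the chip size changes.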

Post by strat » Fri Jun 01, 2012 11:43 am

That's really interesting, because I was also wondering why the Famicom didn't just use the same CPU as Donkey Kong. (Too bad they didn't go with the 65C02, if that was even out yet.)

"Normally, in porting Donkey Kong, the quickest way would have been to use the CPU in the arcade version. But Ricoh wanted us to use the 6502, which they had the license for. When I said I wanted to use the 6502 at Nintendo, the staff told me that I make such decisions because I didn’t make video games."

Post by 3gengames » Fri Jun 01, 2012 1:45 pm

^ Then the "game makers" ran with it, because in the long run it would help: the Z80 sucked at cycle efficiency and programming ease. :P

Post by tepples » Fri Jun 01, 2012 3:31 pm

But then the 6502 needed faster memory that responded within a half cycle, while the Z80 allowed a cycle and a half. This allowed Z80s to be clocked faster with the same spec memory chips, making up for the lower cycle efficiency. Compare a 1.8 MHz Ricoh 6502 clone (NES) to a 4.2 MHz Sharp 8080 clone with some Z80 features (Game Boy).
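
As a back-of-the-envelope check, the memory access windows implied by those figures can be computed directly. This sketch takes the clock speeds and cycle fractions from the post at face value and assumes the Game Boy CPU gets the same cycle and a half as a Z80, which the post implies but does not state; `access_window_ns` is a made-up helper.

```python
def access_window_ns(clock_mhz, cycles_allowed):
    """Time in nanoseconds that memory has to respond."""
    return cycles_allowed / clock_mhz * 1000.0

nes_window = access_window_ns(1.8, 0.5)   # half a cycle at 1.8 MHz
gb_window = access_window_ns(4.2, 1.5)    # a cycle and a half at 4.2 MHz
print(round(nes_window), round(gb_window))
```

Under those assumptions the faster-clocked chip still leaves memory more time to respond, which is the point being made about using the same spec memory chips.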
