Koei bytecode

AWJ
Posts: 433
Joined: Mon Nov 10, 2008 3:09 pm

Re: Koei bytecode

Post by AWJ »

Oziphantom wrote: Why not look at and extract the 16-bit KOEI titles to see if they already have the above in 16-bit?
Apparently their early SNES titles (e.g. Romance of the Three Kingdoms II) do use the same bytecode. Because the virtual machine has no support for an address space larger than 16 bits, they have to constantly copy bytecode and data overlays into RAM, simulating a bankswitched 64K machine.

I had a very brief look at a couple of later SNES Koei games (Uncharted Waters 2 and Genghis Khan 2) and they both look like compiler-generated 65816 code (tons of stack-relative addressing, etc.). So it looks like Koei switched to a different compiler at some point, one that produced native 65816 rather than bytecode. It's a bit interesting that the Famicom version of Genghis Khan 2 (Aoki Ookami to Shiroki Mejika: Genchou Hishi) is interpreted while the SNES version is native 65816.
AWJ
Posts: 433
Joined: Mon Nov 10, 2008 3:09 pm

Re: Koei bytecode

Post by AWJ »

I'm now investigating the assembly language initialization/NMI/syscall code, or "BIOS" as Koei apparently called it. In the MMC5 games there appear to be three major versions of the BIOS: one used in Nobunaga's Ambition 2, Bandit Kings, Ishin no Arashi and Rot3K2 (with minor changes per game); one used in Uncharted Waters and L'Empereur; and a final version used in Gemfire, Genghis Khan 2 and Nobunaga's Ambition 3 (aka Lord of Darkness on the SNES and Genesis).

The US versions of Nobunaga's Ambition 2, Bandit Kings and Rot3K2 are actually kind of hybrids between BIOS version 1 and 2: the RAM addresses they use are version 1 (a ton of variables got shuffled in and out of zero page between the versions--I guess someone did some profiling :)) but a number of code changes from version 2 were backported in. The backported changes appear to be mainly sound related--version 1 uses the MMC5 channels exclusively for melodic sound effects, but of course those channels aren't usable on a stock NES, so they had to backport the version 2 sound engine which can share channels between BGM and effects in order to move those sound effects to the built-in APU channels.
CaH4e3
Posts: 71
Joined: Thu Oct 13, 2005 10:39 am

Re: Koei bytecode

Post by CaH4e3 »

Oh, lol... by coincidence I've been doing the same RE work for a couple of weeks already, but I hadn't seen this topic until now...
I was going to write some kind of documentation too, but it seems that's not needed anymore. The info here is already complete, better than I could write myself. I even picked up some fixes for my custom disasm scripts here...

As a side note, I'd like to mention that KOEI's C compiler (strong evidence that this is actually C code is here: https://tcrf.net/Aerobiz_Supersonic_(SNES)) has been able to produce native 6502 code as well as bytecode since at least 1988. It's not only the SNES version of Uncharted Waters 2 that has this auto-generated native code, but almost every NES title. Most of the native code you see in KOEI's NES games is auto-generated, except the interpreter itself, the BIOS and some helpers. The rest is surely auto-generated, maybe with some later manual polishing... It looks like they wanted to make some portions of the code run faster, or even did some profiling to find the most frequently used functions. It seems they just couldn't compile all the code to native 6502 since it takes a lot more space. The game with the most auto-generated native code is L'Empereur, with almost half of its scripts compiled as native code.

If anyone is interested, I can provide some practical tools to decompile these scripts (IDA loader + scripts). I doubt I will recreate any kind of KOEI compiler, but at least I can provide a way to reassemble the existing code with all links and functions. Using some asm macros, you could rewrite all the bytecode as nice-looking, readable and editable sources. Maybe useful for those who want to translate some Japanese-only games (I've already found secrets and unused debug features, lol, which is the main reason I started doing all this myself)...
AWJ
Posts: 433
Joined: Mon Nov 10, 2008 3:09 pm

Re: Koei bytecode

Post by AWJ »

CaH4e3 wrote: If anyone is interested, I can provide some practical tools to decompile these scripts (IDA loader + scripts). I doubt I will recreate any kind of KOEI compiler, but at least I can provide a way to reassemble the existing code with all links and functions. Using some asm macros, you could rewrite all the bytecode as nice-looking, readable and editable sources. Maybe useful for those who want to translate some Japanese-only games (I've already found secrets and unused debug features, lol, which is the main reason I started doing all this myself)...
I don't own or use IDA (I've written my own extensible disassembler instead) but I'm interested in seeing your decompiler script anyway.

Did you discover if the apparently redundant bytes in the stack frame are used for anything?
CaH4e3
Posts: 71
Joined: Thu Oct 13, 2005 10:39 am

Re: Koei bytecode

Post by CaH4e3 »

So, the scripts are mostly useless for you without IDA. They depend heavily on my other custom disasm scripts. But I may try to make a demo of what they do and what I can do with them...

And no, I haven't dug deep enough into the stack to check if there are any oddities ;) I'm sure any oddities are just genuine "bugs" or redundancy in the compiler. The bioscall always copies 22 bytes of stack into the arguments buffer, even if there are only one or two args (so maybe that's why they made some fast versions of the bios call that use only the first arg, to switch banks or something...)
AWJ
Posts: 433
Joined: Mon Nov 10, 2008 3:09 pm

Re: Koei bytecode

Post by AWJ »

CaH4e3 wrote: So, the scripts are mostly useless for you without IDA. They depend heavily on my other custom disasm scripts. But I may try to make a demo of what they do and what I can do with them...

And no, I haven't dug deep enough into the stack to check if there are any oddities ;) I'm sure any oddities are just genuine "bugs" or redundancy in the compiler. The bioscall always copies 22 bytes of stack into the arguments buffer, even if there are only one or two args (so maybe that's why they made some fast versions of the bios call that use only the first arg, to switch banks or something...)
Even if I can't run your scripts without IDA, I can port them to my own disassembler :)

Well, I've found one use for the "hole" at frameptr+8. Your tip that there was also compiler-generated native code was just what I needed. Take a look at the routine at 1F/E24B in Ishin no Arashi, 1F/E37A in Rot3K2 or 1F/E2E3 in L'Empereur (the same routine exists in BKoAC and probably most/all of the games, but it's not always in the fixed bank. I guess the games that have it in the fixed bank are the ones with the most compiler-generated native code)

The purpose of this routine seems to be to create a bytecode-compatible stack frame for a native-code function, and to clean up that stack frame when that function exits (it does some 6502 stack manipulation to wedge its cleanup code after the RTS of the "wrapped" function). Anyway, this wrapper routine uses the byte at frameptr+8 to store the number of bytes it has copied from $0080 to the bytecode stack, in order to restore that number of bytes back to $0080 when the wrapped native-code function exits.

It looks like compiler-generated native code uses $0080~ as local variable storage, and each native-code function uses this wrapper to preserve its caller's local variables (if any) and to allocate stack space for the arguments it needs to pass to any functions that it calls.
CaH4e3
Posts: 71
Joined: Thu Oct 13, 2005 10:39 am

Re: Koei bytecode

Post by CaH4e3 »

AWJ wrote: Even if I can't run your scripts without IDA, I can port them to my own disassembler :)
enjoy https://pastebin.com/sMW8wLXS
I commented some portions of the code; the rest is my own or system library functions...

All NES KOEI games have a full list of bytecode procedures and can be disassembled automatically using my scripts and IDA ;). But the offsets are for my own loader; they have no correlation with the real offsets except in the lower bits...

Some opcodes that I didn't reverse deeply and only pinned down briefly, I took from your description (comments are given). Bitfield opcodes are never used in the NES bytecode (as far as I can see), and neither are the relative branches... so I just haven't bothered with them for now..
AWJ wrote:Well, I've found one use for the "hole" at frameptr+8. Your tip that there was also compiler-generated native code was just what I needed. Take a look at the routine at 1F/E24B in Ishin no Arashi, 1F/E37A in Rot3K2 or 1F/E2E3 in L'Empereur (the same routine exists in BKoAC and probably most/all of the games, but it's not always in the fixed bank. I guess the games that have it in the fixed bank are the ones with the most compiler-generated native code)

The purpose of this routine seems to be to create a bytecode-compatible stack frame for a native-code function, and to clean up that stack frame when that function exits (it does some 6502 stack manipulation to wedge its cleanup code after the RTS of the "wrapped" function). Anyway, this wrapper routine uses the byte at frameptr+8 to store the number of bytes it has copied from $0080 to the bytecode stack, in order to restore that number of bytes back to $0080 when the wrapped native-code function exits.

It looks like compiler-generated native code uses $0080~ as local variable storage, and each native-code function uses this wrapper to preserve its caller's local variables (if any) and to allocate stack space for the arguments it needs to pass to any functions that it calls.
This is actually a native version of the bytecode exec procedure. It is used at the start of every native procedure, the same way as for bytecode procedures, but instead this function executes real 6502 asm and then returns to the caller routine. It does the same thing as for bytecode procedures: it prepares the stack, then executes native code instead of bytecode. It takes the same 16-bit signed value to allocate the procedure's local variables buffer, but has an extra parameter byte, which is used to back up a number of VM stack bytes to the $80 buffer. That byte gets stored at your stack offset 8. But I haven't seen any function use any value other than 0 here. L'Empereur and Uncharted Waters definitely never use this byte; it's always 0, so the copy to the $80 buffer is never used there. I doubt it's used anywhere else... so this is surely a redundant leftover of some planned but never used compiler feature for saving some stack parameters..
AWJ
Posts: 433
Joined: Mon Nov 10, 2008 3:09 pm

Re: Koei bytecode

Post by AWJ »

Thanks to your script I see what the return address at frameptr+9 is used for, too. It isn't used by the bytecode interpreter but it is used by the native-code function call wrapper (the routine that's approximately equivalent to bytecode opcode $E9). I guess they wanted to make the stack frames identical between bytecode and native-code functions to make debugging easier.

A funny thing about that routine is that the "stack adjust amount" parameter that it takes includes the return address! So even function calls with no arguments need to use a value of 2.
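
To make the arithmetic concrete, here is a minimal sketch of how that "stack adjust amount" would be computed, assuming the only contributions are the 2-byte return address and the argument bytes. The function name and the idea of passing argument sizes as a list are made up for illustration; nothing here is taken from the games' code.

```python
RETURN_ADDRESS_SIZE = 2  # bytes; the adjust amount includes the return address


def stack_adjust_amount(arg_sizes):
    """Hypothetical helper: bytes to adjust the stack by for one call.

    arg_sizes is a list of argument sizes in bytes (an assumption made
    for illustration; the real compiler's bookkeeping is unknown).
    """
    return RETURN_ADDRESS_SIZE + sum(arg_sizes)


print(stack_adjust_amount([]))      # call with no arguments
print(stack_adjust_amount([2, 2]))  # call with two word-sized arguments
```

Under that assumption a call with no arguments still gets 2, which matches what the wrapper expects.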
CaH4e3 wrote: This is actually a native version of the bytecode exec procedure. It is used at the start of every native procedure, the same way as for bytecode procedures, but instead this function executes real 6502 asm and then returns to the caller routine. It does the same thing as for bytecode procedures: it prepares the stack, then executes native code instead of bytecode. It takes the same 16-bit signed value to allocate the procedure's local variables buffer, but has an extra parameter byte, which is used to back up a number of VM stack bytes to the $80 buffer. That byte gets stored at your stack offset 8. But I haven't seen any function use any value other than 0 here. L'Empereur and Uncharted Waters definitely never use this byte; it's always 0, so the copy to the $80 buffer is never used there. I doubt it's used anywhere else... so this is surely a redundant leftover of some planned but never used compiler feature for saving some stack parameters..
It's actually the other way around: on function entry it copies n bytes from $80 to the stack, and on function exit it copies them from the stack back to $80.

I thought that feature might be used for local variables that are pointers (and thus have to be in zero page) but I guess the compiler-generated native code just copies pointer variables from the stack to scratch space (e.g. $08-$14) upon use, rather than keeping them in zero page for the entirety of a function?
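
For anyone following along, the save/restore behaviour can be modelled roughly like this. It is only a sketch of the idea (copy n bytes from $80 to the bytecode stack on entry, copy them back on exit), not a transcription of the 6502 routine: the function name, the use of a Python list as the "stack", and storing n alongside the saved bytes (standing in for the byte at frameptr+8) are all assumptions.

```python
FAST_LOCALS_BASE = 0x80  # zero-page start of the fast locals (from the thread)


def call_wrapped(zero_page, vm_stack, n, native_fn):
    """Hypothetical model of the wrapper: save n bytes, run, restore.

    zero_page is a 256-byte bytearray, vm_stack a plain list standing in
    for the bytecode stack, native_fn the "wrapped" native function.
    """
    saved = bytes(zero_page[FAST_LOCALS_BASE:FAST_LOCALS_BASE + n])
    vm_stack.append((n, saved))  # n plays the role of the byte at frameptr+8
    try:
        return native_fn(zero_page)
    finally:
        # Cleanup runs after the wrapped function returns, like the code
        # wedged in after its RTS.
        count, data = vm_stack.pop()
        zero_page[FAST_LOCALS_BASE:FAST_LOCALS_BASE + count] = data


def callee(zp):
    # A callee that freely overwrites $80-$87 with its own locals.
    for addr in range(0x80, 0x88):
        zp[addr] = 0xAA
    return 7


zp = bytearray(256)
zp[0x80:0x84] = bytes([1, 2, 3, 4])  # caller's fast locals
result = call_wrapped(zp, [], 4, callee)
print(result, list(zp[0x80:0x88]))
```

With n = 4 the caller gets its four bytes at $80-$83 back, while anything the callee wrote above that range is left as the callee wrote it.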

I've reverse engineered the entire MMC5 BIOS, all three versions (except a couple of version-1-only syscalls that I can't figure out and that might actually be incomplete/nonfunctional). Are you interested in a description or have you already done it yourself? I've also reverse engineered the sound program if anyone is interested (the one used in Mahjong Taikai, Famicom Top Management and all the MMC5 games--the first three MMC1 games have a totally different sound program which I've barely looked at).
CaH4e3
Posts: 71
Joined: Thu Oct 13, 2005 10:39 am

Re: Koei bytecode

Post by CaH4e3 »

AWJ wrote:It's actually the other way around: on function entry it copies n bytes from $80 to the stack, and on function exit it copies them from the stack back to $80.

I thought that feature might be used for local variables that are pointers (and thus have to be in zero page) but I guess the compiler-generated native code just copies pointer variables from the stack to scratch space (e.g. $08-$14) upon use, rather than keeping them in zero page for the entirety of a function?
Yeah, you're right. But it seems more like an attempt to have a separate set of bytecode registers for native procedures, which overwrites the ones from the procedural stack. If used, they could run two separate native procedures with the same set of registers... maybe for cases when some other in-between procedure changes them somehow... but anyway, it's never used.
AWJ wrote: I've reverse engineered the entire MMC5 BIOS, all three versions (except a couple of version-1-only syscalls that I can't figure out and that might actually be incomplete/nonfunctional). Are you interested in a description or have you already done it yourself? I've also reverse engineered the sound program if anyone is interested (the one used in Mahjong Taikai, Famicom Top Management and all the MMC5 games--the first three MMC1 games have a totally different sound program which I've barely looked at).
I nailed down some functions for some games of personal interest to me, but maybe someone else would like to see it too ;)
AWJ
Posts: 433
Joined: Mon Nov 10, 2008 3:09 pm

Re: Koei bytecode

Post by AWJ »

After disassembling some native-code routines in Ishin no Arashi, I understand some things I didn't understand before. In the compiler-generated native code, zero page address $06-$07 (which in the interpreter is the instruction pointer) is used as a second frame pointer instead, and contains the value frameptr - fastlocals - 256. The "- 256" is because the 6502 doesn't support indexed addressing with a negative index. Local variables on the stack are addressed using ($06),Y where Y = #$FE for the first (word-sized) local, #$FC for the second local, etc. I was wondering why the stack frame setup did DEC $07 before jumping to the wrapped code and INC $07 after returning from it, but now I see why.
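
A worked example of that address arithmetic, as a sketch: the frame pointer value and the fastlocals count below are made-up numbers, and the helper names are hypothetical. Only the formula (frameptr - fastlocals - 256) and the Y values (#$FE, #$FC, ...) come from the disassembly.

```python
def second_frame_pointer(frameptr, fastlocals):
    """Value kept in $06-$07 by native code: frameptr - fastlocals - 256."""
    return (frameptr - fastlocals - 256) & 0xFFFF


def local_address(frameptr, fastlocals, index):
    """Effective address of ($06),Y for the index-th word-sized local.

    Y is #$FE for local 0, #$FC for local 1, and so on. Because the 6502
    can only add a non-negative Y, the base is biased down by 256 so that
    base + Y lands just below the frame.
    """
    y = 0xFE - 2 * index
    return (second_frame_pointer(frameptr, fastlocals) + y) & 0xFFFF


frameptr = 0x0600   # made-up value for illustration
fastlocals = 4      # made-up value for illustration
print(hex(second_frame_pointer(frameptr, fastlocals)))
print(hex(local_address(frameptr, fastlocals, 0)))
print(hex(local_address(frameptr, fastlocals, 1)))
```

So the first local ends up at frameptr - fastlocals - 2 and the second at frameptr - fastlocals - 4. The "- 256" is just a decrement of the high byte, which is what the DEC $07 / INC $07 pair does.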

ETA: Bandit Kings of Ancient China does use the "fast local variables at $80" feature; here's an example of a function that uses it (output from my disassembler):

Code:

14/AAE0: 20 99 B1      jsr CreateStackFrame
14/AAE3: 04 26 FF      frame 4,-218
14/AAE6: A0 0B         ldy #$0B
14/AAE8: B1 04         lda (frameptr),y
14/AAEA: 85 80         sta $80
14/AAEC: C8            iny
14/AAED: B1 04         lda (frameptr),y
14/AAEF: 85 81         sta $81
14/AAF1: C8            iny
14/AAF2: B1 04         lda (frameptr),y
14/AAF4: 85 82         sta $82
14/AAF6: C8            iny
14/AAF7: B1 04         lda (frameptr),y
14/AAF9: 85 83         sta $83
14/AAFB: 8A            txa
14/AAFC: 48            pha
14/AAFD: 48            pha
14/AAFE: 48            pha
(snip...)
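
In case it helps anyone writing their own disassembler, here is a small sketch of how the three inline bytes after the jsr (04 26 FF in the listing above) decode to "frame 4,-218". It assumes, based only on this one listing, that the operands are one unsigned byte followed by a little-endian signed 16-bit word; the function name is hypothetical.

```python
import struct


def decode_frame_operands(data):
    """Decode the inline operands that follow jsr CreateStackFrame.

    Assumed layout (inferred from the listing, not confirmed elsewhere):
    byte 0 is the fast-locals byte count, bytes 1-2 are a signed 16-bit
    little-endian value.
    """
    fast_locals, frame_value = struct.unpack("<Bh", data[:3])
    return fast_locals, frame_value


print(decode_frame_operands(bytes([0x04, 0x26, 0xFF])))
```

The word $FF26 read as a signed value is -218, which agrees with the disassembler output.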
CaH4e3
Posts: 71
Joined: Thu Oct 13, 2005 10:39 am

Re: Koei bytecode

Post by CaH4e3 »

AWJ wrote: ETA: Bandit Kings of Ancient China does use the "fast local variables at $80" feature; here's an example of a function that uses it (output from my disassembler):
I see native code uses it for vars/regs as well:

Code:

BANK14:A61B          loc_20661B:                             ; CODE XREF: BANK14:A64Bj
BANK14:A61B                                                  ; BANK14:loc_20669Cj
BANK14:A61B 18                       CLC                     ; *a
BANK14:A61C A9 07                    LDA     #7              ; *a
BANK14:A61E 65 80                    ADC     byte_80         ; *a
BANK14:A620 85 80                    STA     byte_80         ; *a
BANK14:A622 8A                       TXA                     ; *a
BANK14:A623 65 81                    ADC     byte_81         ; *a
BANK14:A625 85 81                    STA     byte_81         ; *a
BANK14:A627
BANK14:A627          loc_206627:                             ; CODE XREF: BANK14:A618j
BANK14:A627 A1 80                    LDA     (byte_80,X)     ; *a
BANK14:A629 A0 01                    LDY     #1              ; *a
BANK14:A62B 11 80                    ORA     (byte_80),Y     ; *a
BANK14:A62D F0 03                    BEQ     loc_206632      ; *a
BANK14:A62F 4C 4E A6                 JMP     loc_20664E      ; *a
It seems most of the native code can be decompiled the same way as the bytecode, except maybe for some optimizations... like they just unroll all the bytecode, with native asm representing every single command, maybe with some opts.
AWJ
Posts: 433
Joined: Mon Nov 10, 2008 3:09 pm

Re: Koei bytecode

Post by AWJ »

Heehee... can you find the bugs in these implementations of the C library functions abs() and strlen()? Hints: stackptr+2 points to the first argument on the stack, and left holds the (16-bit) function return value (meaning left+2 and so forth are effectively scratch space)

Code:

abs:
    ldy #2
    lda (stackptr),y
    sta left
    iny
    lda (stackptr),y
    sta left+1
    bpl done
    eor #$FF
    sta left+1
    lda left
    eor #$FF
    clc
    adc #1
    sta left
done:
    rts

Code:

strlen:
    ldy #3
    lda (stackptr),y
    sta left+3
    dey
    lda (stackptr),y
    sta left+2
    ldy #0
    sty left
    sty left+1
loop:
    lda (left+2),y
    beq done
    iny
    bne loop
    inc left+1
    jmp loop
done:
    sty left
    rts
I'd be surprised if the abs() bug doesn't impact game logic in at least one game, most likely in something non-obvious like AI (not all the games contain all the library functions, but the games that have abs() do call it...)
CaH4e3
Posts: 71
Joined: Thu Oct 13, 2005 10:39 am

Re: Koei bytecode

Post by CaH4e3 »

Abs does not carry into the high byte of the negative value after inverting it. Strlen increments the high byte of the result when Y wraps, but not the high byte of the string pointer. But it seems they don't have many strings of 256 bytes or more, or they rarely need strlen...
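
To see the effect, here is a rough Python model of the two routines as listed above. It mirrors the instruction sequence step by step but is only a sketch: the function names are made up, memory is a flat 64K bytearray, and the step limit in the strlen model exists purely so the demonstration terminates.

```python
def koei_abs(value):
    """Model of the listed abs(): the carry out of the low byte is dropped."""
    lo = value & 0xFF
    hi = (value >> 8) & 0xFF
    if hi & 0x80:                        # bpl done is not taken
        hi ^= 0xFF                       # eor #$FF / sta left+1
        lo = ((lo ^ 0xFF) + 1) & 0xFF    # eor #$FF / clc / adc #1, carry lost
    return (hi << 8) | lo


def correct_abs(value):
    """Reference 16-bit two's-complement absolute value."""
    value &= 0xFFFF
    return (-value) & 0xFFFF if value & 0x8000 else value


def koei_strlen(memory, ptr, max_steps=100_000):
    """Model of the listed strlen(); returns None if it never terminates."""
    y = 0
    hi = 0
    for _ in range(max_steps):
        if memory[(ptr + y) & 0xFFFF] == 0:   # lda (left+2),y / beq done
            return (hi << 8) | y              # sty left, left+1 holds hi
        y = (y + 1) & 0xFF                    # iny
        if y == 0:
            hi = (hi + 1) & 0xFF              # inc left+1, pointer unchanged
    return None


mem = bytearray(0x10000)
mem[0x2000:0x2004] = b"KOEI"            # short string, terminator already 0
mem[0x3000:0x3000 + 300] = b"A" * 300   # 300-byte string

print(koei_abs(-5 & 0xFFFF), correct_abs(-5 & 0xFFFF))
print(koei_abs(-256 & 0xFFFF), correct_abs(-256 & 0xFFFF))
print(koei_strlen(mem, 0x2000), koei_strlen(mem, 0x3000))
```

In this model abs() goes wrong whenever the low byte of a negative argument is $00 (for example -256 comes back as 0), and strlen() keeps rescanning the same 256 bytes once Y wraps, so any string of 256 bytes or more never reaches its terminator.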
Optiroc
Posts: 129
Joined: Thu Feb 07, 2013 1:15 am
Location: Sweden

Re: Koei bytecode

Post by Optiroc »

I missed this thread, very interesting stuff!

I recently reverse engineered parts of Super Robot Wars Gaiden Masōkishin, and it turns out that the game is entirely orchestrated by a bytecode interpreter. I couldn’t say if the bytecode is compiled from C, but I would guess not actually.

The instruction set includes all the primitives to facilitate that (the expected load, store, goto, call, return, etc), but a lot of the opcodes do very specific game engine stuff. So the bytecode is maybe more likely compiled from some simple bespoke scripting language. I specifically dug into it to alter the behavior of music playback/queuing, so I’ve not looked at enough bytecode to say anything with certainty. If anyone is interested to look deeper into this I’d be happy to clean up my notes and collaborate on a deeper reverse engineering effort!
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Koei bytecode

Post by tepples »

Does a similar bytecode VM exist in Nobunaga's Ambition for Game Boy? A question came up in gbdev Discord about commercial-era games written in the C language, and it was claimed that Affinix's unreleased Infinity for GBC was the only one.