Okay, I haven't really sifted through the entire program/code to figure out what all is going on here, but let's talk about the code and what it's actually doing. First, MetaspriteTest.asm has this around label InfiniteLoop. Apologies for formatting mistakes, as this code uses hard tabs rather than spaces (but not consistently):
Code: Select all
InfiniteLoop:
WAI
ldy #$00
ldx #MetaspriteTable
jsr start_metasprite
...
MetaspriteTable:
.DB $FF,$FF,$00,$00,$00,$00,$00,$00,$FF,$FF,$00,$10,$00,$00,$00,$00,$00,$00
This code gets turned into the below, effectively due to run-time register sizes and so on:
Code: Select all
wai
ldy #$0000
ldx #$8405
jsr $820e
In the debugger (this took me a bit to do), the best place to set an exec breakpoint is at $8312:
Code: Select all
Disassembly:
$00/8311 CB WAI A:0000 X:0000 Y:0000 P:envmxdIZC
$00/8312 A0 00 00 LDY #$0000 A:0000 X:0000 Y:0000 P:envmxdIZC
$00/8315 A2 05 84 LDX #$8405 A:0000 X:0000 Y:0000 P:envmxdIZC
$00/8318 20 0E 82 JSR $820E [$00:820E] A:0000 X:0000 Y:0000 P:envmxdIZC
$8405 happens to be the 16-bit address of MetaspriteTable, calculated at assembly time. Let's start by asking: is that the correct address? Let's find out. And of course SNES9x won't let me copy/paste the Hex Editor portion... wonderful, so I get to type all of this in manually:
Code: Select all
008400 8D 10 21 28 60 FF FF 00 00 00 00 00 00 FF FF 00
008410 10 00 00 00 00 00 00 FF FF FF FF FF FF FF FF FF
008420 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF
That looks correct. So with that in mind, let's go look at start_metasprite and see what it's doing with the X register.
Code: Select all
start_metasprite:
php
rep #$10
sep #$30
build_metasprite:
lda $0000,x
beq metasprite_done
inx
lda $0000,x
clc
Initially this looks right, but I can already see multiple catastrophic bugs given the assumptions of the programmer vs. what the processor will do. Let's see what the real-time debugger has to say:
Code: Select all
SNES reset.
$00/823C 78 SEI A:0000 X:0000 Y:0000 P:EnvMXdIZC
$00/8312 A0 00 00 LDY #$0000 A:00FF X:0000 Y:1000 P:envMxdiZC
$00/8315 A2 05 84 LDX #$8405 A:00FF X:0000 Y:0000 P:envMxdiZC
$00/8318 20 0E 82 JSR $820E [$00:820E] A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820E 08 PHP A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820F C2 10 REP #$10 A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8211 E2 30 SEP #$30 A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8213 B5 00 LDA $00,x [$00:0005] A:00FF X:0005 Y:0000 P:eNvMXdizC
$00/8215 F0 23 BEQ $23 [$823A] A:0000 X:0005 Y:0000 P:envMXdiZC
Your REP/SEP are in the wrong order. By doing REP #$10 / SEP #$30, you are setting 16-bit indexes, then setting 8-bit accumulator and 8-bit indexes. There are two ways to solve this, but one is wrong and the other is right. These are your options:
Code: Select all
rep #$30 ; A=16, X/Y=16
sep #$20 ; A=8
Or:
Code: Select all
sep #$30 ; A=8, X/Y=8
rep #$10 ; X/Y=16
Guess which one is the correct way? The first one. The 2nd one will introduce a horrible bug: you'll lose the upper byte of the X/Y index registers -- it'll be zeroed. This happens on the 65816 and ONLY with the index registers. You can swap between 16-bit and 8-bit accumulator without the full 16-bit contents being affected, but with indexes, upon going to 8-bit you lose the upper byte. In fact, that's happening with your code already. Look closely at the contents of X:
Code: Select all
$00/8211 E2 30 SEP #$30 A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8213 B5 00 LDA $00,x [$00:0005] A:00FF X:0005 Y:0000 P:eNvMXdizC
See how it goes from $8405 to $0005, all because you set 8-bit indexes?
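If you ever genuinely need 8-bit indexes for a stretch of code, here is a rough sketch of how to keep a 16-bit X alive across it. This is my own illustration, not something taken from your program; the value $8405 is just reused from the trace above.

```asm
    rep #$10        ; X/Y = 16-bit
    ldx #$8405      ; example value, same as the trace above
    phx             ; push all 16 bits of X while indexes are still 16-bit
    sep #$10        ; X/Y = 8-bit: the upper bytes of X and Y are forced to $00
    ; ... code that needs 8-bit indexes goes here ...
    rep #$10        ; X/Y = 16-bit again, but the old upper byte is gone
    plx             ; pull the saved 16-bit value: X = $8405 again
```

The push and the pull both happen while indexes are 16-bit, so two bytes go on the stack and two bytes come off. The accumulator needs no such dance, since its upper byte survives a sep #$20.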
So let's go with REP #$30 / SEP #$20 and see how things look after that:
Code: Select all
$00/8318 20 0E 82 JSR $820E [$00:820E] A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820E 08 PHP A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/820F C2 30 REP #$30 A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8211 E2 20 SEP #$20 A:00FF X:8405 Y:0000 P:eNvmxdizC
$00/8213 B5 00 LDA $00,x [$00:8405] A:00FF X:8405 Y:0000 P:eNvMxdizC
$00/8215 F0 23 BEQ $23 [$823A] A:00FF X:8405 Y:0000 P:eNvMxdizC
Much better.
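Put together, the top of the routine would look something like the sketch below. Only the REP/SEP pair changes. The metasprite_done tail shown here is an assumption on my part (I'm guessing it ends in plp / rts to balance the php), so check it against your actual source.

```asm
start_metasprite:
    php                     ; save the caller's P (register sizes included)
    rep #$30                ; clear M and X flags: A=16, X/Y=16
    sep #$20                ; set M only: A=8, X/Y stay 16
build_metasprite:
    lda $0000,x             ; X still holds the full 16-bit table address
    beq metasprite_done
    ; rest of build_metasprite unchanged
metasprite_done:
    plp                     ; assumed: restores the P saved by php above
    rts
```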
I should also point out here: you should not be using .dw $ffff,... like we discussed earlier. You are using an 8-bit accumulator, not a 16-bit accumulator (despite what you said earlier). And your build_metasprite routine is coded to use 8-bit accumulators as well. If you were to turn on 16-bit accumulator, your routine would break (look closely at where you're storing the results in SpriteBuf1 and what hard-coded math you're using there!).
I hope this has been a lesson in why having an assembler that generates proper/decent listings is VERY IMPORTANT. The fact WLA DX can't do this sanely/correctly is ridiculous.
I see other bugs in this program though, depending on how intelligent WLA DX is about knowing about addressing modes and banks, and when those crop up you're going to be crying big tears. Case in point:
lda $0000,x is currently getting assembled into $b5 $00 (LDA directpage,X). That only works because your MetaspriteTable data happens to be in bank $00, the same bank as the code that's running.
Eventually you're going to have to break outside of that (dealing with multiple banks). For example, if MetaspriteTable were in a different bank (say bank $01 or $81, which are the same thing in this memory mode), then the above code would be wrong and would misbehave at run-time. You'd be loading the wrong data, because it would come from bank $00 (where direct page is hard-coded to live) rather than from the bank B points to. In other words:
Code: Select all
rep #$10
ldx #MetaspriteTable
sep #$20
lda #$01
pha
plb
lda $0000,x
WLA DX may end up screwing you by optimising lda $0000,x into $b5 $00 (LDA directpage,X), rather than $bd $00 $00 (LDA absolute,X). The former would get you whatever bytes happened to be in bank $00 address $0000 + X, the latter would get you whatever bytes happened to be in bank $01 address $0000 + X.
The only way to "force" the assembler into knowing this is to use the .w modifier, i.e. lda $0000.w,x, which will ALWAYS assemble to the opcode plus a 16-bit absolute address ($bd $00 $00).
Alternately you could use full 24-bit addressing ("long addressing") on all of your stuff that's not explicitly in direct page, through the .l (dot-ELL) modifier. Be aware that 24-bit addressing takes up 1 more byte, and takes 1 more cycle than absolute. This added cycle can add up real fast when doing loops, which is why before many loops you'll find people changing what B points to and then using absolute addressing.
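To make that last pattern concrete, here is a rough sketch of setting B once, looping with absolute addressing, and restoring B afterwards. Treat it as an illustration only. Bank $01 is the same assumed bank as in the example above, the labels next_entry and table_done are made up, stopping on a $00 byte is just a stand-in for whatever your real loop does, and I'm assuming WLA DX accepts the .w hint on the operand.

```asm
    phb                 ; remember the caller's data bank
    sep #$20            ; A = 8-bit, so pha pushes exactly one byte
    rep #$10            ; X/Y = 16-bit
    lda #$01            ; bank holding the table (assumed for illustration)
    pha
    plb                 ; B = $01
    ldx #MetaspriteTable
next_entry:             ; hypothetical label
    lda $0000.w,x       ; forced absolute,X: reads from bank B, not bank $00
    beq table_done      ; hypothetical exit condition: stop at a $00 byte
    inx
    bra next_entry
table_done:             ; hypothetical label
    plb                 ; restore the caller's data bank
```

The phb at the top and the plb at the bottom keep the stack balanced, and the caller gets back the same B it came in with.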
Food for thought.