To teach a smug 68000 weenie how to make a 6502 scream, you might have to show him a few peephole optimizations one at a time.
Remove loads that immediately follow a store to the same address:
lda $00
clc
adc $01
sta $00 ;; remove lda $00 after store
clc
adc $02
sta $00
lda $1000 ;; copy $1000 to $03 (zero page) so it can run faster
sta $03 ;; remove lda $03 after store
clc
adc $00
sta $03
sta $1000
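For anyone who would rather see the rule than the result, here is a minimal Python sketch of this pass. The tuple format, the name `remove_load_after_store`, and the `before` listing (reconstructed by putting the removed loads back) are my own illustration, not part of the original example. It assumes straight-line code, plain RAM rather than I/O registers, and that nothing later depends on the N/Z flags the dropped `lda` would have set.

```python
def remove_load_after_store(program):
    """Drop `lda X` when the instruction right before it is `sta X`.

    program is a list of (mnemonic, operand) tuples. After `sta X`
    the accumulator already holds the value at X, so the reload is
    redundant.
    """
    out = []
    for op, arg in program:
        if op == "lda" and out and out[-1] == ("sta", arg):
            continue  # A still holds what was just stored to arg
        out.append((op, arg))
    return out


# Hypothetical "before" listing, reconstructed from the comments above.
before = [
    ("lda", "$00"), ("clc", None), ("adc", "$01"), ("sta", "$00"),
    ("lda", "$00"), ("clc", None), ("adc", "$02"), ("sta", "$00"),
    ("lda", "$1000"), ("sta", "$03"),
    ("lda", "$03"), ("clc", None), ("adc", "$00"), ("sta", "$03"),
    ("sta", "$1000"),
]

for op, arg in remove_load_after_store(before):
    print(op if arg is None else f"{op} {arg}")
```

A real assembler-level pass would also have to check that no label sits on the `lda`, since a branch landing there would arrive with a different value in A.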
Remove stores whose value is provably unused:
lda $00
clc
adc $01
clc ;; remove unused sta $00
adc $02
sta $00
lda $1000
clc
adc $00
sta $03
sta $1000
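The "provably unused" part can be sketched as a forward scan: a store is dead if the next access to that address is another store. This is only an illustration in Python, with names (`overwritten_before_read`, `remove_dead_stores`) I made up. It assumes straight-line code, plain zero-page or absolute operands (no indexed or indirect access that could alias the address), and no interrupt handler or hardware watching the location.

```python
def overwritten_before_read(program, start, addr):
    """True if addr is stored again before anything reads it."""
    for op, arg in program[start:]:
        if arg != addr:
            continue
        # The first later access decides: a store kills the old value,
        # anything else (lda, adc, ...) reads it.
        return op == "sta"
    return False  # ran off the end: code we cannot see might read it


def remove_dead_stores(program):
    """Drop every `sta X` whose value is overwritten before being read."""
    return [
        (op, arg)
        for i, (op, arg) in enumerate(program)
        if not (op == "sta" and overwritten_before_read(program, i + 1, arg))
    ]


# The listing from the previous step, as (mnemonic, operand) tuples.
step1 = [
    ("lda", "$00"), ("clc", None), ("adc", "$01"), ("sta", "$00"),
    ("clc", None), ("adc", "$02"), ("sta", "$00"),
    ("lda", "$1000"), ("sta", "$03"),
    ("clc", None), ("adc", "$00"), ("sta", "$03"),
    ("sta", "$1000"),
]

for op, arg in remove_dead_stores(step1):
    print(op if arg is None else f"{op} {arg}")
```

Note that the scan is deliberately conservative: the last `sta $00` and `sta $03` stay, because nothing inside the snippet proves they are unused.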
Addition with the carry cleared first is commutative, so the operands can be swapped:
lda $00
clc
adc $01
clc
adc $02
sta $00
lda $00 ;; group accesses to same address
clc
adc $1000
sta $03
sta $1000
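As a rough sketch of how a tool might find this swap: look for `sta X` followed by `lda Y / clc / adc X` and exchange the two operands, so the load lands right behind the store to the same address. The function name and tuple format are hypothetical, and the sketch assumes binary (non-decimal) mode and that both addresses are ordinary RAM, since the swap changes the order in which they are read.

```python
def swap_commutative_add(program):
    """Rewrite `sta X / lda Y / clc / adc X` as `sta X / lda X / clc / adc Y`."""
    out = list(program)
    for i in range(1, len(out) - 2):
        prev, load, clear, add = out[i - 1], out[i], out[i + 1], out[i + 2]
        if (
            prev[0] == "sta"
            and load[0] == "lda"
            and load[1] != prev[1]
            and clear == ("clc", None)
            and add == ("adc", prev[1])
        ):
            # Y + X == X + Y when the carry going in is clear.
            out[i] = ("lda", prev[1])
            out[i + 2] = ("adc", load[1])
    return out


# The listing from the previous step, as (mnemonic, operand) tuples.
step2 = [
    ("lda", "$00"), ("clc", None), ("adc", "$01"),
    ("clc", None), ("adc", "$02"), ("sta", "$00"),
    ("lda", "$1000"), ("clc", None), ("adc", "$00"),
    ("sta", "$03"), ("sta", "$1000"),
]

for op, arg in swap_commutative_add(step2):
    print(op if arg is None else f"{op} {arg}")
```

On its own the swap saves nothing. Its whole point is to set up the load-after-store pattern for the next step.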
This allows removing another load after a store:
lda $00
clc
adc $01
clc
adc $02
sta $00 ;; remove lda $00 after store
clc
adc $1000
sta $03
sta $1000
At this point the code is provably equivalent to what you started with, and short enough that you can repeat the unused-store analysis for $00 and $03 against whatever code follows the snippet. If it turns out neither value is read later, you end up with perfectly idiomatic 6502 assembly:
lda $00
clc
adc $01
clc
adc $02 ;; remove unused sta $00
clc
adc $1000 ;; remove unused sta $03
sta $1000
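If you don't trust the rewriting, a toy interpreter can at least spot-check it. The sketch below is my own and models only `lda`, `sta`, `clc` and `adc` in binary mode, which is all these listings use. It compares the first listing in this post with the final one on a handful of boundary values, looking only at A, the carry, and $1000, because $00 and $03 are assumed to be unused afterwards.

```python
from itertools import product


def parse(listing):
    """Turn 'lda $00' style lines into (mnemonic, operand) tuples."""
    return [tuple((line.split() + [None])[:2])
            for line in listing.strip().splitlines()]


def run(program, memory):
    """Interpret the program and return (A, carry, memory afterwards)."""
    mem = dict(memory)
    a, carry = 0, 0
    for op, arg in program:
        if op == "lda":
            a = mem[arg]
        elif op == "sta":
            mem[arg] = a
        elif op == "clc":
            carry = 0
        elif op == "adc":
            total = a + mem[arg] + carry
            a, carry = total & 0xFF, total >> 8
        else:
            raise ValueError(f"unsupported mnemonic: {op}")
    return a, carry, mem


def equivalent(p, q, samples=(0x00, 0x01, 0x7F, 0x80, 0xFF)):
    """Compare A, carry and $1000 after running p and q on sample inputs."""
    for m0, m1, m2, m3, big in product(samples, repeat=5):
        start = {"$00": m0, "$01": m1, "$02": m2, "$03": m3, "$1000": big}
        a1, c1, mem1 = run(p, start)
        a2, c2, mem2 = run(q, start)
        if (a1, c1, mem1["$1000"]) != (a2, c2, mem2["$1000"]):
            return False
    return True


FIRST = parse("""
lda $00
clc
adc $01
sta $00
clc
adc $02
sta $00
lda $1000
sta $03
clc
adc $00
sta $03
sta $1000
""")

LAST = parse("""
lda $00
clc
adc $01
clc
adc $02
clc
adc $1000
sta $1000
""")

print(equivalent(FIRST, LAST))
```

Sampling is not a proof, of course. It is just a cheap way to catch a botched step before you burn it into a ROM.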
You already know all this, but the process illustrated in this example might help someone else adapt to the 6502. Each step might produce its own "Oh!" moment.
(And now this is used as an example on wiki.superfamicom.org.)