russellsprouts wrote:
I also rewrite all of the palettes every frame using self-modifying code. I have a routine in ram which uses load immediate instructions, so all the bytes of the palettes are sent to the PPU in just over 200 cycles.
I think that self-modifying code is a little overkill for palettes, but the fact that you used it shows that you really want to do this as fast as possible, so here are a couple of optimization tips that will make palette updates even faster:
1- The NES doesn't have 32 active colors, only 25. Unless you want to use the trick where you point the VRAM address at $3F04, $3F08 or $3F0C while rendering is disabled to show those colors (something I personally can't think of many uses for in an actual game), you can cut the update time down by 14 cycles if you simply don't load those 7 bytes that don't mean anything. You still have to write something to each of those slots to advance the address, but you can reuse a value that's already in a register. Even if you do want to use that trick, you can still get rid of 4 load operations, saving you 8 cycles.
2- Start updating from $3F01 instead of $3F00. Writing color 0 to $3F00 is redundant, since it is mirrored at $3F10, which you will be writing to later anyway. This means you can save 4 more cycles by not writing a byte that will be overwritten.
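If you want to convince yourself of the 25-color count and the $3F10 mirror, here is a quick Python toy model (not NES code, just a sanity check). The function names are made up for illustration, and it assumes the usual palette RAM behavior: $3F10/$3F14/$3F18/$3F1C share storage with $3F00/$3F04/$3F08/$3F0C, and entry 0 of every palette shows the backdrop color during normal rendering.

```python
# Toy model of NES palette RAM ($3F00-$3F1F). Hypothetical helper names,
# written only to check the counts mentioned in the tips above.

def physical_index(addr):
    """Map a palette address to the storage cell it actually hits."""
    i = addr & 0x1F
    # Assumption: $3F10/$3F14/$3F18/$3F1C mirror $3F00/$3F04/$3F08/$3F0C.
    if (i & 0x13) == 0x10:
        i &= 0x0F
    return i

def displayed_cells():
    """Cells that can show up on screen during normal rendering."""
    cells = set()
    for addr in range(0x3F00, 0x3F20):
        i = physical_index(addr)
        # Entry 0 of each palette renders as the backdrop stored at $3F00,
        # so cells $04/$08/$0C are never shown while rendering is enabled.
        if (i & 0x03) == 0:
            i = 0
        cells.add(i)
    return cells

print(len(displayed_cells()))          # 25
print(hex(physical_index(0x3F10)))     # 0x0
```

So a write to $3F10 lands in the same cell as $3F00, which is why the routine can start at $3F01 and let the later write take care of the background color.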
Code:
;set the target address (12 cycles)
lda #$3F
sta $2006
lda #$01
sta $2006
;update the first palette (18 cycles)
lda #$XX
sta $2007
lda #$XX
sta $2007
lda #$XX
sta $2007
;load the background color in another register (2 cycles)
ldx #$XX
;update the next 7 palettes (7 x 22 = 154 cycles)
stx $2007
lda #$XX
sta $2007
lda #$XX
sta $2007
lda #$XX
sta $2007
;(...)
That's 186 cycles, which would slightly reduce the impact of doing palette updates every vblank.
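For anyone who wants to check the arithmetic, here is the same tally as a small Python sketch. It only uses the standard 6502 timings (2 cycles for an immediate load, 4 for an absolute store). The "naive" figure is my assumption of what a straightforward routine writing all 32 bytes from $3F00 with one load each would cost, not a measurement of the original code.

```python
# Cycle tally for the routine above, using standard 6502 timings.
LOAD_IMM = 2    # lda #imm / ldx #imm
STORE_ABS = 4   # sta abs / stx abs

pair = LOAD_IMM + STORE_ABS            # one lda/sta pair = 6 cycles

set_address = 2 * pair                 # two lda/sta $2006 pairs
first_palette = 3 * pair               # $3F01-$3F03
load_backdrop = LOAD_IMM               # ldx #$XX
one_palette = STORE_ABS + 3 * pair     # stx + three lda/sta pairs = 22
remaining = 7 * one_palette            # the other 7 palettes

total = set_address + first_palette + load_backdrop + remaining

# Assumed baseline: set the address, then 32 plain lda/sta pairs.
naive = set_address + 32 * pair

print(set_address, first_palette, load_backdrop, remaining, total)
# 12 18 2 154 186
print(naive, naive - total)
# 204 18
```

The 18 cycles of difference are the 14 from the 7 skipped loads plus the 4 from the skipped store to $3F00.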