edit: Ok I'm going to think more about your simple question. It seems like a hard question right now.
It is up to you to define every bit about how this should work. It works as defined or it works by breaking the definition.
1. Is it defined such that upon striking zero, the zeroth column is drawn? (Zero should draw zero, negative is impossible)
2. Is it defined such that upon striking one, the zeroth column is drawn? (One should draw zero, zero would be negative so that update is skipped)
Something else? If you don't know how it should work, think about what makes the most sense, and commit to it. Then if something works by doing something that doesn't seem to agree with the definition, you can find out why and fix it. (Or fix your definition, if the working way makes sense and is actually better.) Just... don't guess. Find out why it works if it's better or if something else is wrong.
A bit more theory! Requires... well, somewhat large rewrites to put into place.
Think about the fastest way your updates could possibly happen.
Something like this?
Code:
loop:
lda buffer,y;4
sta $2007;4
dey;2
bpl loop;3 on all but last
(Well... unrolled or partially unrolled or stack magic would be faster. But follow me on this.)
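For what it's worth, here's roughly what a partially unrolled version could look like. Only a sketch: it assumes the same 30 byte buffer with Y starting at 29, so two tiles per pass divides evenly, and the label name is made up. You pay for the branch once per two tiles instead of once per tile.
Code:
ldy #29;2
unrolledloop:
lda buffer,y;4
sta $2007;4
dey;2
lda buffer,y;4
sta $2007;4
dey;2
bpl unrolledloop;3 on all but last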
Obviously you need to load the value and store it. That's unavoidable. Then the end of your loop with just one dey. Why not set your buffer up beforehand so you can do exactly that? Or exactly whatever the fastest thing you can think of is?
Currently, you have the left and right columns interleaved. (left column tile) (right column tile) (left column tile) (right column tile) This means you have to loop through the list twice, and also use dey twice for each loop!
If you did this instead: (left column tile) (next left column tile) (etc.) ... (right column tile) (next right column tile) (etc.)
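As a sketch, the two buffers could simply be two 30 byte blocks sitting back to back in RAM. The addresses here are made up, so use whatever free RAM you actually have. Keeping each block inside one page also avoids page cross penalties on the indexed loads.
Code:
buffereven = $0300;30 bytes, $0300-$031D, left column tiles
bufferodd = $031E;30 bytes, $031E-$033B, right column tiles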
One dey would take you to the next tile. Do you have to decide whether a column is even or odd? In your case, both are updated when they need updating, so you can just draw them in the same order.
Code:
lda evencolumnaddrhi;3
sta $2006;4
lda evencolumnaddrlo;3
sta $2006;4
ldy #29;2
loop:
lda buffereven,y;4
sta $2007;4
dey;2
bpl loop;3 on all but last
lda oddcolumnaddrhi;3
sta $2006;4
lda oddcolumnaddrlo;3
sta $2006;4
ldy #29;2
loop2:
lda bufferodd,y;4
sta $2007;4
dey;2
bpl loop2;3 on all but last
The above is pretty much the bare minimum, and still takes ~810 cycles assuming no page crosses (per column: 16 cycles of setup plus 30 passes of a 13 cycle loop, since an absolute sta is 4 cycles, minus one for the final untaken branch). Those ~810 cycles do not include the ~513 for sprites. They do not include attribute updates. This is why your NMI needs to make as few decisions as possible. Your goal outside of the NMI should be to make all the decisions and set the data up so that the NMI can use it in the fastest possible way.
In your case, you do even, you do odd. If the routine that updates the buffers just used the same place in RAM every time, you wouldn't need a pointer for your NMI updates. You need pointers to the metatiles in case you have different sets, but the buffer for the NMI can be static. Heck, using a pointer takes an extra cycle per load, plus you have to set it up. Static all the way!
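To put numbers on that, here are the two kinds of load side by side. bufferptr is a made-up zero page pointer, just for comparison.
Code:
lda (bufferptr),y;5, and bufferptr has to be set up first
lda buffereven,y;4, nothing to set up
Over 60 tiles, that one cycle per load adds up to 60 cycles, before you even count the pointer setup.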
Now, I didn't mention it before because it would not have helped your issue. But you can also make draw_RAMbuffers both simpler and faster. (Using a non-interleaved buffer format or not!)
In the current code, you work really hard to preserve y (it holds where you are in the pointer). But think about this! It takes just six cycles to store and restore it, and you could REALLY use it for other stuff.
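Those six cycles look like this, assuming a zero page temp (ytemp is a made-up name):
Code:
sty ytemp;3, park the pointer position
;... y is free for anything else here ...
ldy ytemp;3, get it back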
I may not fully understand draw_RAMbuffers, but I think you can do something like this:
Code:
;Metatile index is Y. Location in RAM buffer is in X.
drawloop:;loop label (name is a placeholder)
lda MetatileTile0,y;Assuming this is the top left tile
sta RAMbuffereven,x;Even buffer
lda MetatileTile1,y;Assuming this is the top right tile
sta RAMbufferodd,x;Odd buffer
dex;Takes us to the next tile for BOTH buffers
lda MetatileTile2,y;Assuming this is the bottom left tile
sta RAMbuffereven,x;Even buffer
lda MetatileTile3,y;Assuming this is the bottom right tile
sta RAMbufferodd,x;Odd buffer
lda pointerposition;used to be tya. You lose just one cycle doing this instead
clc
adc #$10;increment y by 16!!!!
tay
dex
bpl drawloop;keep going until x goes negative
This avoids storing the tiles to temp RAM only to load them again. You only need to loop 30 times, and it covers two separate columns. You only lose one cycle from where the 16 is added to y, plus 3 for storing it someplace (not above, but of course needs to be done). But, because you no longer need to store/restore x in goodlocation you actually come out ahead. (Since you needed to move y anyway which was replaced with a load, but you didn't ever need to move x.) The added benefit is you can use y for something that really needs it.
I'm not sure if you're updating two 8x8 tile columns, or two 16x16 tile columns. It looks like your draw_RAMbuffers is doing two 16x16 columns, but I don't see much need for that. It'd be tough to update that much in the NMI anyway.
You can also update only one 8x8 column at a time in your NMI. Even if you set up an even and odd buffer outside of the NMI, you can have the NMI draw just the relevant column (just even or just odd) when it's scrolled to. It's not a problem if the data you set up isn't used on that exact frame.
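A rough sketch of what that could look like in the NMI. Everything named column* here is made up: columnready and columnisodd are flags your main code would set along with the address, and I'm assuming $2000 is already set for +32 increments so the writes go down the column.
Code:
lda columnready;0 means nothing to draw this frame
beq skipcolumn
lda columnaddrhi
sta $2006
lda columnaddrlo
sta $2006
ldy #29
lda columnisodd;the one decision the NMI makes
bne drawodd
draweven:
lda buffereven,y
sta $2007
dey
bpl draweven
bmi columndone;always taken, y just went negative
drawodd:
lda bufferodd,y
sta $2007
dey
bpl drawodd
columndone:
lda #0
sta columnready;don't draw it again next frame
skipcolumn:
That's one extra branch in the NMI in exchange for drawing 30 tiles instead of 60.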
Anyway, enough from me, I start these posts and never stop writing...