You have a much harder problem. After the NMI happens, you have about 2270 CPU "cycles" to safely write to parts of the PPU. A "cycle" is a unit of time, and each instruction takes a certain number of cycles to run.
From NMI: to STA $4014 in your program takes 2427 cycles, which is more than you have. You have to optimize your code for speed and that's... not an easy topic to cover. But to start with, here are two facts.
1. The NMI tells you when vblank starts: a brief period between frames when the PPU isn't drawing the screen, so you can write to places like $2007 safely.
2. In your "Forever" loop, you have quite a lot of time, but can't write to places like $2007 safely.
The solution: Do absolutely everything you can in your forever loop, short of actually writing to $2007.
Your NMI could look something like this:
Code:
NMI:
    PHA             ; save A, X and Y on the stack
    TXA
    PHA
    TYA
    PHA

    LDA #$20        ; always write to one nametable for the example
    STA $2006       ; PPU address, high byte
    LDA #$00
    STA $2006       ; PPU address, low byte (address is now $2000)
    LDA #%10110000
    STA $2000       ; increment by one

    LDY #0
nmiloop:
    LDA $0500,y     ; 4 cycles
    STA $2007       ; 4 cycles
    INY             ; 2 cycles
    CPY #32         ; 2 cycles
    BNE nmiloop     ; 3 cycles taken

    LDA #$00
    STA $2003
    LDA #$02
    STA $4014       ; sprite DMA from $0200

    LDA #%10110000
    ORA nametablescroll
    STA $2000
    LDA #$00        ; reset scroll to zero
    STA $2005
    STA $2005

    INC NMIB+1      ; lets the main loop know this NMI has run

    PLA             ; restore Y, X and A
    TAY
    PLA
    TAX
    PLA
    RTI             ; return from interrupt
That copy loop could be made faster, but let's keep it simple for now.
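To put numbers on "could be made faster": going by the cycle counts in the comments, each pass through nmiloop costs 4 + 4 + 2 + 2 + 3 = 15 cycles, so 32 bytes cost 32 x 15 - 1 = 479 cycles (the last BNE isn't taken, so it costs 2 instead of 3). One common trick is to unroll the loop partway. Here's a rough sketch that copies 4 bytes per pass. The label name is made up, and the counts assume the branch doesn't cross a page boundary.
Code:
    LDY #0
nmiloop4:           ; made-up label for this sketch
    LDA $0500,y     ; 4 cycles
    STA $2007       ; 4 cycles
    INY             ; 2 cycles
    LDA $0500,y
    STA $2007
    INY
    LDA $0500,y
    STA $2007
    INY
    LDA $0500,y
    STA $2007
    INY
    CPY #32         ; 2 cycles, now paid once per 4 bytes
    BNE nmiloop4    ; 3 cycles taken, also once per 4 bytes
That's 4 x 10 + 2 + 3 = 45 cycles per pass and 8 passes, so 8 x 45 - 1 = 359 cycles for the same 32 bytes. The saving comes purely from running CPY and BNE less often, at the cost of a bit more ROM.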
This keeps your NMI down to a short list of jobs:
1. Pushes your registers (A, X and Y) to the stack. This is pretty much required.
2. Writes the address of nametable 0's top row ($2000) to $2006.
3. Reads 32 bytes from $0500-$051F and stores them to $2007. So effectively, it copies 32 bytes from $0500 to the top row of the first nametable.
4. Runs the sprite DMA from $0200.
5. Sets your scroll to 0, 0.
6. Increments NMIB+1 so your main loop stops waiting.
7. Pulls your registers from the stack. (Also pretty much required.)
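The NMIB+1 increment only helps if the main loop is actually watching that byte. I don't know how your forever loop waits right now, so treat this as one possible sketch: the Forever label comes from your program, and waitnmi is a made-up label.
Code:
Forever:
    ; ... game logic and buffer filling go here ...
    LDA NMIB+1      ; remember the current counter value
waitnmi:            ; made-up label for this sketch
    CMP NMIB+1      ; has the NMI changed it yet?
    BEQ waitnmi     ; no: keep waiting
    JMP Forever     ; yes: one frame has passed, go do the next one
This works because the NMI handler saves and restores A, so the value being compared survives the interrupt.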
Since your NMI (in this example) is copying bytes from $0500-$051F, your next step is to ready that data in your forever loop. So code like DrawNewColumn would be run in the forever loop, but instead of storing to $2007, it'd store to $0500,y. Then when the next NMI happens, the data gets read from $0500,y and written to $2007 when it's safe to do so.
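As a rough idea of what the forever-loop side could look like, here's a sketch that fills the buffer. I haven't seen the inside of your DrawNewColumn, so FillRowBuffer, fillloop and rowdata are all made-up names; rowdata stands for wherever your 32 tile numbers come from.
Code:
FillRowBuffer:          ; made-up routine, called from the forever loop
    LDY #0
fillloop:
    LDA rowdata,y       ; rowdata: made-up label for 32 bytes of tile numbers
    STA $0500,y         ; store to the RAM buffer instead of $2007
    INY
    CPY #32
    BNE fillloop
    RTS
Speed barely matters here, because this runs while you have lots of time. Only the copy out of $0500 in the NMI has to be fast.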
This is a super simplified way to approach this, but that's the theory.
Say you want to copy to a different address. No problem. Create two variables to store the address you want to write to in your forever loop, and then read that from RAM in your NMI.
Say you don't want to draw a new row every frame. No problem. Create a variable that says whether your NMI should copy a new row. Set it in your Forever loop. Read it in your NMI and skip the $2007 writes if it says to.
Say you want to copy new columns instead of rows. No problem. Create a new variable that specifies the type of copying to be done.
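Here's a sketch of the first two ideas put together. The variable names needdraw, drawaddrhi and drawaddrlo and the labels are made up for this example, and I'm assuming you reserve RAM for them somewhere. This would replace the $2006 setup and copy loop inside the NMI:
Code:
    LDA needdraw        ; made-up flag: 0 = nothing to draw this frame
    BEQ skipdraw
    LDA drawaddrhi      ; made-up variables holding the PPU address
    STA $2006
    LDA drawaddrlo
    STA $2006
    LDA #%10110000
    STA $2000           ; increment by one
    LDY #0
copyloop:
    LDA $0500,y
    STA $2007
    INY
    CPY #32
    BNE copyloop
    LDA #0
    STA needdraw        ; clear the flag so the row isn't drawn twice
skipdraw:
And in the forever loop, after the buffer at $0500 has been filled:
Code:
    LDA #$20
    STA drawaddrhi
    LDA #$40
    STA drawaddrlo      ; $2040 = third row of the first nametable
    LDA #1
    STA needdraw        ; set the flag last, once everything else is ready
For columns, the copy itself stays the same. You'd set bit 2 of the value written to $2000 so the PPU address goes up by 32 after each write instead of 1, and copy 30 bytes instead of 32, since a nametable is only 30 tiles tall.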
You want your NMI to be making as few decisions as possible because the time is very limited, so this approach (write to $0500 or elsewhere in RAM while you have a lot of time, then directly copy when you don't) is a good one.