Help to optimize an "slide in" code

kikutano
Posts: 115
Joined: Sat May 26, 2018 6:14 am
Location: Italy

Help to optimize an "slide in" code

Post by kikutano »

Hello everyone,
I'm writing code that slides the screen in from left to right, column by column, as you can see in this GIF:

Image

It works fine when I draw only up to the 16th row, but if I try to draw all 30 rows I get this:

Image

Probably because my code is too slow... I'm updating one column every few ticks until all 32 columns are drawn.

Code:

FadeInBackground:
    lda #$01
    cmp FADE_STATE
    beq .CanFade

    jmp .Exit

.CanFade:

    MACRO_INC TIMER_FADE_TICKS, #$01
    lda TIMER_FADE_TICKS
    cmp #$02
    beq .FadeTile

    jmp .Exit

.FadeTile:
    lda #$20
    sta ADDR_HIGH_2
    lda FADE_TILE_COUNT_LEFT
    sta ADDR_LOW_2

    lda #HIGH( TileTitleScreen )
    sta ADDR_HIGH
    lda #LOW( TileTitleScreen )
    sta ADDR_LOW

    MACRO_INC ADDR_LOW, FADE_TILE_COUNT_LEFT
    
  .highloop:
      ldy #$00 ;FADE_TILE_COUNT_LEFT
      lda #$00
      sta COUNT

      .loop:
        lda $2002
        lda ADDR_HIGH_2
        sta $2006
        lda ADDR_LOW_2
        sta $2006

        lda [ ADDR_LOW ], y
        sta $2007

        ;sta TILE
        ;MACRO_BackgroundTile ADDR_HIGH_2, ADDR_LOW_2, TILE
        MACRO_INC ADDR_LOW_2, #$20
        
        MACRO_INC COUNT, #$20 ; use a dey instead
        ldy COUNT
    
        cpy #$00
      bne .loop

      MACRO_INC ADDR_HIGH_2, #$01
      MACRO_INC ADDR_HIGH,   #$01

      lda ADDR_HIGH_2
      cmp #$24
  bne .highloop    

  lda #$00
  sta TIMER_FADE_TICKS
  MACRO_INC FADE_TILE_COUNT_LEFT, #$01

  lda FADE_TILE_COUNT_LEFT
  cmp #$20
  bne .Exit

  lda #$02
  sta FADE_STATE

.Exit:
  rts
Probably the problem is here:

Code:

        lda $2002
        lda ADDR_HIGH_2
        sta $2006
        lda ADDR_LOW_2
        sta $2006

        lda [ ADDR_LOW ], y
        sta $2007
Can anyone help me understand whether, and how, I can optimize this code?

Thanks a lot!
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Help to optimize an "slide in" code

Post by lidnariq »

The PPU has an "increment PPU address by 32" mode that should help here, so that you don't need to set the PPU address before every write. It doesn't look like you're using it?
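
A minimal sketch of what that mode looks like in practice, in the same NESASM-style syntax as the code above. COLUMN_X and ColumnBuffer are hypothetical names (they are not in the original code), and the routine is assumed to run during vblank with the 30 tile bytes already copied into ColumnBuffer:

```asm
; Sketch only. COLUMN_X and ColumnBuffer are hypothetical names.
WriteColumn:
    lda #%00000100        ; bit 2 set: each $2007 access advances the address by 32
    sta $2000             ; caution: this also clears every other bit of $2000

    lda $2002             ; reset the $2006 high/low write latch
    lda #$20              ; high byte of name table 0 ($2000)
    sta $2006
    lda COLUMN_X          ; low byte = column index (0..31) on the top row
    sta $2006             ; the address is set once per column, not once per byte

    ldx #$00
.loop:
    lda ColumnBuffer, x   ; tile prepared before vblank
    sta $2007             ; the PPU moves down one row by itself
    inx
    cpx #30               ; 30 visible rows per column
    bne .loop
    rts
```

Each pass through this loop costs roughly 15 CPU cycles, so a full 30-tile column should fit comfortably inside vblank.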
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Help to optimize an "slide in" code

Post by tokumaru »

The vertical blank (vblank), the period during which you have free access to VRAM, lasts a very limited amount of time. To make the most out of that time, you're supposed to just blast pre-calculated data to VRAM and not do any complicated data processing. The problem with your code is that you're doing a lot of unnecessary things for every single byte you copy, and that is indeed blowing your vblank time budget. With reasonably optimized code, one can write well over 100 bytes of data to VRAM during vblank, but you're having trouble after only 15!

The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written. You still have to update the source address manually though, which brings us to the second point...

While the change above could by itself be enough to solve your current problem, that "MACRO_INC" thing you're doing looks like a huge time waster, especially when called multiple times for each byte like you're doing. One thing you could consider is storing the screen data rotated by 90 degrees in PRG-ROM, so that you can simply increment Y to advance to the next source byte and don't need to update the pointer itself as often. If storing the screen sideways is not something you're willing to do, make sure you're incrementing that source address as efficiently as possible, or consider using other addressing modes with (partially or completely) unrolled loops.
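
The rotated-data idea could be sketched as follows. This assumes the screen is stored column by column, 30 bytes per column, that ADDR_LOW and ADDR_HIGH are adjacent zero-page bytes already pointing at the current column, and that the PPU address and increment-by-32 mode were set before the call. DrawRotatedColumn is a hypothetical routine name:

```asm
; Sketch only. Assumes column-major screen data, 30 bytes per column,
; and that [ADDR_LOW] already points at the column to draw.
DrawRotatedColumn:
    ldy #$00
.loop:
    lda [ADDR_LOW], y     ; next tile of this column
    sta $2007             ; PPU address advances by 32 automatically
    iny                   ; the source advances with a single INY
    cpy #30
    bne .loop

    ; Advance the source pointer once per column, not once per byte.
    clc
    lda ADDR_LOW
    adc #30
    sta ADDR_LOW
    bcc .done
    inc ADDR_HIGH         ; carry into the high byte
.done:
    rts
```

The 16-bit pointer update now happens once per column, so the per-byte work shrinks to a load, a store, an increment, and a compare.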
kikutano
Posts: 115
Joined: Sat May 26, 2018 6:14 am
Location: Italy

Re: Help to optimize an "slide in" code

Post by kikutano »

tokumaru wrote: The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written. You still have to update the source address manually though, which brings us to the second point...
Mmm, OK! The first thing I'll try is setting the auto-increment to 32.
kikutano
Posts: 115
Joined: Sat May 26, 2018 6:14 am
Location: Italy

Re: Help to optimize an "slide in" code

Post by kikutano »

OK, now it works great! I just set the increment to 32 and removed the PPU address reset on every iteration, like this:

Code:

    lda #%00000100
    sta $2000
And now everything works fine. Thank you, as always! :)
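
One caveat on the snippet above: writing %00000100 directly also clears every other bit of $2000, including bit 7 (NMI enable). Since $2000 is write-only, a common workaround is to keep a copy in RAM and change only bit 2. A minimal sketch, where PPU_CTRL_SHADOW is a hypothetical variable that is assumed to hold the last value written to $2000:

```asm
; Sketch only. PPU_CTRL_SHADOW is a hypothetical RAM copy of $2000.
    lda PPU_CTRL_SHADOW
    ora #%00000100        ; set bit 2: increment by 32 (columns)
    sta PPU_CTRL_SHADOW
    sta $2000

    ; ... write the column through $2007 here ...

    lda PPU_CTRL_SHADOW
    and #%11111011        ; clear bit 2: back to increment by 1 (rows)
    sta PPU_CTRL_SHADOW
    sta $2000
```

Restoring increment-by-1 afterwards keeps any other code that writes rows of tiles working as before.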
samophlange
Posts: 50
Joined: Sun Apr 08, 2018 11:45 pm
Location: Southern California

Re: Help to optimize an "slide in" code

Post by samophlange »

tokumaru wrote: The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written.
Just chiming in to say thanks for explaining details like this. I haven't had much time to code recently but I feel like I'm still learning a lot by reading other people's threads. :beer:
kikutano
Posts: 115
Joined: Sat May 26, 2018 6:14 am
Location: Italy

Re: Help to optimize an "slide in" code

Post by kikutano »

samophlange wrote:
tokumaru wrote: The very first thing you have to do is stop setting the VRAM address for every byte! The PPU has an auto-increment feature that takes care of moving the address register over to the next position (either vertically or horizontally) after each access to $2007. You just need to indicate via bit 2 of PPU register $2000 whether you want the address to increment by 1 (for drawing rows of tiles) or 32 (for drawing columns of tiles), and then you can set the target name table address just once when starting the column, and the PPU will automatically advance to the next position after each byte written.
Just chiming in to say thanks for explaining details like this. I haven't had much time to code recently but I feel like I'm still learning a lot by reading other people's threads. :beer:
Yeah, it's amazing how many things you can learn on this forum. Great users! :beer: :beer: