Large updates to nametable

Goose2k
Posts: 65
Joined: Wed Dec 11, 2019 9:38 pm

Large updates to nametable

Post by Goose2k » Mon May 25, 2020 10:04 pm

I am working on a little Tetris clone, and something that has me a little stumped is how to update large sections of the nametable in a frame.

This comes up when the player clears a line.

I store all of the board in a linear array of 240 unsigned chars, as well as writing that state to the nametable itself when a block lands. When a line is cleared, I need to shift the entire center portion of the nametable down (to appear like the blocks are falling), as well as updating the logical array of board state.

When I try to update that much of the nametable in a single frame, I think I am blowing the VRAM buffer (from nesdoug's CC65 library).

To get around this, I shift one row per frame. So clearing the bottom row takes 20 frames (1 frame for each row of the board).

nes_tetris_line_clear.gif

Any thoughts on a better way to handle this?

Some relevant code:

Line clearing code. Runs every frame.

Code:

				
// Search for full rows to clear out.
if (do_line_check)
{
	// Stop searching for lines unless we find one this frame.
	do_line_check = 0;

	// Start at the bottom of the board, and work our way up.
	for (iy = BOARD_END_Y_PX_BOARD; iy > BOARD_OOB_END; --iy)
	{
		// Assume this row is complete unless we find an empty
		// block.
		line_complete = 1;
		for (ix = 0; ix <= BOARD_END_X_PX_BOARD; ++ix)
		{
			if (is_block_free(ix, iy))
			{
				// This block is empty, so we can stop checking this row.
				line_complete = 0;
				break;
			}
		}

		// If this row was filled, we need to remove it and crush
		// the rows above it into its place.
		if (line_complete)
		{
			// Store line to crush.
			line_crush_y = iy;
			break;
		}

		// found a line so there might be more.
		//do_line_check = 1;
	}
}

// Are we currently shifting rows down?
if (line_crush_y > BOARD_OOB_END)
{
	// Set each block in this row to the value in the row above it.
	for(ix = 0; ix <= BOARD_END_X_PX_BOARD; ++ix)
	{
		set_block(ix, line_crush_y, get_block(ix, line_crush_y-1));
	}

	// Next frame do the same on the line above.
	--line_crush_y;

	// Finished this pass, check again in case this was a multi-line
	// kill.
	if (line_crush_y == BOARD_OOB_END)
	{
		do_line_check = 1;
	}
}

Code:

void set_block(unsigned char x, unsigned char y, unsigned char id)
{
	int address;

	// w = 10 tiles,  80 px
	// h = 20 tiles, 160 px

	// Update the logic array as well as the nametable to reflect it.

	if (y <= BOARD_OOB_END)
	{
		// Don't place stuff out of bounds.
		return;
	}

	address = get_ppu_addr(0, (x << 3) + BOARD_START_X_PX, (y << 3) + BOARD_START_Y_PX);
	one_vram_buffer(id, address);

	//x = x >> 3; // div 8
	//y = y >> 3; // div 8

	// TODO: Is this too slow?
	game_board[PIXEL_TO_BOARD_INDEX(x,y)] = id;
}

tokumaru
Posts: 11691
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Large updates to nametable

Post by tokumaru » Tue May 26, 2020 12:00 am

Programs are only allowed to access VRAM when the PPU itself is not using it (i.e. it's not rendering an image), so you typically only have the time that corresponds to the 20 scanlines of vblank, which is about 2273 CPU cycles. If I understand correctly, you want to update 200 bytes (10x20 tiles) in one go, right? That's doable, but not with the naive/generic/unoptimized VRAM update code you commonly see being used. I don't know how nesdoug's update code does its thing, but you may have to write your own optimized VRAM update routine in ASM in order to be able to update this much data in a single vblank.

Firstly, you should do the VRAM writes in column mode ("increment 32" mode), so that you only need to set the VRAM address 10 times, as opposed to 20 times in row mode. Then, an unrolled loop of 20 LDA-STA pairs of instructions will transfer the 20 bytes of each column at the cost of 8 CPU cycles per byte:

Code:

	ldy #$95 ;low byte of the VRAM address of the last column
	ldx #9 ;index of the last column

UpdateColumn:

	;set the VRAM address (10 cycles)
	lda #$20
	sta $2006
	sty $2006

	;copy 20 bytes (1 column) from the array to VRAM (160 cycles)
	lda array+0, x ;1 byte from the 1st row
	sta $2007
	lda array+10, x ;1 byte from the 2nd row
	sta $2007
	(...)
	lda array+180, x ;1 byte from the 19th row
	sta $2007
	lda array+190, x ;1 byte from the 20th row
	sta $2007
	
	;go to the previous column (9 cycles)
	dey
	dex
	bmi Done ;exit when x wraps around to 255
	jmp UpdateColumn

Done:

If you count the cycles, you'll see that 10 iterations of this loop will take roughly 1790 CPU cycles to complete, leaving you with about 480 cycles left to do other PPU-related things (change the palette, set the scroll, etc.). Unfortunately, there's not enough time to do a sprite DMA (which takes over 512 cycles), so unless you're able to optimize this further (it's possible, but there will be more serious compromises, like using up a good portion of ZP for the screen buffer), you will not be able to update the entire playfield and do a sprite DMA in the same frame.
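
The cycle count can be checked with a little arithmetic. This is only a sketch in C, assuming NTSC timing (341 PPU dots per scanline, 3 dots per CPU cycle) and the per-instruction costs quoted in the listing's comments. The constant and function names are made up for illustration.

```c
/* Cycle budget for the unrolled column update (NTSC timing assumed).
   All names are hypothetical; the costs come from the listing's comments. */
enum {
    COLUMNS          = 10,           /* playfield width in tiles */
    ROWS             = 20,           /* playfield height in tiles */
    SET_ADDR_CYCLES  = 2 + 4 + 4,    /* lda #imm, sta $2006, sty $2006 */
    COPY_BYTE_CYCLES = 4 + 4,        /* lda abs,x (no page cross), sta $2007 */
    LOOP_CYCLES      = 2 + 2 + 2 + 3,/* dey, dex, bmi (not taken), jmp */
    VBLANK_CYCLES    = 20 * 341 / 3  /* 20 scanlines of 341 dots, 3 dots per CPU cycle */
};

/* Cost of one pass through UpdateColumn. */
int cycles_per_column(void)
{
    return SET_ADDR_CYCLES + ROWS * COPY_BYTE_CYCLES + LOOP_CYCLES;
}

/* Cost of updating all ten columns. */
int cycles_for_playfield(void)
{
    return COLUMNS * cycles_per_column();
}

/* What remains of vblank for palette, scroll, and so on. */
int cycles_left_in_vblank(void)
{
    return VBLANK_CYCLES - cycles_for_playfield();
}
```

With these numbers one column costs 179 cycles, the whole playfield 1790, and 483 cycles remain. The real total is a couple of cycles lower, because the last pass leaves through the taken branch instead of the jmp.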

strat
Posts: 355
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

Re: Large updates to nametable

Post by strat » Tue May 26, 2020 3:16 pm

On top of the basics provided by Tokumaru this is how I'd handle it:

-Keep a copy of the background in the offscreen nametable.

-Play a simple animation when a line is completed so the updated field doesn't have to appear right away. Most versions of Tetris do this.

-Update the offscreen nametable (still must be done in vblank) over several frames while that animation runs. Swap out nametables to display the updated field. This way you have enough cycles to DMA sprites if applicable (and I think if this is going to run on hardware you want to constantly refresh OAM regardless).

This solution is specific to Tetris which thankfully is a single-screen game. If you had a scrolling game and wanted fancy background animations that would be more involved but still remotely possible, probably needing a mapper chip.

tokumaru
Posts: 11691
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Large updates to nametable

Post by tokumaru » Tue May 26, 2020 3:30 pm

Great suggestion! Since you have a spare name table, you can update the playfield in that one off screen in the course of several frames and switch to it once done. This technique is called double buffering, and is probably the best solution here if you don't want to go through the trouble of writing a customized vblank handler in ASM.

Goose2k
Posts: 65
Joined: Wed Dec 11, 2019 9:38 pm

Re: Large updates to nametable

Post by Goose2k » Tue May 26, 2020 3:48 pm

strat wrote:
Tue May 26, 2020 3:16 pm
On top of the basics provided by Tokumaru this is how I'd handle it:

-Keep a copy of the background in the offscreen nametable.

-Play a simple animation when a line is completed so the updated field doesn't have to appear right away. Most versions of Tetris do this.

-Update the offscreen nametable (still must be done in vblank) over several frames while that animation runs. Swap out nametables to display the updated field. This way you have enough cycles to DMA sprites if applicable (and I think if this is going to run on hardware you want to constantly refresh OAM regardless).

This solution is specific to Tetris which thankfully is a single-screen game. If you had a scrolling game and wanted fancy background animations that would be more involved but still remotely possible, probably needing a mapper chip.
That is such a great idea, thanks! I'll give it a shot tonight.

And thanks for the example code tokumaru!

Goose2k
Posts: 65
Joined: Wed Dec 11, 2019 9:38 pm

Re: Large updates to nametable

Post by Goose2k » Tue May 26, 2020 4:38 pm

strat wrote:
Tue May 26, 2020 3:16 pm
Swap out nametables to display the updated field.
By 'swap out', do you mean set the scroll to instantly have the new nametable in view, or is there a way to actually swap out the entire name tables instantly, such that the content of nametable C is now in Nametable A (for instance)?

I ask because it somewhat complicates things to be in an arbitrary nametable at any given time (eg. which version of NTADR_* macro to use).

Edit: Although thinking about it more, I could scroll to a temp buffer which has a copy of the current state while I update the primary nametable, then scroll back to the primary nametable when it's done, so the majority of the logic can assume a particular nametable.

tokumaru
Posts: 11691
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Large updates to nametable

Post by tokumaru » Tue May 26, 2020 5:40 pm

Goose2k wrote:
Tue May 26, 2020 4:38 pm
By 'swap out', do you mean set the scroll to instantly have the new nametable in view
Yup, use the name table bits of PPUCTRL ($2000).
is there a way to actually swap out the entire name tables instantly, such that the content of nametable C is now in Nametable A (for instance)?
Some mappers allow you to rearrange name tables like that, but that doesn't affect the contents of the memory, it just makes the chunks of memory visible in different address ranges.
I ask because it somewhat complicates things to be in an arbitrary nametable at any given time (eg. which version of NTADR_* macro to use).
I don't know how it's done in C, but you will indeed have to keep track of which name table is visible (in order to set the scroll) and which one is hidden (so you can write data to it), and you'll need to swap those variables every time you need to switch the two name tables.
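
One way that bookkeeping might look in C is sketched below. It is not taken from nesdoug's library: every name is hypothetical, and it assumes vertical mirroring so that nametables 0 ($2000) and 1 ($2400) are two distinct pages. The only hardware facts it relies on are that each nametable spans $400 bytes of PPU address space, rows are 32 tiles wide, and bits 0-1 of PPUCTRL pick the base nametable.

```c
#include <stdint.h>

/* Hypothetical double-buffer bookkeeping; assumes nametables 0 and 1
   are distinct pages (vertical mirroring). */
#define NAMETABLE_BASE 0x2000u  /* PPU address of nametable 0 */
#define NAMETABLE_SIZE 0x0400u  /* PPU address space per nametable */

uint8_t visible_nt = 0;  /* nametable currently on screen */
uint8_t hidden_nt  = 1;  /* nametable that is safe to rewrite */

/* PPU address of tile (x, y) in nametable nt; each row is 32 tiles wide. */
uint16_t nt_tile_addr(uint8_t nt, uint8_t x, uint8_t y)
{
    return (uint16_t)(NAMETABLE_BASE + nt * NAMETABLE_SIZE
                      + ((uint16_t)y << 5) + x);
}

/* Address to write to while the update is still in progress. */
uint16_t hidden_tile_addr(uint8_t x, uint8_t y)
{
    return nt_tile_addr(hidden_nt, x, y);
}

/* PPUCTRL value with bits 0-1 replaced by the visible nametable index. */
uint8_t ppuctrl_for_visible(uint8_t ctrl)
{
    return (uint8_t)((ctrl & 0xFCu) | (visible_nt & 0x03u));
}

/* Call once the hidden nametable holds the finished playfield. */
void swap_nametables(void)
{
    uint8_t tmp = visible_nt;
    visible_nt = hidden_nt;
    hidden_nt = tmp;
}
```

Drawing code would always go through hidden_tile_addr(), and the value from ppuctrl_for_visible() would be written to $2000 during vblank after each swap, so the rest of the game never needs to know which physical nametable is on screen.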

Goose2k
Posts: 65
Joined: Wed Dec 11, 2019 9:38 pm

Re: Large updates to nametable

Post by Goose2k » Tue May 26, 2020 9:34 pm

SUCCESS!!

Thanks so much! Still need to add a little flair to mask that time, but functionally it's working.
nes_tetris_line_clear_clean.gif

tokumaru
Posts: 11691
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Large updates to nametable

Post by tokumaru » Tue May 26, 2020 10:07 pm

Why is it taking 80 or so frames to clear the whole thing, though? Is there any reason why you can't do 1 column (20 bytes) per frame and finish the update in 10 frames?

Goose2k
Posts: 65
Joined: Wed Dec 11, 2019 9:38 pm

Re: Large updates to nametable

Post by Goose2k » Tue May 26, 2020 10:28 pm

I am still doing my original, slow, multi-pass loop, on a row by row basis.
row_by_row.png
I start at the bottom (row 0), and if it is a complete row, I start the copy process. I copy row 1 into row 0, both in the logic array and the nametable. Then I copy row 2 into row 1. And so on until I cover all 20 rows. Then I go back to row 0 (what used to be row 1) and check for any more complete rows, and if I find any, perform the whole thing again (as is the case in a multi-row clear like the picture above).

Although I think implementing the ASM you provided is beyond me at this point, I think the crux of it is that I should be updating all my logical array data first, on the CPU, and once that is done go column by column making vertical nametable writes, rather than trying to update both logic and PPU at the same time.

As an aside, I took a look at Tetris for the NES, and they don't actually seem to do any nametable tricks. In fact all 4 logical nametables appear to be 100% in sync at all times. Not sure what that's all about. Maybe a mirroring mode I don't know about yet (eg. mirror all?).

Here's a shot from the moment I am clearing a row of 3 (it visually clears them from the center out, 2 columns at a time). I am guessing they are doing exactly what you suggested for me.
clearing.png

tokumaru
Posts: 11691
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Large updates to nametable

Post by tokumaru » Tue May 26, 2020 11:04 pm

Goose2k wrote:
Tue May 26, 2020 10:28 pm
I start at the bottom (row 0), and if it is a complete row, I start the copy process. I copy row 1 into row 0, both in the logic array and the nametable. Then I copy row 2 into row 1. And so on until I cover all 20 rows. Then I go back to row 0 (what used to be row 1) and check for any more complete rows, and if I find any, perform the whole thing again (as is the case in a multi-row clear like the picture above).
Oh, I see... the game logic needs multiple passes over large portions of the field, so if you clear 4 rows at the bottom, the engine will move all 20 rows 4 times, arriving at the count of 80 frames that can be observed in the animation you posted.

Still, I don't think you should be updating the name tables all those times... Wouldn't it be better to have the code do all those passes without touching the PPU until the playfield is stable, and only then copy exactly 20 rows (or better yet, 10 columns), so that the overall screen update time is much shorter and doesn't vary depending on the number of rows being cleared?
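
A rough C sketch of that split is below: first collapse every completed row in the logical array without touching the PPU, then pull out one column at a time for a vertical write. Everything here is an assumption made for illustration (a plain 10x20 row-major board with row 0 at the top, tile value 0 meaning empty, and all the names), so it does not match the 240-byte array from the first post.

```c
#include <stdint.h>
#include <string.h>

#define BOARD_W 10
#define BOARD_H 20
#define EMPTY   0   /* assumed tile value for a free cell */

/* Hypothetical logical board: row-major, row 0 at the top. */
uint8_t board[BOARD_H * BOARD_W];

/* Returns 1 when no cell in row y is EMPTY. */
int row_is_full(int y)
{
    int x;
    for (x = 0; x < BOARD_W; ++x) {
        if (board[y * BOARD_W + x] == EMPTY) return 0;
    }
    return 1;
}

/* Drops every full row in a single bottom-up pass, CPU side only.
   Returns how many rows were removed. */
uint8_t collapse_full_rows(void)
{
    int src;
    int dst = BOARD_H - 1;   /* next row to be filled, from the bottom */
    uint8_t cleared = 0;

    for (src = BOARD_H - 1; src >= 0; --src) {
        if (row_is_full(src)) { ++cleared; continue; }  /* skip it */
        if (dst != src) {
            memcpy(&board[dst * BOARD_W], &board[src * BOARD_W], BOARD_W);
        }
        --dst;
    }
    /* Whatever is left at the top is now empty. */
    for (; dst >= 0; --dst) {
        memset(&board[dst * BOARD_W], EMPTY, BOARD_W);
    }
    return cleared;
}

/* Copies column x, top to bottom, into out for one "increment 32" write. */
void gather_column(uint8_t x, uint8_t out[BOARD_H])
{
    int y;
    for (y = 0; y < BOARD_H; ++y) {
        out[y] = board[y * BOARD_W + x];
    }
}
```

After collapse_full_rows() returns, the game could call gather_column() for one column per frame and hand the 20 bytes to whatever vertical buffered write the library offers (nesdoug's library has one, multi_vram_buffer_vert if I remember the name correctly), which makes the screen update take 10 frames no matter how many rows were cleared.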
Although I think implementing the ASM you provided is beyond me at this point
The ASM code I posted is only useful if you want to update hundreds of bytes in a single vblank, which is not the case.
I think the crux of it is that I should be updating all my logical array data first, on the CPU, and once that is done go column by column making vertical nametable writes, rather than trying to update both logic and PPU at the same time.
Exactly!
In fact all 4 logical nametables appear to be 100% in sync at all times.
4 identical name tables is usually a sign that 1-screen mirroring is being used (the emulator should be reporting that somewhere!), or the game may be double buffering but the name table viewer doesn't update fast enough for you to notice any differences.
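
To illustrate why 1-screen mirroring makes all four name tables look identical in a viewer, here is a small C sketch of how the logical range $2000-$2FFF folds onto physical nametable RAM under the common arrangements. The enum and function names are invented for this example; the bit masks follow the usual definitions (vertical: $2000=$2800 and $2400=$2C00, horizontal: $2000=$2400 and $2800=$2C00, single-screen: all four share one 1 KiB page).

```c
#include <stdint.h>

/* Hypothetical names; models how mirroring folds PPU addresses
   $2000-$2FFF onto physical nametable RAM. */
typedef enum {
    MIRROR_VERTICAL,
    MIRROR_HORIZONTAL,
    MIRROR_SINGLE_SCREEN
} mirroring_t;

/* Returns the offset into physical nametable RAM for a PPU address. */
uint16_t nametable_ram_offset(mirroring_t mode, uint16_t ppu_addr)
{
    uint16_t a = ppu_addr & 0x0FFFu;  /* position within the 4 logical tables */

    switch (mode) {
    case MIRROR_VERTICAL:
        /* Address bit 10 picks the page: $2000=$2800, $2400=$2C00. */
        return a & 0x07FFu;
    case MIRROR_HORIZONTAL:
        /* Address bit 11 picks the page: $2000=$2400, $2800=$2C00. */
        return (uint16_t)(((a >> 1) & 0x0400u) | (a & 0x03FFu));
    default:
        /* Single-screen: every logical table is the same 1 KiB page. */
        return a & 0x03FFu;
    }
}
```

In single-screen mode a write to $2095 and a read from $2C95 hit the same byte, so a name table viewer necessarily shows four copies that never disagree.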

strat
Posts: 355
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

Re: Large updates to nametable

Post by strat » Wed May 27, 2020 7:17 pm

Nintendo's Tetris updates 4 horizontal rows every frame when lines are completed and blanks out some descending rows briefly to hide the processing. It uses single-screen mirroring.

Goose2k
Posts: 65
Joined: Wed Dec 11, 2019 9:38 pm

Re: Large updates to nametable

Post by Goose2k » Thu May 28, 2020 10:29 am

strat wrote:
Wed May 27, 2020 7:17 pm
Nintendo's Tetris updates 4 horizontal rows every frame when lines are completed and blanks out some descending rows briefly to hide the processing. It uses single-screen mirroring.
I was just looking at videos of Tetris last night and noticed that as well.

The sequence of events is something like this (CPU data refers to logical array of unsigned chars representing the game board):

1: Remove the completed lines in CPU data.
2: Copy updated CPU data to Nametable, 2 columns at a time, from the center out.
3: Collapse all empty rows on the CPU data.
4: Copy updated CPU data to Nametable, 4 (I think) rows at a time, from top to bottom.
tetris_flow.jpg
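
The two copy orders in steps 2 and 4 can be written as tiny index helpers. This is only a guess at the scheduling based on the description above, not disassembled from the game; the names and the 10x20 board size are assumptions.

```c
#include <stdint.h>

#define BOARD_W        10
#define BOARD_H        20
#define ROWS_PER_FRAME 4   /* assumed, per the "4 (I think)" above */

/* Step 2: which two columns to copy on a given frame, center out.
   step 0 gives columns 4 and 5, step 4 gives columns 0 and 9. */
void center_out_pair(uint8_t step, uint8_t *left, uint8_t *right)
{
    *left  = (uint8_t)(BOARD_W / 2 - 1 - step);
    *right = (uint8_t)(BOARD_W / 2 + step);
}

/* Step 4: first row to copy on a given frame, top to bottom.
   frame 0 covers rows 0-3, frame 4 covers rows 16-19. */
uint8_t first_row_for_frame(uint8_t frame)
{
    return (uint8_t)(frame * ROWS_PER_FRAME);
}
```

Under these assumptions each phase takes BOARD_W / 2 = 5 frames for the columns and BOARD_H / ROWS_PER_FRAME = 5 frames for the rows, with at most 40 tile writes in any single vblank.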
