Advice for artifact free 4-way scrolling


Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Advice for artifact free 4-way scrolling

Post by Bregalad »

NewRisingSun wrote: I must say I find occasional sprite pop-up less annoying than constantly-black left 8 columns, especially if the graphics obviously weren't designed for it. A Boy and His Blob is the perfect example: thanks to the black bar on the left, the entire red border becomes asymmetric. Ugly.
On a real TV you won't notice any black border, especially on a real CRT. Graphics that weren't designed around it are another problem, though. I must say 31 columns is not a very easy number to deal with.
bleubleu
Posts: 108
Joined: Wed Apr 04, 2018 7:29 pm
Location: Montreal, Canada

Re: Advice for artifact free 4-way scrolling

Post by bleubleu »

Hi!

First of all, thanks everyone for the advice. I'm not going to reply to everyone, but I did read the whole thing. I managed to make it all work, and the code is somewhat elegant, I think.

One thing I would recommend to anyone doing this kind of thing is to take a few hours to write a little reference implementation in a language that is a bit more expressive/flexible than ASM. I made myself a little C# control that behaves exactly like a PPU and can show me which tiles/attributes are updated (red = tile, yellow = attribute). It allowed me to figure out what my algorithm was going to be, and then I simply translated it into ASM. When I had bugs in the ASM, I could simply compare the two and figure out where things went wrong. See attached image. If Mesen could do this, it would be awesome.

One last problem I have is that in extreme conditions, like when going diagonally, being perfectly aligned in X and Y, and being on a frame where a full row and column of tiles AND attributes will load in, I will exceed the NMI CPU cycle limit by about 400 cycles.

Since I am going to blank the top/bottom 16 scanlines, would it be possible to offload some of the PPU update work there? Like update the palettes there or part of the tiles/attributes? How common is this as a technique?

(I am also aware I could simply change my algorithm to, for example, process just 1 row or column per frame, but I'm too lazy to change that right now.)

-Mat
Attachments
Scroll.png
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Advice for artifact free 4-way scrolling

Post by Bregalad »

bleubleu wrote: One last problem I have is that in extreme conditions, like when going diagonally, being perfectly aligned in X and Y, and being on a frame where a full row and column of tiles AND attributes will load in, I will exceed the NMI CPU cycle limit by about 400 cycles.
I'm fairly sure it should be possible to fit updates in VBlank, assuming you're talking about ONE row and ONE column of 8x8 tiles (not 16x16 metatiles).
You should use $2000.4 (the $04 bit of $2000, which makes $2007 accesses advance the address by 32) to your advantage when updating the nametable column. For an attribute table column it is more limited, because each step of 32 skips 3 attribute rows, but it still lets you use 4 bulks of 2 bytes instead of 8 bulks of 1 byte.
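
To illustrate the "4 bulks of 2 bytes" pattern, here is a minimal sketch that lists the PPU addresses touched when an attribute column is written with the +32 increment enabled. It assumes the attribute table of the first nametable ($23C0, 8 bytes per attribute row); the function name `attr_column_bursts` is made up for this example.

```python
ATTR_BASE = 0x23C0   # attribute table of the nametable at $2000 (assumed here)
ATTR_ROW_BYTES = 8   # one attribute row is 8 bytes wide
INCREMENT = 32       # address step after each $2007 write in +32 mode

def attr_column_bursts(col, base=ATTR_BASE):
    """Return 4 bursts; each is (start address, [addresses written])."""
    bursts = []
    for row in range(4):
        start = base + row * ATTR_ROW_BYTES + col
        # With +32 increment the second write lands 4 attribute rows lower,
        # i.e. it skips the 3 rows in between.
        bursts.append((start, [start, start + INCREMENT]))
    return bursts

for start, written in attr_column_bursts(3):
    print(f"set $2006 to ${start:04X}, write:",
          ", ".join(f"${a:04X}" for a in written))
```

A third write in the same burst would land 8 attribute rows down, past the end of the attribute table, which is why each burst stops at 2 bytes.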

So you should have the following:
  • Update a nametable row : Done in two bulks (because of vertical mirroring, you need to write to two screens), total of 32 bytes
  • Update an attribute table row : Done in one bulk of 8 bytes
  • Update a nametable column : Done in one bulk, total of 30 bytes (uses column mode)
  • Update an attribute table column : The most annoying, it has to be done in 4 bulks of 2 bytes. (uses column mode)
This means that, in the absolute worst case, you have to set a new address through $2006 8 times (two writes each) and write 78 bytes of data to $2007. Assuming 4 cycles for each load and 4 cycles for each write to the register, that's 8*(4+4+4+4) + 78*(4+4) = 752 cycles. Of course more cycles are needed for logic, etc., but this should be doable in VBlank without using any further tricks.
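
The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch of the raw transfer cost under the same assumptions (4 cycles per load, 4 per register write, two $2006 writes per address); the table layout and the name `raw_transfer_cycles` are illustrative.

```python
LOAD = 4    # cycles assumed for each load
STORE = 4   # cycles assumed for each write to a PPU register

# update name -> (number of address sets, total data bytes)
WORST_CASE = {
    "nametable row":    (2, 32),
    "attribute row":    (1, 8),
    "nametable column": (1, 30),
    "attribute column": (4, 8),
}

def raw_transfer_cycles(updates):
    """Return (address sets, data bytes, cycles), ignoring all game logic."""
    address_sets = sum(sets for sets, _ in updates.values())
    data_bytes = sum(size for _, size in updates.values())
    address_cost = address_sets * 2 * (LOAD + STORE)  # high and low byte
    data_cost = data_bytes * (LOAD + STORE)
    return address_sets, data_bytes, address_cost + data_cost

print(raw_transfer_cycles(WORST_CASE))
```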
bleubleu wrote: Since I am going to blank the top/bottom 16 scanlines, would it be possible to offload some of the PPU update work there? Like update the palettes there or part of the tiles/attributes? How common is this as a technique?
This technique is uncommon, but was probably made popular by the game Battletoads (and its sequel Battletoads and Double Dragon), which are very popular among NESdevers. Personally, unless I really needed the extra blanking time, I'd rather hide those scanlines using either a blank CHR-ROM bank or by disabling the background only and having 8 high priority sprites at Y=0 hiding the real sprites, avoiding Battletoads-style forced blanking at the top of the screen and all the problems this creates.

Also, if you are aiming for clean scrolling, you should hide the top scanlines rather than the bottom ones, because sprites can't be shown partially at the top of the screen, but they can at the bottom. Turning sprite rendering off during the frame can also cause erratic problems.
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Advice for artifact free 4-way scrolling

Post by tokumaru »

If you turn rendering off at the top of the screen, as opposed to using blank tiles like Jurassic Park does, you can indeed use that time to keep accessing VRAM, but there are a couple of catches: Firstly, the NTSC dot crawl pattern will be different, because the variable PPU cycle at the beginning of the frame doesn't happen when rendering is off; Secondly, you don't get to use the MMC3 scanline counter to time the blanking area anymore, because it doesn't work when rendering is off. Sprite 0 hits are also not an option.

If you can deal with the slightly different appearance of the image (IIRC, Battletoads is like this, for example), and you have an alternate way to time the blanking area, then yeah, you can get quite a bit of extra vblank time.
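
For a sense of scale, the extra time can be estimated as below. The NTSC figures used (341 PPU dots per scanline, 3 dots per CPU cycle, 20 vblank scanlines) are standard values, not numbers from this thread, and the 16 blanked scanlines come from the question above. Treat the result as an upper bound, since some of that time goes to setting the scroll and timing the point where rendering is turned back on.

```python
from fractions import Fraction

DOTS_PER_SCANLINE = 341   # NTSC PPU dots per scanline
DOTS_PER_CPU_CYCLE = 3    # NTSC PPU dots per CPU cycle
VBLANK_SCANLINES = 20     # standard NTSC vblank length

def cpu_cycles(scanlines):
    """CPU cycles that elapse during the given number of scanlines."""
    return Fraction(scanlines * DOTS_PER_SCANLINE, DOTS_PER_CPU_CYCLE)

standard = cpu_cycles(VBLANK_SCANLINES)
extra = cpu_cycles(16)    # rendering kept off for 16 more scanlines
print(f"standard vblank: about {float(standard):.0f} cycles")
print(f"16 blanked scanlines add about {float(extra):.0f} cycles")
```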
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Re: Advice for artifact free 4-way scrolling

Post by psycopathicteen »

Are you using a zero page buffer?
bleubleu
Posts: 108
Joined: Wed Apr 04, 2018 7:29 pm
Location: Montreal, Canada

Re: Advice for artifact free 4-way scrolling

Post by bleubleu »

Bregalad wrote: You should use $2000.4 to your advantage when updating the nametable column; when updating an attribute table column this is more limited but you can still use this to your advantage knowing it will skip 3 rows, but you can still use 4 bulks of 2 bytes instead of 8 bulks of 1 byte.
Right now I split my stuff into 3 buffers which use different strides: 1, 8 and 32. The stride 1 and stride 32 buffers use the $2000 increment setting to avoid having to increment the address manually. The stride 8 one is for attributes and needs to be handled manually.

But you are right, I think I will try to avoid using generic buffers (which need loops/logic) and instead unroll the common update scenarios (like a full column, etc.) in order to minimize the update cost.
psycopathicteen wrote: Are you using a zero page buffer?
No. Right, that should save a few cycles. I will look into that.
tokumaru wrote: If you turn rendering off at the top of the screen, as opposed to using blank tiles like Jurassic Park does, you can indeed use that time to keep accessing VRAM, but there are a couple of catches
Thanks. I have a lot to learn...

-Mat
Kasumi
Posts: 1293
Joined: Wed Apr 02, 2008 2:09 pm

Re: Advice for artifact free 4-way scrolling

Post by Kasumi »

You can also use the stack instead of a zero page buffer. (If you're not.) Then you don't need to do iny or inx (if you are). Just pla, sta $2007 X times.
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Re: Advice for artifact free 4-way scrolling

Post by psycopathicteen »

Wouldn't you need rows of 33 tiles instead of 32?
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Advice for artifact free 4-way scrolling

Post by tokumaru »

Bregalad wrote:I'm fairly sure it should be possible to fit updates in VBlank, assuming you're talking about ONE row and ONE column of 8x8 tiles (not 16x16 metatiles).
You can actually fit a lot in vblank depending on how optimized your code is. My engine can do both a column and a row of metatiles (i.e. 132 tiles) plus their attributes, along with a sprite DMA. I use completely unrolled code (i.e. no index increments or branches, which saves a lot of time) to barely fit this all in standard vblank time, and other types of updates (palettes, patterns, etc.) can only be done when the scrolling isn't taking all the time, but that's OK, because no game will ever scroll diagonally at 16 pixels per frame every frame, so there are plenty of opportunities for other types of updates.
Kasumi wrote:You can also use the stack instead of a zero page buffer.
The stack is slower, though. That being said, I do find it a bit difficult to take advantage of ZP's faster load time. If you use indexing, the speed advantage is gone (it takes the same time as absolute indexed or PLA, which is 4 cycles), so you need unrolled code that loads from constant memory locations, but since 8-way scrolling means that rows and columns are nearly always split across 2 name tables, that's not trivial. It can be done, but you have to be clever.
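
To put rough numbers on that comparison, here is a small sketch that adds up standard 6502 cycle counts per transferred byte for a few copy strategies. It assumes no page-crossing penalties and ignores the final untaken branch of the loop; the 78-byte total is the worst case mentioned earlier in the thread, and the strategy names are just labels for this example.

```python
# Standard 6502 cycle counts (no page crossing assumed)
CYCLES = {
    "lda zp": 3,
    "lda zp,x": 4,
    "lda abs,x": 4,
    "pla": 4,
    "sta abs": 4,    # sta $2007
    "inx": 2,
    "bne taken": 3,
}

def per_byte(*ops):
    """Cycles spent per transferred byte for a sequence of instructions."""
    return sum(CYCLES[op] for op in ops)

STRATEGIES = {
    "indexed loop":            per_byte("lda abs,x", "sta abs", "inx", "bne taken"),
    "unrolled, stack (pla)":   per_byte("pla", "sta abs"),
    "unrolled, zp indexed":    per_byte("lda zp,x", "sta abs"),
    "unrolled, zp constant":   per_byte("lda zp", "sta abs"),
}

BYTES = 78  # worst-case byte count quoted earlier in the thread
for name, cost in STRATEGIES.items():
    print(f"{name:24s} {cost:2d} cycles/byte, {cost * BYTES:4d} cycles total")
```

Under these assumptions only the constant zero page load beats PLA, which matches the point that indexing cancels the zero page advantage.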
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: Advice for artifact free 4-way scrolling

Post by tepples »

tokumaru wrote:My engine can do both a column and a row of metatiles (i.e. 132 tiles) plus their attributes, along with a sprite DMA. I use completely unrolled code (i.e. no index increments or branches, which saves a lot of time) to barely fit this all in standard vblank time, and other types of updates (palettes, patterns, etc.) can only be done when the scrolling isn't taking all the time, but that's OK, because no game will ever scroll diagonally at 16 pixels per frame every frame
That infamous hill in Sonic the Hedgehog 2: Chemical Plant Zone act 2 is the exception that proves the rule.
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Advice for artifact free 4-way scrolling

Post by tokumaru »

It's a good thing I'm not particularly fond of Chemical Plant Zone, so I wouldn't want to design a level like it anyway. Still, full speed on both axes is way too fast, so if at least one of the axes is slightly slower than 16 pixels per frame, maybe 14 or so, there'll still be some opportunities for other types of updates.

Another thing that prevents this from being a huge problem is that when the screen is scrolling that fast, the lack of other updates is much harder for the human eye to notice, and if someone does notice, they'll slow down to look at it and things will immediately go back to normal, and there'll be nothing to see! :wink:
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Re: Advice for artifact free 4-way scrolling

Post by psycopathicteen »

If you're using an unrolled loop, how do you jump across name tables?
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Advice for artifact free 4-way scrolling

Post by tokumaru »

The unrolled loop has several entry points, which you select based on the number of tiles to transfer, and because it uses indexed addressing, the index can be manipulated so that the correct part of the buffer is read.
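
One way to picture this is the small Python model below. It is not the actual engine code, only a sketch under assumptions: the unrolled routine has 32 load/store slots, slot i reads the buffer at offset i - 32 + X, and entering at slot 32 - n transfers n tiles. All names (`split_row`, `run_unrolled`, `transfer_row`) are hypothetical.

```python
SLOTS = 32  # one unrolled load/store pair per tile of a row

def split_row(start_col):
    """Split a 32-tile row into (nametable, column, count) segments.

    Nametable 0 is the one holding start_col, 1 is its horizontal neighbour.
    """
    first = SLOTS - start_col
    segments = [(0, start_col, first)]
    if start_col:
        segments.append((1, 0, start_col))
    return segments

def run_unrolled(buf, entry_slot, x):
    """Model entering the unrolled code at entry_slot with index register x."""
    # Slot i behaves like "lda buffer-32+i,x / sta $2007".
    return [buf[i - SLOTS + x] for i in range(entry_slot, SLOTS)]

def transfer_row(buf, start_col):
    """Return the bytes written to each nametable segment, in order."""
    writes = []
    offset = 0
    for nametable, col, count in split_row(start_col):
        entry = SLOTS - count   # later entry point means fewer tiles
        x = offset + count      # index just past this segment's data
        writes.append((nametable, col, run_unrolled(buf, entry, x)))
        offset += count
    return writes

for nametable, col, data in transfer_row(list(range(32)), 20):
    print(f"nametable {nametable}, column {col}: {len(data)} tiles")
```

The same routine is entered twice, once per name table, and only the entry point and index change between the two calls.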
Sour
Posts: 890
Joined: Sun Feb 07, 2016 6:16 pm

Re: Advice for artifact free 4-way scrolling

Post by Sour »

bleubleu wrote:If mesen could do this, it would be awesome.
Not a bad idea, shouldn't be too hard to highlight tile/attribute modifications in the nametable viewer, I think - I'll add it to my list.
bleubleu
Posts: 108
Joined: Wed Apr 04, 2018 7:29 pm
Location: Montreal, Canada

Re: Advice for artifact free 4-way scrolling

Post by bleubleu »

All right guys.

Thanks to all your advice, I got my NMI running in under 1820 cycles at all times, even with crazy diagonal updates.

I unrolled all column loops, optimized the row (tile/attribute) updates, and moved some stuff to ZP, and everything works. My palette update loop wasn't unrolled, and wasn't on ZP... shame on me. :(

It even simplified the X scrolling algorithm a bit.

Thanks!

-Mat