adding a skybox background would cost much more than doing a quick clear buffer loop.
Actually, I'm not sure this is true, except in 2bpp.
In 8bpp, or 4bpp if you abuse the dither feature, you can pull data from ROM at (nearly) the same rate the plotting hardware can send it to RAM (which is as fast as anything can send data to RAM, because if you're filling slivers it doesn't have to do any reading), and since they're on separate buses this can happen in parallel. Aside from linefeed, which for a simple 2D backdrop should be fairly lean, the only hiccup is the necessity of putting inc R14 in the loop, which if I'm understanding the timing correctly should make a 2D skybox about 20% slower than simple zero fill in the ideal limit, or up to 25% faster than SNES DMA.
Granted, this would work best with an uncompressed bitmap in ROM, which may not be how you want to roll. Using tiles would be slower, especially if they weren't horizontally aligned to the tile grid in the framebuffer... On the other hand, it might be reasonable to fill part of the backdrop with solid-coloured lines, which would be very comparable to zero-fill.
But like I said, unless you're using 8bpp there's really no point because the S-PPU can do a better skybox. In addition to the ability to use a different BG palette, or even multiple palettes, it's possible to use Mode 2 and get backdrop rotation almost for free. Depending on what you're doing, it may in fact be possible to use a Mode 7 backdrop, though you might have to watch your VRAM usage because of the way the map works, and care is necessary not to overload the sprite system because that's the only way to get a second layer in Mode 7...
How possible would it be to have 64 colors via a 4bpp layer and a 2bpp layer with color math (possibly even just varying shades of gray with color subtraction)? I think that would be a good compromise between the number of colors and performance, but I don't know if having two buffers like that will actually hurt performance even if it reduces the amount of data you have to transfer to vram.
Well, you'd have to draw everything twice, and since the Super FX can't change the screen base register by itself this restricts the size of the rendered window. Also, switching between 4bpp and 2bpp on the fly would confuse the plotting hardware, so you'd have to be rpixing all the time, and then there's the overhead for switching modes and repositioning for every line (or doing the higher-level render prep twice, which might actually be faster).
It's true that 8bpp mode does afford enough space for a full screen of 4bpp and a full screen of 2bpp. Not only would fill rate be appalling (16bpp), but the rendering technique would have to be really weird; you can't just leave 32-byte gaps in your DMA source...
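For what it's worth, here is a rough sketch of the colour-count side of the 4bpp + 2bpp idea, separate from the rendering cost. It models subtractive colour math on 15-bit BGR555 values, with each 5-bit channel clamped at zero. The palettes and the names sub_clamped, gray and count_blends are made up for illustration, and the sketch ignores the fact that index 0 of each layer is normally transparent. It suggests you only reach the full 16 x 4 = 64 distinct colours if the palettes are chosen so that nothing clamps or collides.

```python
def sub_clamped(main, sub):
    """Subtract two BGR555 colours per channel, clamping each channel at 0."""
    out = 0
    for shift in (0, 5, 10):          # red, green, blue fields
        a = (main >> shift) & 0x1F
        b = (sub >> shift) & 0x1F
        out |= max(a - b, 0) << shift
    return out


def gray(level):
    """Build a BGR555 gray with the same 5-bit level in every channel."""
    return level | (level << 5) | (level << 10)


def count_blends(main_palette, shade_palette):
    """Count the distinct colours produced by every main/shade pairing."""
    return len({sub_clamped(m, s) for m in main_palette for s in shade_palette})


# Hypothetical 16-colour palette for the 4bpp layer: bright enough that
# subtracting up to 6 per channel never clamps.
main_palette = [
    (8 + (i & 3) * 7) | ((8 + (i >> 2) * 7) << 5) | (16 << 10)
    for i in range(16)
]

# Hypothetical 4-entry palette for the 2bpp layer: shades of gray to subtract.
shade_palette = [gray(0), gray(2), gray(4), gray(6)]

# A dark gray ramp, to show how clamping and collisions eat into the count.
dark_palette = [gray(i) for i in range(16)]

print(count_blends(main_palette, shade_palette))
print(count_blends(dark_palette, shade_palette))
```

With these particular palettes the first count comes out at the full 64, while the dark gray ramp collapses to 16, so palette design matters as much as the layer setup.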
You could also forget about using plot and just render to bitplane format in software. I'm not sure this is a better idea, but we know it's possible in principle...
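To make "render to bitplane format in software" a bit more concrete, here is a minimal sketch of the chunky-to-planar conversion for one 8x8 tile in the SNES 4bpp layout: 32 bytes, with planes 0 and 1 interleaved row by row in the first 16 bytes, planes 2 and 3 in the next 16, and the leftmost pixel in bit 7. It's written in Python purely to show the data layout; the function name encode_tile_4bpp is my own, and a real Super FX routine would look nothing like this and would have to be far more careful about cycles.

```python
def encode_tile_4bpp(pixels):
    """Convert 8 rows of 8 colour indices (0..15) into a 32-byte 4bpp tile."""
    if len(pixels) != 8 or any(len(row) != 8 for row in pixels):
        raise ValueError("expected an 8x8 grid of colour indices")
    out = bytearray(32)
    for y, row in enumerate(pixels):
        for x, colour in enumerate(row):
            if not 0 <= colour <= 15:
                raise ValueError("colour index out of range for 4bpp")
            bit = 7 - x                      # leftmost pixel is the top bit
            for plane in range(4):
                if (colour >> plane) & 1:
                    # Planes 0/1 share the first 16 bytes, planes 2/3 the rest;
                    # within each half, a row takes two consecutive bytes.
                    offset = (plane // 2) * 16 + y * 2 + (plane & 1)
                    out[offset] |= 1 << bit
    return bytes(out)


# Usage: a mostly empty tile with two test pixels.
tile = [[0] * 8 for _ in range(8)]
tile[0][0] = 5      # top-left pixel, planes 0 and 2 set
tile[1][7] = 10     # right edge of row 1, planes 1 and 3 set
data = encode_tile_4bpp(tile)
print(data.hex())
```

The inner loop touches up to four bytes per pixel, which is roughly the read-modify-write work the plot hardware otherwise hides from you.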