Max colour output

Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: Max colour output

Post by Drew Sebastino »

The SNES ones are the two on the bottom; the bottom left is with 8 channel HDMA, the one in the bottom right is without.

Fantastic job by the way; it looks closer to the 15bpp version for the most part. The shadow of the face is the only thing that looks noticeably poor. With the 8bpp version, I'd take the palette given and sacrifice some other color in order to have an extra shade of brown. It's a shame you can't really do that with the HDMA image.
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

Drew Sebastino wrote:With the 8bpp version, I'd take the palette given and sacrifice some other color in order to have an extra shade of brown. It's a shame you can't really do that with the HDMA image.
Yeah, it is ironic that the HDMA allows for a ton of little details and small colours to come out, meanwhile huge splotches of colour banding are left relatively unfettered, only improving through more headroom provided via the HDMA stuff. Across the rest of the Kodim examples, sky gradients and other face shots tend to suffer as well. The easiest solution is probably to just throw some simple dithering in there late in the process and call it a day, but for my own purposes, it's not a priority anyway - the next big target in my sights for this is to support 16c tiled output for mode 1/3 display, similar to what Khaz discussed in the thread that 93143 linked to, but with improved processing time and expanding usable color palettes through HDMA.
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

Drew Sebastino wrote:The SNES ones are the two on the bottom; the bottom left is with 8 channel HDMA, the one in the bottom right is without.

Fantastic job by the way; it looks closer to the 15bpp version for the most part. The shadow of the face is the only thing that looks noticeably poor. With the 8bpp version, I'd take the palette given and sacrifice some other color in order to have an extra shade of brown. It's a shame you can't really do that with the HDMA image.
So we are talking about more KB of VRAM than is dedicated to backgrounds. It has to use sprites too, right?
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Max colour output

Post by lidnariq »

CypherSignal wrote:the next big target in my sights for this is to support 16c tiled output for mode 1/3 display, similar to what Khaz discussed in the thread that 93143 linked to, but with improved processing time and expanding usable color palettes through HDMA.
One of my first idle thoughts in response to your first post was wondering how a 512x224 16-color image (i.e. the simpler problem of ignoring all palettes beyond the first) would work out.
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

lidnariq wrote:One of my first idle thoughts in response to your first post was wondering how a 512x224 16-color image (i.e. the simpler problem of ignoring all palettes beyond the first) would work out.
16-colours across the entire image? It's fairly unremarkable. Across the entire kodim set, there are zero to two colours added through any use of HDMA. e.g.

Image Image

The way it's set up right now, with only 16 buckets of colours, they will all likely cover a very large portion of the screen and have overlapping scanline coverage, and there will be few opportunities for a colour to be evicted, and few slots for colours to jump back in.
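
To illustrate why 16 buckets leave so little room for HDMA swaps, here is a minimal sketch. It is not the processor's actual algorithm; `assign_slots` and the scanline ranges are hypothetical. Each colour bucket is reduced to the first and last scanline it appears on, and a palette slot can only be handed to a new colour once its previous colour is no longer needed.

```python
def assign_slots(buckets, num_slots):
    """Greedy interval scheduling of colour buckets onto palette slots.

    buckets: list of (first_line, last_line) tuples, inclusive scanlines.
    Returns (placed, dropped): placed maps bucket index -> slot index,
    dropped lists the bucket indices that found no free slot.
    """
    free_at = [0] * num_slots  # first scanline at which each slot is free
    placed, dropped = {}, []
    # Visit buckets in the order they first appear on screen.
    order = sorted(range(len(buckets)), key=lambda i: buckets[i][0])
    for i in order:
        first, last = buckets[i]
        for slot in range(num_slots):
            if free_at[slot] <= first:
                placed[i] = slot
                free_at[slot] = last + 1  # slot is busy until this colour ends
                break
        else:
            dropped.append(i)  # every slot is still occupied
    return placed, dropped


# Hypothetical data: wide buckets overlap on most of the 224 lines,
# narrow buckets are small details with disjoint scanline ranges.
wide = [(0, 223), (0, 200), (10, 223), (5, 210)]
narrow = [(0, 20), (30, 60), (70, 100), (110, 223)]

print(assign_slots(wide, 2))
print(assign_slots(narrow, 2))
```

With the wide buckets, the slots fill up immediately and stay occupied, so later colours are dropped; with the narrow buckets, a single slot can be recycled down the screen. That matches the observation that small details gain from HDMA while full-screen colours do not.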
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: Max colour output

Post by Drew Sebastino »

@Señor Ventura What are you trying to say? Both SNES images take the same amount of memory in VRAM; 56KB excluding the tilemap. Backgrounds can use 1024 unique tiles, which for an 8bpp background, is the entire 64KB of VRAM.
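
The 56KB and 64KB figures follow from the tile size. A quick sketch of the arithmetic, assuming a 256x224 image built from 8x8 tiles (the helper names are made up for illustration):

```python
def tile_bytes(bpp):
    """Bytes in one 8x8 tile: 64 pixels at `bpp` bits each."""
    return 8 * 8 * bpp // 8


def image_tile_data_kib(width, height, bpp):
    """Tile data for an image with no repeated tiles, in KiB."""
    tiles = (width // 8) * (height // 8)
    return tiles * tile_bytes(bpp) / 1024


print(image_tile_data_kib(256, 224, 8))  # full-screen 8bpp image
print(1024 * tile_bytes(8) / 1024)       # all 1024 tiles of an 8bpp BG
```

The first line gives 56.0 (896 unique tiles of 64 bytes each) and the second gives 64.0.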

The problem I've seen with image palettization (?) programs is that they don't distribute color based off screen area well enough, or something like that. You'll see a small object get as many palette entries dedicated to it as an entire backdrop because it has more contrast. I'm not really sure how it works.
93143
Posts: 1718
Joined: Fri Jul 04, 2014 9:31 pm

Re: Max colour output

Post by 93143 »

CypherSignal wrote:
93143 wrote:Obviously it would be better to have a tool that simultaneously optimizes the image and schedules the HDMA, but that seems like a lot of work...
I actually got a pretty decent processor going with several days of work that does that harmonious quantization you're wondering about.
Well, this is certainly a nice surprise to wake up to...
CypherSignal wrote:If anyone is interested, I'm willing to post the source code up somewhere for perusal.
I'm interested. Though I'm pretty busy right now, so I probably won't get to it right away...
CypherSignal wrote:The easiest solution is probably to just throw some simple dithering in there late in the process and call it a day
You'd have to reference the 24-bit original, I assume.

I wonder if the way the SNES handles the video signal would cause issues with extensive checkerboard-type dither, as is sometimes seen in NES games; perhaps a more random style like Floyd-Steinberg might produce better results...
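
For reference, a minimal sketch of Floyd-Steinberg error diffusion, reduced to a single grey channel and evenly spaced output shades. Both are simplifying assumptions; a real converter would work against the actual palette and the high-colour original. The function name is hypothetical.

```python
def floyd_steinberg(pixels, levels):
    """Dither a 2D list of 0..255 grey values down to `levels` shades.

    Returns a new 2D list; the input is left untouched so the
    high-precision original stays available as the reference.
    """
    h, w = len(pixels), len(pixels[0])
    work = [[float(v) for v in row] for row in pixels]
    step = 255 / (levels - 1)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            # Snap to the nearest available shade.
            q = round(min(max(old, 0.0), 255.0) / step) * step
            out[y][x] = int(round(q))
            err = old - q
            # Push the error onto unvisited neighbours (7, 3, 5, 1 over 16).
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return out


flat = [[128] * 8 for _ in range(8)]
for row in floyd_steinberg(flat, 2):
    print("".join("#" if v else "." for v in row))
```

Unlike an ordered checkerboard, the pattern depends on the accumulated error, so it tends to be less regular.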
CypherSignal wrote:16-colours across the entire image? It's fairly unremarkable. Across the entire kodim set, there are zero to two colours added through any use of HDMA.
Hmm. I wonder if there's a way around this...

Aside from the full-screen 8bpp multichannel stuff, I've got a bunch of smaller high-colour images that need to look good in single-palette 4bpp with only one HDMA channel, and iterative static quantization is not an ideal solution when resources are that thin. I've done a small one-channel 4bpp image by hand/eye, and it works great (32 unique colours in a 28-line range, and looks lovely), but as one might imagine it is somewhat tedious, and I wasn't looking forward to doing a whole lot more of it...
Drew Sebastino wrote:The SNES ones are the two on the bottom
I suspect you're using a narrower screen than some of us. I see all four in a row.
lidnariq wrote:One of my first idle thoughts in response to your first post was wondering how a 512x224 16-color image (i.e. the simpler problem of ignoring all palettes beyond the first) would work out.
Although I haven't actually tried it, it certainly seems like you should be able to use ordinary DMA to change out a whole 4bpp palette each line. You might have to watch the timing, maybe even use a jitter reduction technique in the H-IRQ if you need to run code at the same time, but it should fit.

On the other hand, if you were to quantize each line separately it might look funny because the lines would be uncoupled... How did DreamGrafix handle this?
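
A rough back-of-envelope for the per-line palette swap, as a sketch only. It assumes 8 master cycles per DMA byte and 1364 master cycles per scanline, and ignores per-transfer setup overhead and IRQ latency; whether the transfer fits inside horizontal blanking depends on details this does not model.

```python
MASTER_CYCLES_PER_LINE = 1364  # assumed typical NTSC scanline length
DMA_CYCLES_PER_BYTE = 8        # assumed general-purpose DMA rate


def palette_dma_cost(colours, bytes_per_colour=2):
    """Return (bytes, master cycles) to rewrite `colours` CGRAM entries.

    Setup overhead per transfer is deliberately left out.
    """
    size = colours * bytes_per_colour
    return size, size * DMA_CYCLES_PER_BYTE


size, cycles = palette_dma_cost(16)
share = round(100 * cycles / MASTER_CYCLES_PER_LINE, 1)
print(size, cycles, share)
```

Under those assumptions a full 16-colour palette is 32 bytes and 256 master cycles, a bit under a fifth of the line, which is why the timing needs watching.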
Last edited by 93143 on Wed Sep 26, 2018 4:01 pm, edited 1 time in total.
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

As an aside, on a lark I also fed the Wii kids image you had through my processor, and got this:

Image

...which looks okay. It's similar to what your script generates - arguably worse because of some large gradients getting messed up (e.g. the wall on the left and the table in the background have more visible bands than your versions), but funnily enough, the final tally for # of colours ended up being a new record across anything I tested: 1265 (PSNR of 32.12 dB, fwiw)

...but that was so many colours that it identified a mild issue where, because I'm not doing any compression, the total binary data size of the map, clr, tile, and 8 channels of HDMA data being thrown at the assembler totals up to 65.1KB, so it can't fit in one bank anymore :lol: Thankfully an easy fix because it's easy enough to cap the number of buckets to a lower number, but I was a bit stunned because I never bothered running the math of, "what's the total binary size of everything maxed out?"
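
For anyone comparing numbers, PSNR is conventionally derived from the mean squared error against the original. A minimal sketch over flat lists of 8-bit channel values follows; whether the processor computes it per channel, or in which colour space, is not stated here, so treat this as the textbook definition rather than the tool's exact method.

```python
import math


def psnr(original, processed, peak=255):
    """PSNR in dB between two equal-length sequences of channel values."""
    if len(original) != len(processed) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(original, processed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


print(psnr([100, 100, 100, 100], [101, 99, 101, 99]))
```

Higher is better; an error of one step on every sample comes out at roughly 48 dB.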
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

Drew Sebastino wrote:@Señor Ventura What are you trying to say? Both SNES images take the same amount of memory in VRAM; 56KB excluding the tilemap. Backgrounds can use 1024 unique tiles, which for an 8bpp background, is the entire 64KB of VRAM.
But PPU1 doesn't give more than 45KB for backgrounds, so I don't get how it can use 64KB for a background :?:
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

Well, did the big optimization I wanted to do, so processing is back down to ~200ms per image at 256c/8hdma. There's still a couple other big hotspots of activity that could be tightened up, but I'm pretty okay with it as-is.

If anyone wants to look over the code or try it yourself, it's hosted at https://github.com/CypherSignal/background-processor (and the code you're probably interested in the most is over in https://github.com/CypherSignal/backgro ... rocess.cpp )

I've got some other things I have to take care of so I won't be getting back to it to work on Background Mode 1/2-style output for a while.
Señor Ventura wrote:But PPU1 doesn't give more than 45KB for backgrounds, so I don't get how it can use 64KB for a background :?:
Hmm, I'm not sure where you ever got 45KB from. The total RAM available for the PPU is 64KB in size, and all background, sprite, and tile data can be addressed in that space without any restriction.
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

CypherSignal wrote:Hmm, I'm not sure where you ever got 45KB from. The total RAM available for the PPU is 64KB in size, and all background, sprite, and tile data can be addressed in that space without any restriction.
I've read from some programmers that the SNES has a set amount of memory for sprites and a set amount for backgrounds, so the split is predetermined.
CypherSignal
Posts: 34
Joined: Sun Jul 22, 2018 2:36 pm

Re: Max colour output

Post by CypherSignal »

Señor Ventura wrote:I've read from some programmers that the SNES has a set amount of memory for sprites and a set amount for backgrounds, so the split is predetermined.
Ah. That information may pertain more towards game-specific memory budgets, then - if you were doing modifications to an existing game like Super Mario World, that would be useful information. Tile data for sprites and backgrounds have to share the same 64KB of space, and for all intents and purposes, a developer would not want to mix the memory used for sprites and bg's. So, a developer would have to make a choice at some point to declare that they want to use some portion of the available VRAM for sprites (or some types of sprites - player sprites may have more memory allocated than sprites for enemies, for example) and some portion for backgrounds.

However, the examples I'm talking about here are independent of any game, and are just focused on displaying a single image. In this case, the entire VRAM is available to work with, and there are no restrictions on how memory can be allocated one way or another.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Max colour output

Post by lidnariq »

CypherSignal wrote:16-colours across the entire image? It's fairly unremarkable. Across the entire kodim set, there are zero to two colours added through any use of HDMA. [...] The way it's set up right now, with only 16 buckets of colours, they will all likely cover a very large portion of the screen and have overlapping scanline coverage, and there will be few opportunities for a colour to be evicted, and few slots for colours to jump back in.
Hm, that's disappointing.
93143 wrote:On the other hand, if you were to quantize each line separately it might look funny because the lines would be uncoupled... How did DreamGrafix handle this?
The last time tomaitheous asked, I came up with this quick and dirty hack and he was rightly disappointed in it.

On the other hand, coming back to this after another 3 years I can see trivially how to do subpalette generation more easily (namely, don't use ppmquant, instead use pnmcolormap on the colorspace-reduced reference, and use pnmremap from the highcolor original, with floyd-steinberg dithering)

Comparing this:
[Attachment: rgb9si.png]
to the above linked attempt: functioning dithering makes horizontal banding much less obvious. (Simulated DAC here has 9bpp, just like the PC Engine).

And here's the same technique applied to kodim23:
[Image: kodim23]
(Image was simulated at 512x224, 15bpp, scaled vertically nearest-neighbor; scaled horizontally cubic)
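
As a side note on the 15bpp target: the SNES stores each colour as a 15-bit word with five bits per channel, blue in the high bits (0bbbbbgggggrrrrr). Below is a small sketch of converting to and from 24-bit RGB. Truncating the low three bits and expanding by bit replication are one common convention, assumed here; they are not necessarily what was used for the images above.

```python
def rgb24_to_snes(r, g, b):
    """Pack 8-bit R, G, B into a 15-bit SNES colour word by truncation."""
    return ((b >> 3) << 10) | ((g >> 3) << 5) | (r >> 3)


def snes_to_rgb24(word):
    """Expand a 15-bit SNES colour word back to 8-bit R, G, B."""
    def expand(c):
        # Replicate the top bits so that 31 maps to 255.
        return (c << 3) | (c >> 2)

    r5 = word & 31
    g5 = (word >> 5) & 31
    b5 = (word >> 10) & 31
    return expand(r5), expand(g5), expand(b5)


print(hex(rgb24_to_snes(255, 0, 0)))
print(snes_to_rgb24(0x7FFF))
```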
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: Max colour output

Post by Señor Ventura »

CypherSignal wrote:
Señor Ventura wrote:I've read from some programmers that the SNES has a set amount of memory for sprites and a set amount for backgrounds, so the split is predetermined.
Ah. That information may pertain more towards game-specific memory budgets, then - if you were doing modifications to an existing game like Super Mario World, that would be useful information. Tile data for sprites and backgrounds have to share the same 64KB of space, and for all intents and purposes, a developer would not want to mix the memory used for sprites and bg's. So, a developer would have to make a choice at some point to declare that they want to use some portion of the available VRAM for sprites (or some types of sprites - player sprites may have more memory allocated than sprites for enemies, for example) and some portion for backgrounds.

However, the examples I'm talking about here are independent of any game, and are just focused on displaying a single image. In this case, the entire VRAM is available to work with, and there are no restrictions on how memory can be allocated one way or another.
No, no, I mean that I've read from programmers that the SNES has a fixed, predetermined amount of memory for sprites and for backgrounds.

It was a demo of Gunstar Heroes for the SNES, and the programmer said that the game could be impossible on that machine because the original uses more memory for sprites than the SNES dedicates to them.

P.S.: The demo no longer seems to be on YouTube.


edit: Sorry for the off-topic; I only wanted to mention that.
93143
Posts: 1718
Joined: Fri Jul 04, 2014 9:31 pm

Re: Max colour output

Post by 93143 »

Señor Ventura wrote:No, no, I mean that I've read from programmers that the SNES has a fixed, predetermined amount of memory for sprites and for backgrounds.
It doesn't. You misunderstood, or else the person who said it was wrong.
Señor Ventura wrote:It was a demo of Gunstar Heroes for the SNES, and the programmer said that the game could be impossible on that machine because the original uses more memory for sprites than the SNES dedicates to them.
The SNES allows 16 KB for sprites at any one time. That's in two 8 KB chunks that can be anywhere in VRAM (in fact you can change where they are between frames or even between scanlines), and they can overlap BG data.

[Was this a Treasure programmer or just a random dude? I'd want to see an example of what they were talking about before I conclude that Gunstar Heroes on SNES was really impossible. 16 KB isn't exactly tiny (it's over half a screen of completely unique tiles, equivalent to 20 KB on a Mega Drive in H40 mode), and a lot of situations where you'd run out of room are amenable to workarounds.]

Each BG layer gets 1024 tiles (which is 16, 32, or 64 KB depending on bit depth) which are contiguous in VRAM and can be placed anywhere. Tileset regions for different layers can overlap one another, just like sprites can overlap BG data, and all of this can overlap tilemaps. There are no mutual exclusion restrictions in SNES VRAM mapping, and all of it is controlled by writable registers instead of hardwired.
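
A small sketch of those sizes and of the "regions may overlap" point. The byte addresses in the layout are made up for illustration, and real register granularity is not modelled.

```python
def tileset_bytes(tiles, bpp):
    """Bytes needed for `tiles` 8x8 tiles at `bpp` bits per pixel."""
    return tiles * 8 * 8 * bpp // 8


def overlaps(a, b):
    """True if two (start, length) byte ranges share any address."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]


# 1024 BG tiles at each bit depth, in KB.
bg_kib = {bpp: tileset_bytes(1024, bpp) // 1024 for bpp in (2, 4, 8)}
print(bg_kib)

# 512 sprite tiles at 4bpp, i.e. two 8 KB chunks.
print(tileset_bytes(512, 4) // 1024)

# Hypothetical layout: Mode 7 data in the bottom 32 KB, one sprite chunk
# placed inside that region and one placed above it.
mode7 = (0, 32 * 1024)
sprites_lo = (16 * 1024, 8 * 1024)
sprites_hi = (40 * 1024, 8 * 1024)
print(overlaps(mode7, sprites_lo), overlaps(mode7, sprites_hi))
```

An overlap here is not an error, only something to keep off screen, as the Mode 7 example below describes.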

The only exception to the no-hardwiring rule is Mode 7, where the interleaved graphics/map data is hardwired to the bottom of VRAM and cannot be moved. This is why it's tricky (though IMO not impossible) to do a 2-player F-Zero game - Super Mario Kart just places both players on the same map, but F-Zero moves too fast to just use a static map for the whole race without zooming in too far and making it stupidly chunky and wobbly. But even with Mode 7, nothing says you can't use part of the Mode 7 region for sprites - I do this in my shmup port because I'm using 40 KB of mixed sprite and Mode 1 tileset/tilemap data (in fact I have to switch sprite data locations partway down the screen, which is how I know it's possible) and Mode 7 covers the entire bottom 32 KB of VRAM. Just gotta make sure the part of the map you've repurposed doesn't show up on screen...
CypherSignal wrote:...which looks okay. [...] arguably worse
Some of the bigger nearly-uniform areas are pretty bad, but you really nailed that bottle of Mountain Dew.

I wonder if this is partly due to differences in the underlying quantization methods... I spent a while fiddling with Color quantizer to get a good result on that image. Among other things, it wanted to not bother with the bright yellow triangle...
CypherSignal wrote:If anyone wants to look over the code or try it yourself, it's hosted at https://github.com/CypherSignal/background-processor
Thanks. Like I said, I'm a little swamped, so it might be a while before I can look at it.
lidnariq wrote:On the other hand, coming back to this after another 3 years I can see trivially how to do subpalette generation more easily (namely, don't use ppmquant, instead use pnmcolormap on the colorspace-reduced reference, and use pnmremap from the highcolor original, with floyd-steinberg dithering)
I understood some of those words... I'm still very userspace on this topic, but I'm sure it would make more sense if I read up on those resources.

But it does seem to work okay. You can still see some artifacting in 15-bit, but it really looks nice for 4bpp. Once I get some free time I should see if I can get the H-IRQ/DMA method running on real hardware, if no one's gotten to it before me.