The next step to improve looks, without calculating shadows at a higher res, would be a separated blur pass. Given each tile and its two neighbors vertically/horizontally, use a LUT to pick a new tile with a suitable pattern.
Quote: You actually plan to add combat and enemies to the game? I hope we won't have to fight lots and lots of small flying mammals that are hard to see and hard to actually hit.
Oh, neat idea! I'll add spiders too, and I'll make sure they'll have lots of HP.
Quote: If you want shadow casters, color math would still work and look better. You no longer need N versions of every bg tile, merely N grey level tiles for the color layer. I.e. you no longer edit the main tile layer, you edit the color/"shadow" layer.
Oh, it is using color math already. All the lighting is in a separate layer.
It is using mode1, both 4bpp layers are used for backgrounds, the 2bpp layer is used for lighting. I could probably switch things around and use one of the 4bpp layers instead, that would reduce the dithering quite a lot. This is something I am still thinking about, as I am redoing all the assets now anyways. The downside is that I really want to use both 4bpp layers for background, as I want to do some compositing stuff for the backgrounds and use the same tileset for both backgrounds, I don't know if that would work out well if one of them was 2bpp. Probably might work with carefully chosen palettes. I don't know if it'll help much with the looks of the lighting however. If the dithering goes away, it could even make it look worse because the blockiness becomes more articulated. I also considered switching to mode0 and do everything in 2bpp. I rarely use more than 6 or 7 colors in one tile anyways, so if I am a bit more careful about how I make the tile sheet, that could work too, especially considering that there are more options for compositing there. I am really kind of undecided about this. If anyone has some good ideas on this, I'd appreciate it very much.
Just as long as the protag isn't such a coward as to be so damaged by some palm-sized arachnid. :p
none wrote (Fri Sep 04, 2020 10:53 am): I also considered switching to mode0 and do everything in 2bpp. I rarely use more than 6 or 7 colors in one tile anyways, so if I am a bit more careful about how I make the tile sheet, that could work too, especially considering that there are more options for compositing there. I am really kind of undecided about this. If anyone has some good ideas on this, I'd appreciate it very much.
That's a good idea. Not that many SNES developers do mode 0 because they're too afraid that their artwork is going to suffer in 2BPP. I really would love to see a game that pushed layer usage, not just for parallax, but for things like shadowcasting and such.
How did I not notice this shadowcasting the first time I played this demo? I also didn't notice that you could grab certain pieces and place them in other spots.
You say you're using a single palette for the light layer because you can't spare the DMA to do both bytes, so you just write the bottom byte?
It should be possible to just write the top byte. That gives you palette selection and a choice of four different tiles, which for the whole layer gives you 24 colours plus transparency (assuming you're sticking with 2bpp).
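To spell out where the 24 comes from, here is a rough Python model of the high byte of a BG tilemap entry (bit layout vhopppcc: flips, priority, three palette bits, top two character bits). The function names are made up for illustration and nothing here is taken from the demo source.

```python
# Model of the high byte of an SNES BG tilemap entry: vhopppcc
#   v/h = flip bits, o = priority, ppp = palette, cc = tile number bits 9-8.
# Function names are hypothetical, for illustration only.

def high_byte(palette, tile_hi, priority=0, hflip=0, vflip=0):
    """Pack the fields into the tilemap high byte."""
    assert 0 <= palette < 8 and 0 <= tile_hi < 4
    return (vflip << 7) | (hflip << 6) | (priority << 5) | (palette << 2) | tile_hi

def solid_colours(palettes=8, colours_per_palette=3):
    """A 2bpp palette has 4 entries; entry 0 is transparent, leaving 3 usable."""
    return palettes * colours_per_palette

print(hex(high_byte(palette=5, tile_hi=2)))
print(solid_colours())
```

With the low byte left constant, the two cc bits pick one of four tiles spaced $100 apart in the tileset, which is where the "choice of four different tiles" comes from.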
You'd lose the gritty NES-like feel of the dither by just using solid tiles, but it might look better in some ways, particularly (IMO) if you clustered the shades of gray artistically instead of just Bresenhamming from white to black. Or you could just not use all 8 subpalettes, and go with 16 or 17 shades for more uniform spacing. Alternately, you could use 17 shades from white to black and have each subpalette overlap the next one (ie: the brightest shade in one palette is the darkest shade in the next), and use two solid tiles and two dither tiles to approximate 32 more or less evenly spaced light levels.
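One way the overlapping-palette idea could be laid out, as a sketch only: the ordering of the four tiles and the way shades are assigned to subpalettes are my assumptions, not something fixed by the description above.

```python
# Hypothetical layout: 17 grey shades s[0..16]; subpalette k holds
# s[2k], s[2k+1], s[2k+2] as colours 1..3 (colour 0 stays transparent).
# Four tiles per subpalette:
#   0: solid colour 1        1: 50% dither of colours 1 and 2
#   2: solid colour 2        3: 50% dither of colours 2 and 3

def level_to_entry(level):
    """Map a light level 0..31 to (subpalette, tile)."""
    assert 0 <= level < 32
    return level // 4, level % 4

def effective_shade(level):
    """Average position on the 17-shade ramp that the tile shows on screen."""
    palette, tile = level_to_entry(level)
    return 2 * palette + tile / 2

for level in (0, 1, 2, 3, 4, 31):
    print(level, level_to_entry(level), effective_shade(level))
```

Under these assumptions each level lands exactly half a shade after the previous one, so the 32 levels are evenly spaced as far as the ramp itself is.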
On the other hand, with a tileset that small you'd lose any semblance of directionality within a tile. (Which, to be honest, doesn't really come through in the actual game much anyway.) It could be somewhat restored (at least if you're still using dither to some degree) by using a partly baked tilemap with the bottom byte acting as a tileset selector for different regions of the screen, but this would require the lightmap to be fixed to the player position instead of scrolled with the foreground layer, and you may not want to do that. I suppose you could just use a subtle 45° dither pattern and flip the tile depending on where it is, but I don't know if that would look any better. If you had the bandwidth to update a number of rows and columns of the bottom byte each frame in a grid pattern, it might be possible to use a technique similar to map scrolling to update the bottom byte to ensure that the dither pattern is appropriate to the screen position...
Pardon me; I'm rambling. It's an interesting problem, and I got thinking.
Quote: It should be possible to just write the top byte. That gives you palette selection and a choice of four different tiles, which for the whole layer gives you 24 colours plus transparency.
That's a nice idea. I'd have to remap the current format somehow so that the right bits are set, don't know if that would be viable. Sounds expensive, but not too expensive, I'll have to look into it, it might work. The problem with this kind of solution is that at the moment, the buffer that contains the lighting info in RAM can also be used to compute the lighting info for the next tile, because I can use the full byte for data.
I'll try to expand a bit on this, excuse the wall of text... At the moment, the assembly for computing the lighting is more or less a long unrolled loop, arranged in such a way that the tiles that are nearer to the light source are always evaluated first. For a single tile it looks like this (for most of the tiles, some can use a shorter replacement):
Code:
lda a:light_origin + $0521 ; source tile
xba
lda a:light_origin + $0541 ; target tile
tax
lda f:vistable + $0000, x
sta a:light_origin + $0541 ; target tile
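For anyone following along without a 65816 reference at hand, this is how I read that step, modelled in Python. It assumes an 8-bit accumulator and 16-bit index registers, so that tax copies the full 16-bit B:A pair. The table contents are a dummy, since the real vistable is not shown here, and treating the $20 offset as one row of a 32-byte-wide buffer is also an assumption.

```python
# Python model of one unrolled step, as I read the listing.
# make_dummy_table() is a stand-in; the real vistable contents are not shown in the thread.

def make_dummy_table():
    """64 KB placeholder table: entry = min(source byte, target byte)."""
    return bytes(min(i >> 8, i & 0xFF) for i in range(0x10000))

def propagate(buf, src, dst, table):
    """Update buf[dst] from buf[src] with a single table lookup."""
    index = (buf[src] << 8) | buf[dst]   # lda src / xba / lda dst / tax
    buf[dst] = table[index]              # lda vistable,x / sta dst

buf = bytearray(0x0800)
buf[0x0521] = 0x30   # source tile
buf[0x0541] = 0x52   # target tile, $20 bytes further on
propagate(buf, 0x0521, 0x0541, make_dummy_table())
print(hex(buf[0x0541]))
```

The point is that the source byte becomes the high half of the table index and the target byte the low half, which is why the table needs the full 64 KB.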
With this solution, I cannot abuse the VRAM for a free table lookup (by having "duplicate" tiles). That would require a second buffer, and a second pass for creating that. Would probably take another few thousand cycles (for around 1000 tiles it is about 500 lookups and stores, it can probably be done with 16bit instructions with a 64k LUT), but might still be reasonable.
This step might be skippable if you don't care about the mirroring / priority bits (idk if I can force the lighting layer to the front regardless of the bit), but that would reduce the number of available final brightness levels because the "fake LUT" would need to be done via the available tile/palette choices, so nothing gained this way, I guess.
Another problem might be that it would need a full 1024 tileset with lots of wasted space in it, but that should be easily avoidable, for example by just avoiding using one of the tiles in another tileset. About the palette selection for this, it can only support 16 levels of brightness by design.
Quote: On the other hand, with a tileset that small you'd lose any semblance of directionality within a tile.
There is no real implementation of directionality atm. Doing something like this is too expensive I think, at least I cannot imagine a way it could be done reasonably fast.
Oh and thanks everyone for nice feedback.
You should be able to compile the demo rom from the sources, although the build process is still unnecessarily complicated, I will improve that later.
This is the state of the project as it was at the point the demo was built, with some missing documentation added. I will also improve the documentation later. I don't know how readable the whole thing is, although I did try to tidy it up as much as I can.
Let me know if you would like to build it but have any problems / it doesn't work.
The game does do a couple of signed right shifts: CMP 8000h, RCR repeated 5 times. There are better ways for that. The best might be this: http://forum.6502.org/viewtopic.php?p=5 ... d52c#p5246 ie. 5xLSR, CLC, ADC FC00h, EOR FC00h.
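A quick way to convince yourself that the LSR/ADC/EOR sequence really is a 16-bit arithmetic shift right by 5 is to brute-force it. The sketch below models the three steps in Python (function names are mine) and compares against an ordinary signed shift.

```python
def asr5(x):
    """16-bit arithmetic shift right by 5 via 5x LSR, CLC, ADC #$FC00, EOR #$FC00."""
    v = (x & 0xFFFF) >> 5        # five logical shifts; sign bit now sits at bit 10
    v = (v + 0xFC00) & 0xFFFF    # CLC / ADC, carry out of bit 15 is discarded
    return v ^ 0xFC00            # EOR restores or sets the top six bits

def reference(x):
    """Same shift done on the value interpreted as signed 16-bit."""
    signed = x - 0x10000 if x & 0x8000 else x
    return (signed >> 5) & 0xFFFF

print(hex(asr5(0x8000)), hex(asr5(0x7FFF)))
```

The constant FC00h is just the six bits vacated by the shift (bits 15..10), with the old sign bit now sitting at bit 10.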
Is BGMODE 0 with 2bpp really a good idea? You'd need to change 6 or 7 colors per tile to 3 colors per tile (but yes, you could use the extra layer to regain 6 colors, which isn't really much of an improvement) (or you could stick with 3 colors and add an extra scrolling layer with clouds/fog, though you already have the snow flakes for that).
When switching between different realities, it would be nice to have a different tile set for each reality (or at least different palette). Like showing the building with furniture & decorations, and the same building empty & in ruins.
Wondering a bit more about raising shadow resolution from 8x8 to 4x4, it would require 4x more calculations, but the SNES is quite fast, faster than old 8bit computers, and 4x4 resolution updates are still faster than bitmap graphics. Merging four brightness values into one tile isn't difficult: Just store the brightness values in the lower bits of the tile number (eg. if there are only 2 brightness levels per tile: Store the low bit of the four brightness values in bit0,1,2,3 of the tile number, and the coarse brightness in upper bits).
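As a small illustration of the packing for the 2-level case: which quadrant goes in which bit, and the function name, are arbitrary choices of mine.

```python
def tile_number(coarse, quads):
    """Combine a shared coarse brightness with four 1-bit fine values.

    quads: fine bits for the four 4x4 quadrants, assumed order
    (top-left, top-right, bottom-left, bottom-right) -> bits 0..3.
    """
    assert len(quads) == 4 and all(q in (0, 1) for q in quads)
    low = quads[0] | (quads[1] << 1) | (quads[2] << 2) | (quads[3] << 3)
    return (coarse << 4) | low

print(hex(tile_number(3, [1, 0, 1, 0])))
```

The tileset then needs 16 pre-drawn quadrant patterns per coarse level, so the tile number itself does the merge and no pixel data has to be touched at runtime.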
Quote: The game does do a couple of signed right shifts
Yeah, there are a few places that aren't really optimized well. I thought as long as I am still changing around stuff, there's no need to make the code any more complicated if it isn't really in a tight loop...
About the "dark world" stuff, those are great ideas. I will probably throw away the snowflakes or just use them in a few places, so a fog layer could be working well, at least in a few areas. Also a layer could be used for parallax backgrounds in other places.
The good thing about the mode 0 palettes is that I will have more options for combining different palettes. Atm, every bg palette is there twice (a bright and a dark version) so that I can have things in the background seem darker. With more smaller palettes, I could get away with doing that only for some of the palettes, which would maybe result in more colors I could use at the same time in total in the end. And for most of the tiles I make, I've noticed I can get away with 3 colors, if I am a bit more careful. For the few exceptions, compositing can still be used. Swapping out tilesets will become faster also.
It's been a while since I've last worked on this, so many things are a bit foggy to me... but from what I remember there's time for about 40000 or 50000 cycles per frame (excluding vblank), depending on what you are doing with the cpu. The lighting, as it is now, needs about 15000 or so iirc. This is just below the threshold of the budget I was willing to dedicate to that, because there still needs to be time left to do the actual game logic / scrolling updates etc. The more expensive stuff would work maybe if lighting updates would be done only every second frame, I'll see if there is time left when I have most of the other features finished.
You don't care about the flip bits if you're using solid tiles, and you don't care about the priority bit because the lighting layer should be the only thing on the subscreen.
It occurs to me that with 8 palettes and 16 light levels, you only actually need two unique tiles. This frees up a bit in the tilemap byte (you'd still have four tiles, but they'd be two identical pairs). So instead of your rrrrllll format (if I understand correctly), your final VRAM format would have to be rrrpppcr (or rrrppprc).
[r = distance from light source, l = raw light level, p = palette, c = character (ie: tile)]
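To make the rrrpppcr layout concrete, a possible pack/unpack pair is sketched below. Which of the four r bits ends up in bit 0 is my guess, and the function names are only for illustration.

```python
def pack(r, p, c):
    """r: 4-bit value, p: 3-bit palette, c: 1-bit tile select -> rrrpppcr byte."""
    assert 0 <= r < 16 and 0 <= p < 8 and c in (0, 1)
    return ((r >> 1) << 5) | (p << 2) | (c << 1) | (r & 1)

def unpack(b):
    """Inverse of pack(): returns (r, p, c)."""
    r = ((b >> 5) << 1) | (b & 1)
    return r, (b >> 2) & 7, (b >> 1) & 1

b = pack(r=0b1011, p=5, c=1)
print(format(b, "08b"), unpack(b))
```

Read as a tilemap high byte (vhopppcc), the top three r bits fall on the flip and priority bits and the last r bit falls on the low character bit, which is why the four tiles have to be two identical pairs.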
It appears to me that your lighting calculations are baked into a single 64 KB hash table (and the accompanying unrolled loop) that takes the rrrrllll data from the source tile and propagation parameters from the destination tile and calculates an rrrrllll value for the destination. And if I understand your "free lookup table" in VRAM correctly, the raw 4-bit light level cannot be recovered uniquely from the onscreen shade even knowing the distance value. (I may have the distance and light level bits mixed up, but bear with me.)
Question: Does the hash function actually need the original 4-bit lighting value, or could it make do with the final onscreen tile shade (given that it knows the distance value)? If the latter, perhaps the vistable could be rejiggered to use distance and final level rather than distance and raw level, thus combining both lookups into one.
If not - and I suspect not - I see no way other than a second lookup (or changing the method).
Have I got hold of the right end of this?
Quote: Another problem might be that it would need a full 1024 tileset with lots of wasted space in it
There's no rule that says different data regions can't overlap. You could store tilemap and sprite data in the unused area, or (as you suggested) tile data for a different layer. Using less VRAM to do the same job is never bad.
Quote: There is no real implementation of directionality atm.
I guess I got overenthusiastic and misunderstood. Never mind then; no need to solve that problem...
I had a comment here about smooth changes to tile shading, but I think it was based on a misunderstanding. I may take a closer look at the source and try to figure out if the general idea is still feasible.
Quote: You don't care about the flip bits if you're using solid tiles, and you don't care about the priority bit because the lighting layer should be the only thing on the subscreen.
Even if I still have dithering, the flip bits wouldn't matter that much because it doesn't matter which way the dithering pattern is oriented.
About the priority bit, I have never tried it out, but I always thought that if I don't have the priority bit set in the sub screen, but it is set in the main screen, that part of the main screen would show through? Wasn't that even a thing with some games?
About the distance from origin part I was inaccurate / oversimplifying before, really it means "amount of shadow already accumulated" in the source tile and "amount of shadow this tile should cast" in the target tile. Again excuse me, I did remember some things wrong about it when I said that because it's some time ago already. The other four bits are pre-baked brightness in the level, yes. I do need to keep the upper four bits intact, the lower 4 bits from the source tile can be thrown away for purposes of the calculation of the next tile.
So, it seems you are right, the "VRAM table lookup" thing is a bit overdone - it does give some variation, but it isn't really needed. My other option was to bake the result (combined high and low bits) into the lower four bits and mask off the high bits before copying to vram to fit everything into one row (which would still require the masking part - it needs to be done in a separate step because those higher four bits would be exactly those that cannot be destroyed for the shadow casting to work). So you are right, the second step is still needed - but it doesn't need to be another lookup, just masking off the bits, so your approach of arranging around those bits to places where they are not important in the high byte, can work.
Also, I've been playing around a bit with it now again, and it came to my mind that with my new assets and the better palette choices in mode 0, having the prebaked light in the levels isn't really that important anymore because I have some variation there just with the backgrounds themselves. It would still add a bit of visual flavor, but I can live without it. So if I scrap that part, I can do the entire calculation just with the "shadow" bits. That makes the high-byte / palette approach entirely viable. I can even add another bit to have 32 levels in total, like you suggested previously. And I can have the unused 3 bits set as needed (which would solve the priority bit problem, if it is the way I thought, and orient all the dithering patterns in the correct way).
That will perhaps work, thanks for sticking with this. I'll try it out tomorrow and post some screens if it works.
With the above solution (without the "light" part), combined with the "fake mask", maybe I can even scrap the LUT altogether and make the lighting loop faster, like this:
Code:
lda a:light_origin + $0521 ; source tile
clc
adc a:light_origin + $0541 ; target tile
inc ; in some of the tiles to decrease light level, was previously baked in table lookup
sta a:light_origin + $0541 ; target tile
Edit: oh wait, the inc instruction does set the N flag. If I set it up just so that the range of the 32 shadow values sits in 224..255, I might be able to just do bpl after the inc and even get an early-out out of the loop and fill the rest with black with a faster method. Which coincidentally happens to always set the priority bit.
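A tiny model of why the 224..255 range is convenient, assuming the byte is used directly as the tilemap high byte (vhopppcc). This only covers the inc and the flag test, not the adc, and the helper names are invented.

```python
def inc8(a):
    """8-bit INC: returns (result, N flag)."""
    r = (a + 1) & 0xFF
    return r, bool(r & 0x80)

def fields(b):
    """Split a tilemap high byte (vhopppcc) into its fields."""
    return {"v": b >> 7, "h": (b >> 6) & 1, "o": (b >> 5) & 1,
            "palette": (b >> 2) & 7, "tile_hi": b & 3}

print(inc8(0xFE), inc8(0xFF))
print(fields(0xE0), fields(0xFF))
```

Every value from $E0 to $FF has the flip and priority bits set and leaves exactly 5 bits (palette plus two character bits) for the 32 levels. Incrementing $FF wraps to $00 and clears N, which is the case a bpl would catch.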
Quote: It does occur to me that since the light level of the destination tile is an input into the hash table...
Sadly, the destination tile does not have the data from the old frame anymore because the whole buffer is reloaded from rom before the calculation starts. It could still work with double buffering and adding another lookup into the previous buffer.
No, I don't think so. Each layer (BG1/BG2/BG3/BG4/OBJ) can be sent independently to either the main screen or the sub screen, or both. The two screens are composited separately, and then mathed together. I'm pretty certain tile priority only affects the compositing step, and if there's only one layer on the subscreen there's nothing for that bit to do. So if you simply don't send the solid layers to the subscreen, it doesn't matter what their relative priority is; they will never show up over top of the lighting layer on the subscreen because they aren't on the subscreen at all.
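For the "mathed together" step, here is a minimal per-pixel sketch of subtractive colour math on 15-bit BGR values, clamped at zero per channel. Whether the demo actually uses subtract, add, or the half variants is not stated in this thread, so take the mode as an assumption. The function names are mine.

```python
def split(c):
    """Unpack a 15-bit colour 0bbbbbgggggrrrrr into (r, g, b)."""
    return c & 31, (c >> 5) & 31, (c >> 10) & 31

def join(r, g, b):
    """Pack (r, g, b), each 0..31, into a 15-bit colour."""
    return r | (g << 5) | (b << 10)

def subtract(main, sub):
    """Main screen pixel minus sub screen pixel, each channel clamped to 0."""
    return join(*(max(m - s, 0) for m, s in zip(split(main), split(sub))))

lit = subtract(join(20, 10, 5), join(8, 8, 8))
print(split(lit))
```

With a grey lighting layer on the subscreen, a darker light level is just a larger grey value being subtracted from whatever the main screen composited to.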
Quote: Scrapping the lookup would be very nice because it would free up the X register.
But what do you need the X register for? Clearly your method works as it stands...
Quote: Sadly, the destination tile does not have the data from the old frame anymore because the whole buffer is reloaded from rom before the calculation starts. It could still work with double buffering and adding another lookup into the previous buffer.
Yeah, I noticed that and removed my comments.
I have a bad habit of repeatedly editing my posts after posting them. With how long it takes me to compose a post, you'd think I'd be satisfied with it before hitting Submit...
You have a paragraph about that in the fullsnes docs:
Quote: Color Math occurs only if the front-most Main Screen pixel has math enabled (via 2131h.Bit0-5.), and only if the front-most Sub Screen pixel has same or higher (XXX or is it same or lower -- or is it ANY priority?) priority than the Main Screen pixel.
I also remember reading of games that use that behaviour.
Your docs have been an invaluable reference with coding this, anyways. Thanks very much for those.
I'll make some tests and let you know of the results.
Only thing that matters is the top-most main screen pixel, and top-most sub screen pixel
(but doesn't matter if the main screen is in front of sub screen, or vice versa).
Intuitively you would want to have the transparent layer in front of the other layer, but the color maths hardware doesn't seem to insist on that.
And, there are more "per-BG-layer" priority bits elsewhere anyways, the "per-tile" priority flag is only needed for fine-tuning.
EDIT: That is, only for "BG3 Priority in Mode 1" (in Port 2105h).