AtariAge "CPU comparison"

Discussion of hardware and software development for Super NES and Super Famicom.

tepples
Posts: 22017
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: AtariAge "CPU comparison"

Post by tepples » Mon May 02, 2016 7:11 pm

Sik wrote:As for special effects: the Dragon Ball Z game uses [window] for vertical split screen (only one background plane for each side but hey it works!)
I wonder how hard it'd be to combine this with Team Player or 4-Way Play for a 4-quadrant split screen.

93143
Posts: 1194
Joined: Fri Jul 04, 2014 9:31 pm

Post by 93143 » Mon May 02, 2016 7:16 pm

HihiDanni wrote:I actually thought about making a game using Mode 5 once! I'm not sure if the background depth trade-off is worth it though, and presumably the sprites would remain at standard resolution.
Sprites can do interlace just fine; there's a separate PPU flag for that. Horizontal resolution is normal, though.

Sik
Posts: 1589
Joined: Thu Aug 12, 2010 3:43 am

Post by Sik » Mon May 02, 2016 7:17 pm

(EDIT: on the vertical split screen thing) Well, as far as I'm aware the SNES can do the equivalent thing and in an even more flexible way. Yuu Yuu Hakusho (the first one) does this.

Also just noticed this so before I forget:
Espozo wrote:But this would get rid of BG3, because there's not enough bandwidth. Then, you'd basically have the Genesis. I'd rather have an extra BG, even if only 2bpp, as looking on a TV that's displaying in the 4:3 aspect ratio, the difference between the two isn't that big.
Something like this would probably require a faster clock anyway, so that'd probably result in more bandwidth, assuming the memory is fast enough. This is what happens on the Mega Drive, anyway (which is why its 256px mode has fewer sprites and a slower transfer speed).

psycopathicteen
Posts: 2937
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen » Mon May 02, 2016 7:33 pm

@Stef
Are you trying to allocate sprites of different sizes without wasting any space? If that's so, then it looks like it would be more complicated with 16 different sprite sizes instead of 2.

Drew Sebastino
Formerly Espozo
Posts: 3503
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Post by Drew Sebastino » Mon May 02, 2016 8:49 pm

Sik wrote:Something like this would probably require a faster clock anyway, so that'd probably result in more bandwidth assuming the memory is fast enough.
I mean, I'm assuming we're talking about sacrificing BG3 for more bandwidth for BG1, BG2, and sprites, which would allow them to display at 320 pixels wide. I'd imagine that sacrificing BG3 would give you enough bandwidth for sprites to cover the whole screen horizontally, and BGs pretty much have to do that and would in this case, so in effect, you'd get how the Genesis distributes its VRAM bandwidth. SNES and Genesis seem about the same in this regard; the designers of the SNES valued BG layers more, while the designers of the Genesis valued horizontal resolution more.
psycopathicteen wrote:@Stef
Are you trying to allocate sprites of different sizes without wasting any space? If that's so, then it looks like it would be more complicated with 16 different sprite sizes instead of 2.
He's trying to go my crazy route? :lol: Well, at least the sprite tiles are in a straight line. It'd just be like seeing if a piece of whatever length would fit in this spot, and the possible lengths seem to be:

1 tile (1x1)
2 tiles (2x1, 1x2)
3 tiles (3x1, 1x3)
4 tiles (1x4, 4x1, 2x2)
6 tiles (3x2, 2x3)
8 tiles (4x2, 2x4)
9 tiles (3x3)
12 tiles (4x3, 3x4)
16 tiles (4x4)

So really, it's 9 sizes compared to 2, but I feel that the number is irrelevant at a certain point. I kind of wonder how you'd check to see if a sprite would fit, other than manually looking at each 8x8 tile to see if it's empty.
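
A minimal sketch of that idea, assuming sprite tiles are tracked as a flat list of used/free flags (the function name and the list layout are made up for illustration, not taken from any real engine). It enumerates the 16 shapes from 1x1 to 4x4, collapses them to the distinct tile counts listed above, and does a plain first-fit scan for a free run:

```python
from itertools import product

# All sprite shapes from 1x1 to 4x4 tiles, and the distinct tile counts they produce.
shapes = list(product(range(1, 5), repeat=2))
tile_counts = sorted({w * h for w, h in shapes})

def find_free_run(used, length):
    """Return the first index where `length` consecutive tiles are free, or None.

    `used` is a hypothetical per-tile occupancy list (True = tile in use).
    """
    run_start, run_len = 0, 0
    for i, taken in enumerate(used):
        if taken:
            # A used tile breaks the current run; restart just after it.
            run_start, run_len = i + 1, 0
        else:
            run_len += 1
            if run_len == length:
                return run_start
    return None
```

This is still a linear scan over every tile, so it only answers the "would it fit" question; it does nothing about the cost of asking it.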

tokumaru
Posts: 11766
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru » Mon May 02, 2016 9:27 pm

Espozo wrote:I kind of wonder how you'd check to see if a sprite would fit, other than manually looking at each 8x8 tile to see if it's empty.
There's got to be some sort of data structure to help with that...

psycopathicteen
Posts: 2937
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen » Mon May 02, 2016 9:58 pm

You know what? I've done the 32x32 and 16x16 slot searching thing for a while now, but I still haven't done any duplicate checking. So far the "super long table" method seems the most compatible with my engine (even though it sounds silly), except for having to go back and give every animation a starting index into the table. It's doable, but I want to save it for the next time I get sick with pneumonia.

Drew Sebastino
Formerly Espozo
Posts: 3503
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Post by Drew Sebastino » Mon May 02, 2016 10:08 pm

psycopathicteen wrote:So far the "super long table" method seems the most compatible with my engine
What's the "super long table" method? Just what we've been talking about? Yeah, there's next to no benefit to using 16x16 and 32x32 if you're not checking for duplicates, at least if a large portion of sprite VRAM is using this setup rather than holding static things.

psycopathicteen
Posts: 2937
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen » Mon May 02, 2016 11:23 pm

Didn't you once mention having a list of how many of every animation frame are onscreen, and where they are in vram, if there are any?
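
Such a list could look roughly like the sketch below. This is only a guess at the shape of the idea: the class name, the `allocate` callback, and the frame IDs are all hypothetical. Each animation frame gets a use count and a VRAM address, so a second sprite showing the same frame reuses the first one's tiles:

```python
class FrameTable:
    """Hypothetical table: how many on-screen sprites use a frame, and where it sits in VRAM."""

    def __init__(self):
        self.entries = {}  # frame_id -> [use_count, vram_address]

    def acquire(self, frame_id, allocate):
        """Reuse the frame if it is already loaded; otherwise call allocate() for a new address."""
        entry = self.entries.get(frame_id)
        if entry is None:
            entry = self.entries[frame_id] = [0, allocate()]
        entry[0] += 1
        return entry[1]

    def release(self, frame_id):
        """Drop one user; return the freed VRAM address once the last user is gone, else None."""
        entry = self.entries[frame_id]
        entry[0] -= 1
        if entry[0] == 0:
            del self.entries[frame_id]
            return entry[1]
        return None
```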

Stef
Posts: 256
Joined: Mon Jul 01, 2013 11:25 am

Post by Stef » Tue May 03, 2016 1:19 am

psycopathicteen wrote:@Stef
Are you trying to allocate sprites of different sizes without wasting any space? If that's so, then it looks like it would be more complicated with 16 different sprite sizes instead of 2.
Actually that's exactly what the current implementation of my sprite engine does... On each frame update it reallocates all hardware VDP sprites and reallocates VRAM if needed, searching whether the TileSet (the name I gave to the tile data structure for a sprite) is already present in VRAM. For that I'm using a "TileSet cache" framework. Each TileSet has its own size of course, as sprites can have many different sizes on MD... so yeah, that definitely consumes a lot of time scanning all TileSet cache entries for each sprite... that's why I said the resource allocation was done in a lazy way :-/

I have almost rewritten everything from scratch now, though, and I no longer check for duplicated TileSet entries as there is absolutely no point in doing that. Let me explain: in fact you have 2 possibilities:
- share the same TileSet sprite data between several sprites --> static/fixed allocation of VRAM for that TileSet, and the sprites point to it statically.
- dynamic and free TileSet sprite usage --> dynamic VRAM allocation without worrying about duplicated entries.

Even if you can save some VRAM by looking for duplicated entries during dynamic TileSet allocation, you have to consider that in the worst case you won't have any duplicated entries (every similar sprite is on a different animation frame), so you need enough VRAM to store each sprite's TileSet independently anyway... so don't even bother looking for duplicated TileSets to optimize VRAM usage here.

d4s
Posts: 92
Joined: Mon Jul 14, 2008 4:02 pm

Post by d4s » Tue May 03, 2016 2:40 am

Stef wrote: At each frame update it [...] re-allocate VRAM if needed
Potentially reallocating VRAM for each animation frame sounds really excessive.
Performance aside, with higher reallocation rates, fragmentation tends to become more of a problem.
Also, it sounds like producing lots of arbitrary allocation misses once VRAM space gets tight, negating any possible benefit of temporarily shared tiles.

My personal preference is to scan all frames of all animation files for any given object/character (e.g. the player, an enemy etc.)
at compile time and allocate the maximum size of each sprite size once at object instantiation time.
If no VRAM space is left, I postpone or abort instantiation.
Frames are optimized to deviate as little as possible from each other for each sprite size to prevent wasting space.
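
As a rough sketch of the compile-time scan (the data layout and names here are assumptions for illustration, not the actual tool), each frame is reduced to "tiles needed per sprite size" and the per-size maximum over all frames is what gets reserved once:

```python
def max_tiles_per_size(frames):
    """Return the per-size maximum tile count over all frames.

    `frames` is a hypothetical list of dicts, each mapping a sprite size name
    to the number of tiles of that size the frame uses.
    """
    budget = {}
    for frame in frames:
        for size, tiles in frame.items():
            budget[size] = max(budget.get(size, 0), tiles)
    return budget
```

Reserving the maximum means a frame change never needs a new allocation, at the cost of some unused space on the smaller frames.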

To mitigate fragmentation, I allocate "big" sprite tiles from the bottom of sprite VRAM and "small" sprite tiles backwards from the top of sprite VRAM.
I feel this is making the best out of the only-two-sprite-sizes-concurrently-limitation of the SNES.
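
The two-ended scheme can be sketched like this, assuming a simple bump allocator over a fixed number of tile slots (class and method names are hypothetical, and freeing is left out to keep it short):

```python
class TwoEndedVram:
    """Hypothetical allocator: "big" sprites grow up from slot 0, "small" ones down from the top."""

    def __init__(self, size):
        self.low = 0      # next free slot for "big" allocations
        self.high = size  # one past the last free slot for "small" allocations

    def alloc_big(self, tiles):
        if self.low + tiles > self.high:
            return None   # no room: caller postpones or aborts instantiation
        start = self.low
        self.low += tiles
        return start

    def alloc_small(self, tiles):
        if self.high - tiles < self.low:
            return None   # no room: caller postpones or aborts instantiation
        self.high -= tiles
        return self.high
```

The two regions only collide when sprite VRAM is actually full, so neither size can strand free space in the middle of the other's area.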

Apart from that, I agree that the twofold shareable static/individual dynamic allocation scheme really is the most sane and straightforward way to go.

Stef
Posts: 256
Joined: Mon Jul 01, 2013 11:25 am

Post by Stef » Tue May 03, 2016 3:07 am

d4s wrote: Potentially reallocating VRAM for each animation frame sounds really excessive.
Performance aside, with higher reallocation rates, fragmentation tends to become more of a problem.
Also, it sounds like producing lots of arbitrary allocation misses once VRAM space gets tight, negating any possible benefit of temporarily shared tiles.
Of course it is :p It's a really inefficient way of making things work, but it was simpler to implement :) I wanted to provide a simple-to-use sprite API so I started that way, but quickly realized it was too limited by its poor overall performance :-/
d4s wrote: My personal preference is to scan all frames of all animation files for any given object/character (e.g. the player, an enemy etc.)
at compile time and allocate the maximum size of each sprite size once at object instantiation time.
If no VRAM space is left, I postpone or abort instantiation.
Frames are optimized to deviate as little as possible from each other for each sprite size to prevent wasting space.
That is more or less what I'm doing now: I store the maximum number of hardware VDP sprites in use and the maximum TileSet size in the sprite definition object, then use that to allocate these resources only once, when you create / instantiate your sprite object.
d4s wrote: To mitigate fragmentation, I allocate "big" sprite tiles from the bottom of sprite VRAM and "small" sprite tiles backwards from the top of sprite VRAM.
I feel this is making the best out of the only-two-sprite-sizes-concurrently-limitation of the SNES.
Unfortunately that is a real issue on MD, as you can have many different sizes and end up with a lot of fragmentation in VRAM.
One solution is to avoid too much "live" sprite definition / allocation of different sizes, or to find an opportunity to release / reallocate everything at some point.
Last edited by Stef on Tue May 03, 2016 3:44 pm, edited 1 time in total.

Sik
Posts: 1589
Joined: Thu Aug 12, 2010 3:43 am

Post by Sik » Tue May 03, 2016 11:45 am

And most games just statically assign slots anyway. Seriously, you lot are the only ones obsessed with the idea =P Most games actually would have those graphics compressed in ROM which makes streaming not much of a feasible option in the first place.
Espozo wrote:I mean, I'm assuming we're talking about sacrificing BG3 for more bandwidth for BG1, BG2, and sprites, which would allow them to display at 320 pixels wide. I'd imagine that sacrificing BG3 would give you enough bandwidth for sprites to cover the whole screen horizontally, and BGs pretty much have to do that and would in this case, so in effect, you'd get how the Genesis distributes its VRAM bandwidth. SNES and Genesis seem about the same in this regard; the designers of the SNES valued BG layers more, while the designers of the Genesis valued horizontal resolution more.
Again you're missing the important detail here: you can't use the same clock speed because 320 is not a multiple of 256 (unless you want to use a really fast clock which wasn't really feasible). What this means is that in a 320px mode you'd need to use a slightly faster clock speed to get smaller pixels, and the faster speed also means more memory accesses per line (i.e. more bandwidth).
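
To put numbers on that, here is a small worked check. It assumes both modes have to fill the same active line time, which is a simplification; the variable names are just for illustration:

```python
from fractions import Fraction
from math import gcd

# Pixel clock ratio needed to fit 320 pixels into the time of 256.
ratio = Fraction(320, 256)

# Smallest shared clock from which both pixel rates come out by integer division.
common = 320 * 256 // gcd(320, 256)
div_320 = common // 320  # divider for the 320px pixel clock
div_256 = common // 256  # divider for the 256px pixel clock
```

The ratio is 5/4, so there is no integer divider that takes one pixel clock to the other. A shared clock would have to run at 5 times the 256px pixel rate (divide by 5 for 256px, by 4 for 320px), which is the "really fast clock" case.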

The Mega Drive has a completely different problem, which is the fact it has slower memory altogether (all accesses have to be in bursts of four consecutive bytes, and it only has enough time to read two bytes per pixel). To be fair they could have probably gotten room for a third background plane if they got rid of the free slots (much like the SNES does), although some of those were reserved for memory refresh so that may have been an issue =O)

tepples
Posts: 22017
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Tue May 03, 2016 1:33 pm

Sik wrote:The Mega Drive has a completely different problem, which is the fact it has slower memory altogether (all accesses have to be in bursts of four consecutive bytes, and it only has enough time to read two bytes per pixel).
All rendering accesses on the Super NES outside the mode 7 background occur in bursts of the two bytes that make up a word, also two bytes per pixel. So that's a wash.

Educated guess of the fetch pattern over the course of 16 pixels, at 2 pixels per 4-byte burst:
  1. BGA/window map (2 cells)
  2. BGB map (2 cells)
  3. BGA/window sliver for left tile of pair
  4. BGB sliver for left tile of pair
  5. sprite fetch?
  6. refresh?
  7. BGA/window sliver for right tile of pair
  8. BGB sliver for right tile of pair
Is that close?
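
A quick sanity check of the arithmetic in that guess. The slot names are just labels for the eight guessed bursts above, not a confirmed fetch order:

```python
BURST_BYTES = 4       # bytes per Mega Drive VRAM burst, as stated above
PIXELS_PER_BURST = 2  # pixels that pass during one burst, as stated above

# The eight guessed bursts, in order.
slots = [
    "BGA/window map", "BGB map",
    "BGA/window left sliver", "BGB left sliver",
    "sprite fetch?", "refresh?",
    "BGA/window right sliver", "BGB right sliver",
]

pixels_covered = len(slots) * PIXELS_PER_BURST
bytes_per_16px = len(slots) * BURST_BYTES
bytes_per_pixel = bytes_per_16px / pixels_covered
```

Eight bursts cover 16 pixels and move 32 bytes, which works out to the same two bytes per pixel as one word per pixel on the Super NES.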

psycopathicteen
Posts: 2937
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen » Tue May 03, 2016 1:53 pm

Sik wrote:And most games just statically assign slots anyway. Seriously, you lot are the only ones obsessed with the idea =P Most games actually would have those graphics compressed in ROM which makes streaming not much of a feasible option in the first place.
Yes, most people are perfectly fine with giving each enemy only 4 frames each.
