Alternate application of SuperFX

Discussion of hardware and software development for Super NES and Super Famicom.

smkd
Posts: 101
Joined: Sun Apr 22, 2007 6:07 am

Post by smkd » Wed Oct 06, 2010 10:06 pm

I think it was TmEE who brought up that Galaxy Force was ported to the Genesis at some point. That's really surprising considering you'd need a coprocessor on the cart to get something that actually resembles the original arcade release. After checking it out, it looks about as lame as I expected.

I'm curious what other people here think about the scaled-sprite graphical style in the video. If Star Fox ran at 20 FPS using that style, with slightly fewer colors, would you find it nicer looking? The style might seem very dated to most, but I really like how the cave sections and fire stage turned out, to name a few. From a 1992/1993 marketing perspective the true 3D would likely be a lot more appealing, but consider this in the present day. What style do you find more appealing?

The 21 MHz SFX should be able to do a decent amount of drawing since cache/RAM/ROM all have buses that can run in parallel, right? The 4bpp mode may make things kind of ugly and nowhere near as colorful as Galaxy Force. 8bpp may eat into the frame rate or borders, but I guess it's all about how you use your colors.

tomaitheous
Posts: 592
Joined: Thu Aug 28, 2008 1:17 am

Post by tomaitheous » Thu Oct 07, 2010 12:21 am

Are you sure StarFox ran at 20fps??? IIRC, it was more like ~15fps with nothing onscreen, and at times felt like 10fps or less.

I'm not sure if the Sega scaler board for that game runs at 60fps or 45fps/30fps (which Space Harrier does IIRC). Either way, Galaxy Force looks waaaaaaaayyy smoother in motion and beautiful to boot. Ever play that game in the arcade? It's amazing. Some models of GF had a rotating chair version.

I'm sure you can find the specs for the scaler board used for that, but IIRC from a discussion with Charles MacDonald a few years back, it's just a glorified sprite blitter onto a frame buffer. There's tilt, and later boards added rotation to the frame buffer (but not the sprites). I always thought they looked awesome (way better than flat shaded polygons). Power Drift is another. Steel Gunner 1 and 2 are also similar 3D games (although machine gun games with fixed path/scrolling).

Oh, OutRunners (not OutRun) is also pretty awesome for a sprite scaler game.

mic_
Posts: 922
Joined: Thu Oct 05, 2006 6:29 am

Post by mic_ » Thu Oct 07, 2010 1:08 am

tomaitheous wrote:I'm sure you can find the specs for the scaler board used for that
Perhaps the MAME source code contains some info since it emulates the SEGA Y Board. Finding what you're looking for in their online source repo isn't always that easy though.

smkd
Posts: 101
Joined: Sun Apr 22, 2007 6:07 am

Post by smkd » Thu Oct 07, 2010 2:20 am

tomaitheous wrote:Are you sure StarFox ran at 20fps??? IIRC, it was more like ~15fps with nothing onscreen, and at times felt like 10fps or less.
Oh no, it ran terribly like you say. I'm just considering a more ideal situation: what if this hypothetical Galaxy Force-like game ran at 20 FPS, with the borders and such?
tomaitheous wrote:I'm not sure if the Sega scaler board for that game runs at 60fps or 45fps/30fps (which Space Harrier does IIRC). Either way, Galaxy Force looks waaaaaaaayyy smoother in motion and beautiful to boot. Ever play that game in the arcade? It's amazing. Some models of GF had a rotating chair version.
It's very smooth in an emulator at least. It looked like 60 FPS when I played it. Not happening with an SFX setup of course =(.
tomaitheous wrote:I'm sure you can find the specs for the scaler board used for that, but IIRC from a discussion with Charles MacDonald a few years back, it's just a glorified sprite blitter onto a frame buffer. There's tilt, and later boards added rotation to the frame buffer (but not the sprites). I always thought they looked awesome (way better than flat shaded polygons). Power Drift is another. Steel Gunner 1 and 2 are also similar 3D games (although machine gun games with fixed path/scrolling).
Yes, I've read something similar. A cached SFX routine with spaced-out reads/writes should be able to mimic this decently, but I haven't messed around that much with the chip recently. When I wanted to try, bsnes didn't support it and emulators apparently used whole instructions as opposed to clock cycles and it was way, way off. I don't think the cache was even emulated either.

Beyond the neat video setup, I know Sega added custom chips that did the same math the 68k would do in software, but with instantly readable results.

Dwedit
Posts: 4273
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit » Thu Oct 07, 2010 7:11 am

I don't think the Neo Geo or GBA could run a game like Galaxy Force 2, even though they have hardware sprite scaling. GBA would hit the scaled sprite limit very quickly.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!

tepples
Posts: 21880
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Thu Oct 07, 2010 7:43 am

The Neo Geo doesn't support real-time sprite rotation at all, so let's analyze the GBA. The GBA PPU has 1210 cycles per line to process sprite pixels. Each unscaled pixel costs one cycle, each scaled sprite costs a fixed 10 cycles, and each scaled pixel costs 2 cycles. So if you're using 32x32 pixel sprites, each sprite costs 10+32*2 = 74 cycles, and 16 will fit on a line. That's still 16*32/240 = 213% overdraw, about twice the roughly 100% overdraw that unscaled sprites get on Genesis and Super NES. The only time it'd overflow is when drawing those carrier-looking ships that use multiple layers for depth. (I can think of a few workarounds for that.) But the GBA OAM only holds 32 rotation matrices at once, so at some points on the screen you'll need to upload replacement matrices. For these, you'll want to enable HDMA to OAM, which reduces the sprite rendering time to 954 cycles, or 12 sprites, on any line where it is active.
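The per-line budget above is easy to recompute for other sprite sizes. This is only a sketch of the arithmetic using the cycle figures quoted in this post; the function and parameter names are made up for illustration and are not from any GBA library.

```python
# Cycle figures are the ones quoted in the post above; names are hypothetical.
def scaled_sprites_per_line(cycle_budget, width, setup=10, per_pixel=2):
    """How many scaled sprites of the given width fit in one line's cycle budget."""
    cost = setup + width * per_pixel  # 10 + 32*2 = 74 for a 32-pixel-wide sprite
    return cycle_budget // cost


def overdraw_percent(sprites, width, screen_width=240):
    """Sprite pixels per line as a percentage of the visible line width."""
    return 100 * sprites * width / screen_width


normal = scaled_sprites_per_line(1210, 32)  # full sprite rendering time
hdma = scaled_sprites_per_line(954, 32)     # reduced budget quoted for HDMA to OAM
print(normal, hdma, round(overdraw_percent(normal, 32)))
```

Swapping in 64 for the width shows how quickly larger scaled sprites eat the same budget.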

The DS's 3D core can be used as 1,500 sprites, each with its own matrix.

Back to the Super NES: The poster child for rotating and scaling quads on SNES is Yoshi's Island. Has anyone tried ripping out its sprite texturing code and benchmarking it?

byuu
Posts: 1544
Joined: Mon Mar 27, 2006 5:23 pm

Post by byuu » Thu Oct 07, 2010 11:12 am

The video really, really doesn't do the game justice.

Placed properly in the context of its time, that was probably the most amazing arcade machine I've ever played. You sit inside of this box where everything else is pitch black, have this gigantic freaking monitor, and the entire booth rotates as you move your ship. It was extremely immersive.
smkd wrote:emulators apparently used whole instructions as opposed to clock cycles and it was way, way off. I don't think the cache was even emulated either.
If you want to do something cool, I'd suggest trying to utilize the SFX2+SA-1 at the same time. Maybe even use MSU1 and treat the SPC700 as a math coprocessor :D

Other emulators have not improved at all since then, and no, they don't emulate any of the caches.

SuperFX has:
- 256-byte instruction cache with VERY complicated 16-byte row fetching
- instruction pipeline
- ROM buffer cache
- RAM buffer cache
- primary pixel cache
- secondary pixel cache
- high-speed multiplication mode for SFX1 only
- different memory AND cache speeds between SFX1/2
- ROM/RAM access control that can stall SFX at any point

And all of that stuff is parallel. When you plot pixels, it fills the primary pixel cache. Once that pixel cache is filled or once you swap to another tile, it flushes it all over to the secondary cache and starts over.

The part that even I don't emulate is something I've never understood. The secondary cache is supposed to run in parallel with the RAM buffer cache. Basically, when RAM is not being used, the secondary pixel cache can write to it. But when it is, the secondary cache stalls. And if that stalls out, it can eventually stall the primary cache as well.

But when I emulate it, it causes timing issues in all kinds of games. And what happens if you are executing a tight loop directly out of RAM? Does the secondary cache stall forever, or does it interleave operations with the RAM buffer cache? E.g., read one RAM buffer, write one secondary pixel, repeat. I have no idea.

There's also the matter of incomplete tiles flushing the secondary cache. Games definitely do this all the time, and you don't get black garble, so it seems necessary that the secondary cache also has to READ from RAM, making the whole thing even more complicated.
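The primary-to-secondary handoff described above can be pictured with a toy model. This is only a sketch and not how bsnes or the real chip works: it assumes the primary cache covers one 8-pixel tile row, it ignores stalls and timing completely, and the class and field names are invented for illustration. The `partial` flag marks the incomplete flushes that would force a read-modify-write against RAM.

```python
ROW_PIXELS = 8  # assumption: the primary cache covers one 8-pixel tile row


class PixelCacheModel:
    """Toy model of plot -> primary cache -> secondary cache. No timing or stalls."""

    def __init__(self):
        self.row = None    # (tile column, y) currently held by the primary cache
        self.pixels = {}   # x offset within the row -> color
        self.flushed = []  # rows handed to the secondary cache, in order

    def plot(self, x, y, color):
        row = (x // ROW_PIXELS, y)
        if self.row is not None and row != self.row:
            self.flush()  # plotting moved to another tile row
        self.row = row
        self.pixels[x % ROW_PIXELS] = color
        if len(self.pixels) == ROW_PIXELS:
            self.flush()  # primary cache is full

    def flush(self):
        if self.pixels:
            partial = len(self.pixels) < ROW_PIXELS
            self.flushed.append((self.row, dict(self.pixels), partial))
        self.row, self.pixels = None, {}


cache = PixelCacheModel()
for x in range(8):        # fills one whole row: flushed as a complete row
    cache.plot(x, 0, 5)
for x in range(8, 11):    # only three pixels of the next tile
    cache.plot(x, 0, 7)
cache.plot(16, 0, 1)      # moving on forces a partial flush of those three

for row, pixels, partial in cache.flushed:
    print(row, len(pixels), "partial" if partial else "full")
```

Anything timing-related (who wins when RAM is busy, how long a flush takes) is deliberately left out, since that is exactly the part nobody in this thread is sure about.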

I'd say other SFX emulators have completely useless timing, and I've got it about 90% right. But both are really pitiful. If you write SFX2 timing-sensitive code and want to ensure it works, you have no choice but to find a way to run it on real hardware.

smkd
Posts: 101
Joined: Sun Apr 22, 2007 6:07 am

Post by smkd » Fri Oct 08, 2010 5:13 am

About the Neo Geo: while it can't rotate sprites, it does have direct access to 128 MB of sprite tiles. The cons are that it can only shrink sprites, and the limit of 96 sprites per scanline will likely get in the way. Pre-rotated frames at full zoom (possibly in multiple sizes) would be extremely large, all things considered. Possibly prohibitively large.

Interesting info about the GBA there. Seems that a SuperFX would be better suited for this in the end, even with the limitations considered.
byuu wrote:I'd say other SFX emulators have completely useless timing, and I've got it about 90% right. But both are really pitiful. If you write SFX2 timing-sensitive code and want to ensure it works, you have no choice but to find a way to run it on real hardware.
That's too bad. Still haven't imported an SFC and I've been meaning to do that for a while now. But it's something to try out whenever I get the means to do it.

And I never had the chance to play in an actual machine >8(. Sega's old arcade machines looked real nice with the physical immersion.

byuu
Posts: 1544
Joined: Mon Mar 27, 2006 5:23 pm

Post by byuu » Fri Oct 08, 2010 7:25 am

An SFC alone isn't really enough to test SFX code, nor is a copier.

The SFX chip sits between the cartridge bus and the ROM/RAM chips.

To test SFX code, you will have to use a serial cable and stop-n-swap with an SFX game, and even then you will be limited to only using RAM to execute your code. Since RAM execution and the pixel caches fight each other, you'll get worse performance.

In order to execute your own code through the SFX ROM path, you would have to solder your own EPROMs onto an SFX game board.

smkd
Posts: 101
Joined: Sun Apr 22, 2007 6:07 am

Post by smkd » Fri Oct 08, 2010 9:03 am

Yeah, I guess that post was a bit misleading.

I know about that; it's just that I have no working system at all and I'd prefer an SFC above anything else (urrrghh, PAL). I've done the EPROM socketing/soldering on Neo Geo carts so it should be familiar territory. When I get my SFC and some SFX cart, I'll probably get into it. I believe there's a lot of messy rewiring involved though.

Sik
Posts: 1589
Joined: Thu Aug 12, 2010 3:43 am

Post by Sik » Fri Oct 15, 2010 5:22 pm

Not wanting to be mean, but Galaxy Force 2 is crap... The port of G-Loc looks much better and has a higher framerate as well.

psycopathicteen
Posts: 2915
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen » Sat Oct 16, 2010 10:06 am

I've attempted software rotation using the 65816, but the conversion of sprites from a large packed-pixel bitmap to planar 8x8 tiles takes a lot of complicated and confusing code.
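For reference, the conversion itself is small when written in a high-level language; the pain is doing it fast on the 65816. Below is a sketch in Python, not 65816 code, for a single 8x8 tile. It assumes 4bpp and assumes the pixels have already been split out to one color index (0-15) per pixel; the function name is made up for illustration.

```python
def packed_to_planar_4bpp(tile):
    """tile: 8 rows of 8 color indices (0-15).

    Returns 32 bytes in the SNES 4bpp tile layout: bitplanes 0/1 interleaved
    row by row in bytes 0-15, bitplanes 2/3 interleaved in bytes 16-31.
    """
    out = bytearray(32)
    for y, row in enumerate(tile):
        for plane in range(4):
            bits = 0
            for x, color in enumerate(row):
                if (color >> plane) & 1:
                    bits |= 0x80 >> x  # leftmost pixel lands in bit 7
            out[(plane // 2) * 16 + y * 2 + (plane & 1)] = bits
    return bytes(out)


# One pixel of color 15 at the top-left, one pixel of color 1 at the end of row 1.
tile = [[0] * 8 for _ in range(8)]
tile[0][0] = 15
tile[1][7] = 1
planar = packed_to_planar_4bpp(tile)
print(planar.hex())
```

A larger bitmap would just call this once per 8x8 cell; the inner bit-gathering loop is the part that turns into a lot of shifts and rotates in assembly.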

mic_
Posts: 922
Joined: Thu Oct 05, 2006 6:29 am

Post by mic_ » Sat Oct 16, 2010 11:07 am

I guess it depends on how you define "a lot". This is what I use as a post-processing stage when decompressing graphics data: http://pastebin.com/keiKUd5r

Sik wrote:The port of G-Loc looks much better
You mean the MD port of G-Loc VS the MD port of Galaxy Force II, right? Because the MD port of G-Loc obviously can't hold a candle against the arcade version of Galaxy Force II.

psycopathicteen
Posts: 2915
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen » Sat Oct 16, 2010 3:38 pm

mic_ wrote:I guess it depends on how you define "a lot". This is what I use as a post-processing stage when decompressing graphics data: http://pastebin.com/keiKUd5r

Sik wrote:The port of G-Loc looks much better
You mean the MD port of G-Loc VS the MD port of Galaxy Force II, right? Because the MD port of G-Loc obviously can't hold a candle against the arcade version of Galaxy Force II.
I have an idea. Set $2115 to #$88. Write the sprite to VRAM while rotating it pixel by pixel. Switch $2115 back to #$80. DMA the result back to the CPU side.

Sik
Posts: 1589
Joined: Thu Aug 12, 2010 3:43 am

Post by Sik » Sat Oct 16, 2010 7:26 pm

mic_ wrote:You mean the MD port of G-Loc VS the MD port of Galaxy Force II, right? Because the MD port of G-Loc obviously can't hold a candle against the arcade version of Galaxy Force II.
Yeah, MD vs. MD. The arcade version of G-Loc probably beats the arcade version of Galaxy Force anyways because of the 360 degree rotating seat (that's from where the game takes its name).
