How well can Metal Slug backgrounds be recreated with tiles?

Discussion of development of software for any "obsolete" computer or video game system. See the WSdev wiki and ObscureDev wiki for more information on certain platforms.
93143
Posts: 1715
Joined: Fri Jul 04, 2014 9:31 pm

Re: How well can Metal Slug backgrounds be recreated with ti

Post by 93143 »

Espozo wrote:I thought I had shown in a picture that you can use BG3 for the middle layer in that scene in conjunction to a window layer, and you'd have enough color for everything but the rocky edge of the BG layer and a few highlights on the tree trunks that you could use a few 16x16 sprites for.
I don't remember that, but it's been a while. I guess that's sorted, then, except for the wooden structure. I wonder if that would be a good candidate for OBSEL switching?

Man, there's really almost nothing happening once that wooden structure comes down. Couple of tanks, couple of soldiers, couple of POWs, then the boss. Also water. Water everywhere.

I wonder how effective the half-overwrite trick would be. It's kinda like 2bpp, but with per-pixel palette granularity, and you can overlap palettes. The far waterfall in particular looks like it might be well set up for it, but it's hard to know without trying...

Why does this video have such slow water? I actually went and looked up footage of somebody playing real hardware, hoping we got it wrong, but the water is fast on real hardware. This honestly looks better in some ways...
lidnariq wrote:It's not clear whether that difference is big enough to matter.
If you're getting slowdown, it's almost certainly big enough to matter. And with Metal Slug you're probably getting slowdown. Whether it matters enough to justify the extra cartridge cost is another question. (With a modern MS port, I'd say yes, because the purpose is to test the limits of the console rather than to make money, and the market expectations of potential buyers, if any, are much more flexible. Even back in the day, shogi players were apparently willing to put up with a lot...)

For my game, the big reason for using fast SRAM would be to speed up my monster raster effect, as I said. Just the faster interrupt servicing and the elimination of the trampoline account for something like 15% more compute time for the main game loop. Whether I'll need it or not remains to be seen; most of the heavy lifting in my game is done by the GSU...
lidnariq wrote:decoding both $3xxx and $5xxx should have required another pin on the MAD-1
Would it be any simpler to map SRAM to $2000-$3FFF and just ignore $21xx? (I'm guessing not, unless the CPU simply doesn't put $21xx accesses on the A bus at all.) How expensive would a custom address translator be?
User avatar
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: How well can Metal Slug backgrounds be recreated with ti

Post by Drew Sebastino »

93143 wrote:Why does this video have such slow water? I actually went and looked up footage of somebody playing real hardware, hoping we got it wrong, but the water is fast on real hardware. This honestly looks better in some ways...
The Neo Geo video hardware has a custom feature where it can automatically increment the graphics address for each 16x16 sprite block every frame until it reaches a defined point, and then starts over. That's why all the water operates at 60fps (even if, artistically, it really doesn't need to...).
lidnariq
Posts: 11429
Joined: Sun Apr 13, 2008 11:12 am

Re: How well can Metal Slug backgrounds be recreated with ti

Post by lidnariq »

93143 wrote:Would it be any simpler to map SRAM to $2000-$3FFF and just ignore $21xx?
Nope. The only simple option is a 4K window, by decoding all of A12...A15.
(I'm guessing not, unless the CPU simply doesn't put $21xx accesses on the A bus at all.)
When the S-CPU is accessing $21xx, both the A bus and PA bus have the same bottom 8 bits. If a cart device mapped something into $21xx according to the A bus (instead of a PA bus) the distinction would only be visible when trying to DMA to/from it. (It wouldn't work very well)

Edit: you might be able to do some cleverness with decoding $2000-$3FFF and /PARD ? Might have a problem with race conditions, though.
How expensive would a custom address translator be?
Not bad. ROM-as-logic: SST39SF010 86¢/@26. GAL: ATF16V8 78¢/@26. GreenPak: 50¢/@100. Discretes: 74'1g86 + 74'138 21¢/@10 + 29¢@10
User avatar
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: How well can Metal Slug backgrounds be recreated with ti

Post by Señor Ventura »

93143 wrote:Sure, the Neo Geo had a 12 MHz 68000, and the game still had slowdown. But the game engine was not very efficient, and they had to rewrite it for Metal Slug X.
That is true, playing this game sometimes you have the sensation of it is executed by brute force.

Even in snes there are run and guns with heavy action, and almost without slowdowns. Starting from here surely a game seemed to the metal slug could be done with good animations (specially if it deactivates scanlines with the excuse of the image proportion).



All we know what that means that... better aspect ratio, and an additional extra bandwidth:

Image
Image
93143 wrote:Also, the SNES CPU is perhaps close to twice as efficient per clock as the 68000. A homebrew port would be heavily optimized; I'd still expect slowdown, but it might not be all that horrible.
As i understand the situation, with an metal slug the priority is the animation and the size of the sprites.

Is a pending debt with the snes. Look at this image:

Image


The SFA2 port has available more than 11KB per frame. Technically drawing two zangiefs with the same size and frame animations than the arcade is possible.

It could be some lack, for example if the two zangiefs lay to the ground at the same time, overloading the pixel fill rate, but there always are methods to solved that issues.
93143 wrote:Note that the original game is 30 fps. This means that it has about twice as much going on as a hypothetical 60 fps game would be able to manage with the same CPU load, and comparisons with existing SNES software (which is almost all 60 fps) should take this into account.
And here another good point, yes.

You only have to look how well the contra 3 moves being a "first wave" game of the console... now multiplie that perfomance by 2, and the result would be monstrous.
93143
Posts: 1715
Joined: Fri Jul 04, 2014 9:31 pm

Re: How well can Metal Slug backgrounds be recreated with ti

Post by 93143 »

Señor Ventura wrote:specially if it deactivates scanlines
Well, yes, that is the emergency fallback, but...
with the excuse of the image proportion
...that's not a good excuse. Remember, SNES pixels aren't square; the image stretches to fill (approximately) a standard 4:3 TV screen. Neo Geo pixels aren't quite square either, but they're closer (actually, the display area was 320 wide, but most games didn't use the full width because the output had an unusually short HBlank period and the sides of the image were generally offscreen, kinda like the pointless overscan on the NES but sideways). The upshot is that a full 256x224 SNES screen is the approximate equivalent of the standard slightly edge-cropped 304x224 Neo Geo screen, which is what Metal Slug uses. The ideal situation for a port would be to only have to scale the graphics horizontally, retaining the 224-line display height and cramming all VRAM updates into the standard 38-line VBlank.
As i understand the situation, with an metal slug the priority is the animation and the size of the sprites.
Yes, the DMA bandwidth is a potential issue. Still, with roughly 10 KB of tile data update capacity per "frame" (at 30 fps), you can do a lot more than with your typical 60 fps SNES game even without extending VBlank, and with prioritization of important information such as actor state changes, a bit of spillover now and then might not be the end of the world. If all else failed, you could take an extra frame to upload data before looping the game engine again, invoking slowdown for bandwidth reasons instead of CPU overload.
The SFA2 port has available more than 11KB per frame. Technically drawing two zangiefs with the same size and frame animations than the arcade is possible.
If I recall correctly, with predictive animation it should usually be possible to do that with ordinary VBlank by double buffering and spreading the load across multiple frames. (I'm not sure if it would be better in a fighting game to occasionally delay animations if both players input bandwidth-heavy commands simultaneously, or to implement a global one-frame input delay.) Metal Slug has so much graphical data all displayed at once that double buffering is probably a bad idea, but unlike SFA2 it's not the end of the world if some animations happen a bit out of sync.
User avatar
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: How well can Metal Slug backgrounds be recreated with ti

Post by Drew Sebastino »

All this talk about predictive animation and animation getting out of synch, but how jarring would it be to force VBlank down the screen for one frame, far enough to get all the graphics transfered? It couldn't look near as bad as the flickering tidal wave from Super Ghouls and Ghosts, at least...
93143
Posts: 1715
Joined: Fri Jul 04, 2014 9:31 pm

Re: How well can Metal Slug backgrounds be recreated with ti

Post by 93143 »

I don't know about that. It's a really obvious glitch, and as far as I'm aware it occurs nowhere in the commercial library (unlike, say, the cheese grater effect, which is acceptable because Konami did it first). I've only seen it in an early draft of psycopathicteen's Bad Apple video, and it looked pretty bad.

EDIT: Okay, maybe I was wrong. This video seems to suggest that it can happen even in Zelda 3. I honestly don't recall ever seeing it, but maybe that's because it's never big enough to make it past the bezel, which you really can't count on with Metal Slug...

If it were to happen only once every now and then, it might not be too noticeable, but with this game you've got a lot of objects that require continuous animation and stay on the screen for quite a while, so the effect would probably be almost continuous in spots. Much better to use real letterboxing if it comes to that.

In my mind, single-frame animation jitter would look better (you're going to need it anyway if you're spreading updates over multiple frames without double buffering), although I could be wrong and it should be tested at some point. Slowdown would surely be acceptable too, since players know what it is and are frankly expecting it. Furthermore, animation-induced slowdown wouldn't need to stack with processing-induced slowdown if the game were programmed right (ie: if the processing took too long and the animation engine was overloaded, you'd still only lose one frame), and I'd expect them to overlap a fair bit in real gameplay, so the impact could be minor.

On the flip side, extending VBlank situationally as you suggest might actually trigger slowdown, since it reduces your compute time for the frame, and if you're in a situation where you need it you could well be on the edge CPU-wise too. In that situation, you'd have been better off just delaying the next frame and taking one more VBlank to finish the updates. (To be fair, this is a bit of a corner case and doesn't prove much...)

...

It strikes me that predictive loading would be of limited utility in Metal Slug. If we aren't double buffering, any changes would become visible immediately, so all predictive loading would do is permit animation frames to be slid both forward and backward in time to take advantage of gaps in the DMA schedule. But you wouldn't want to load a cel early unless you knew that the next frame was going to be a doozy, and how would you know that before running the frame? I suppose you could simply project all current animations out to the next frame and see if they overload it, which would prevent all animations from running one frame early during sparse segments... But is the additional bandwidth leveling worth the CPU cost to do that? Especially considering that with 30 fps/no double buffering, half the animations are already running one (60 fps) frame early even without predictive loading?

It also occurs to me that it's probably acceptable to outright drop an intermediate animation frame, as long as it's already become obvious what the actor in question is doing. This could free up bandwidth for more important updates if things are really busy for multiple frames.

...

I sure do like typing...
User avatar
Señor Ventura
Posts: 233
Joined: Sat Aug 20, 2016 3:58 am

Re: How well can Metal Slug backgrounds be recreated with ti

Post by Señor Ventura »

93143 wrote:...that's not a good excuse. Remember, SNES pixels aren't square; the image stretches to fill (approximately) a standard 4:3 TV screen. Neo Geo pixels aren't quite square either, but they're closer (actually, the display area was 320 wide, but most games didn't use the full width because the output had an unusually short HBlank period and the sides of the image were generally offscreen, kinda like the pointless overscan on the NES but sideways). The upshot is that a full 256x224 SNES screen is the approximate equivalent of the standard slightly edge-cropped 304x224 Neo Geo screen, which is what Metal Slug uses. The ideal situation for a port would be to only have to scale the graphics horizontally, retaining the 224-line display height and cramming all VRAM updates into the standard 38-line VBlank.
You mean reaching the original sprite sizes, stretching the image?... but then we loss definition.

I think that at 256x224 the image gains nothing without deactivated scanlines when you have to redraw graphics.

Look at this images, not having those black borders is useless, and you loose bandwidth:
Image Image
93143 wrote:Yes, the DMA bandwidth is a potential issue. Still, with roughly 10 KB of tile data update capacity per "frame" (at 30 fps), you can do a lot more than with your typical 60 fps SNES game even without extending VBlank, and with prioritization of important information such as actor state changes, a bit of spillover now and then might not be the end of the world. If all else failed, you could take an extra frame to upload data before looping the game engine again, invoking slowdown for bandwidth reasons instead of CPU overload.
I see an opportunity of having more than 20KB of tile animations per frame, deactivating scanlines, and functioning at 30fps.

Honestly, with 20KB you will animate all the snes can draw (128 sprites, 32 per line), due to it never will put on screen all what neo geo can do.

640 tiles per frame is terrifying, and i bet all you want that only a very few times you will need to come closer to that cipher. It is easyest than loss time managing double buffers, or whatever.
93143 wrote:If I recall correctly, with predictive animation it should usually be possible to do that with ordinary VBlank by double buffering and spreading the load across multiple frames. (I'm not sure if it would be better in a fighting game to occasionally delay animations if both players input bandwidth-heavy commands simultaneously, or to implement a global one-frame input delay.) Metal Slug has so much graphical data all displayed at once that double buffering is probably a bad idea, but unlike SFA2 it's not the end of the world if some animations happen a bit out of sync.
How much it will costs to the cpu all that stuff about predictive animation, double buffers, etc?.

I'm still thinking that is better option the "brute force" option with 20KB per frame, and on top of that you save cpu (if you understand this expression xD).
93143
Posts: 1715
Joined: Fri Jul 04, 2014 9:31 pm

Re: How well can Metal Slug backgrounds be recreated with ti

Post by 93143 »

Señor Ventura wrote:You mean reach the original sprite sizes, stretching the image?... but then we loss definition.
I mean this:
MS_NG_full.png
MS_SNES_full.png
Exactly the same screen size, with all graphics (other than HUD elements) rescaled to correct for the pixel aspect ratio change.

It may not be possible. But it's a goal to shoot for. (Also, you'd want some high-quality rescaling tools to minimize manual rework. These sprite sheets are not small.)
I think that at 256x224 the image gains nothing without deactivated scanlines when you have to redraw graphics.
Your Final Fight example shows identical-sized graphics - no rescaling - with nothing at all of interest in the areas that got blacked out. (And it still looks nicer without the black bars when rescaled to 4:3.) Metal Slug is not like that. When five enemies with bazookas parachute in off the top of the screen, it's better to be able to see them right away.

But your earlier example was rescaled instead of just chopped. That way you can fit all the action in, but you absolutely do lose detail. Losing detail is inevitable because the SNES can't do 304-dot scanlines, and using 512-dot scanlines is impractical for this game. But keeping the detail loss to a minimum is (I think) a reasonable goal.

Letterboxing the screen is absolutely an option. A slight trim (8 pixels or so) might not even require graphics rescaling. But it does make the game less accurate and less impressive (and it makes the graphics more difficult to convert because if you have to rescale vertically too, you get attribute clash in both axes). If it's possible to do a good job without letterboxing, then it should be done without letterboxing.

The Street Fighter games on SNES suffered a fair bit of detail loss because they were redrawn for a shorter screen height. I'd rather not have to do that.
I see an opportunity of having more than 20KB of tile animations per frame, deactivating scanlines, and functioning at 30fps.
Okay, yes, but do we need that? Animation delays, animation frameskipping, animation-induced slowdown - all of these combined might be less obvious and intrusive than letterboxing, and result in a better experience. If they do, then we don't need letterboxing. If they don't, then we may have to trim the screen.
640 tiles per frame is terrifying, and i bet all you want that only a very few times you will need to come closer to that cipher.
The SNES can only address 16 KB of sprite data at a time, unless you change where it looks for sprite data partway down the screen, which is almost impossible to do intelligently except in very specific cases (mostly using sprites as a fake BG layer). Some of the animation in the game is 30 fps, but a lot of it isn't (and during the crazy sections where the most stuff is happening, the game tends to slow down anyway). It's entirely possible that 20 KB per frame is overkill for this game.
How much it will costs to the cpu all that stuff about predictive animation, double buffers, etc?.
I don't think either predictive animation or double buffering is appropriate for this game. Double buffering eats VRAM that could be used for more sprites and/or BG tiles, and predictive animation doesn't work properly without double buffering.

As for what I have suggested - delaying unimportant animation cels, skipping them if they get delayed too much (and if the previous one wasn't skipped), slowing down to take an extra VBlank if important cels don't fit - I don't think it would be all that onerous. You'd just have to check every metasprite cel to see if it's important (you could probably just figure it out beforehand and store an importance bit in ROM), upgrade to "important" if the previous cel was skipped, and do the DMA allocation accordingly. A dynamic VRAM engine already exists (written by psycopathicteen) that takes 20-30 scanlines per frame to allocate 4 KB of sprite DMA; I'm not sure if it spills unallocated data to the next frame or just ignores it...

Now, a re-port of Street Fighter Alpha 2 might make good use of predictive animation and double buffering. It shouldn't be prohibitive CPU-wise; the Haunted games on NES (programmed by tepples) use these techniques in the context of a side-scrolling beat-em-up, and it seems to me that the extra CPU load associated with these techniques should scale more with the number of metasprites being animated than with the amount of graphics data involved.
Post Reply