SNES Doom Source Released! Now What?

Discussion of hardware and software development for Super NES and Super Famicom.

93143
Posts: 1192
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 » Sun Jul 26, 2020 10:58 pm

Señor Ventura wrote:
Sun Jul 26, 2020 1:31 pm
-At 256x224 it seems unnecessary to use Mode 7 to scale the background to fit the screen, because the picture already occupies the whole screen, so the PPU1 multipliers are free for the CPU to use.
Doom doesn't use Mode 7 anyway. The mosaic trick is only necessary because the rendered area is displayed in Mode 3. Mode 7 has certain restrictions that make it suboptimal for this application.

Also, the more scanlines are used for active display, the fewer are left for DMA to VRAM. Using 256x224 only barely works without tearing because of the mosaic trick and the VRAM HDMA trick, and the fact that the actual rendered area is only 192 lines high. The mosaic trick restricts it to double-wide pixels, and the VRAM HDMA trick requires the programmer to take special measures to avoid problems with sprites.
I don't know how easy it is to make the program run at its own speed rather than at the clock speed (to improve the smoothness, not the speed of the action; I'm sure you know what I mean).
That should be pretty easy. IIRC the SNES is the only source of actual timing information, so as long as the code doesn't implicitly assume something about how fast certain tasks get done, it shouldn't much matter how fast the GSU is.

...

I'm not a big fan of overclocking the GSU to get better performance, not as part of a "Doom revisited" project. (Maybe as an "aftermarket" mod to get even better performance...) An overclocked GSU wouldn't have been available at the time, so at that point it's no longer something that could have been a real SNES game.

One possibly arguable exception is the clock trick I mentioned upthread, because it doesn't actually increase the compute speed, just the memory access speed. One could argue that what is essentially a FastROM[+FastRAM?] option for the GSU should have been made available at some point anyway; it's not like anything would have had to change except how long it waits for the memory buses to respond, and the clock trick just fools the GSU into acting exactly like a FastROM GSU would have. However, I consider this trick a last resort even for my shmup port, and for Doom I'd rather not use it.

Señor Ventura
Posts: 113
Joined: Sat Aug 20, 2016 3:58 am

Re: SNES Doom Source Released! Now What?

Post by Señor Ventura » Mon Jul 27, 2020 2:56 am

93143 wrote:
Sun Jul 26, 2020 10:58 pm
Doom doesn't use Mode 7 anyway. The mosaic trick is only necessary because the rendered area is displayed in Mode 3. Mode 7 has certain restrictions that make it suboptimal for this application.
I see... I thought Doom used Mode 7 because Wolfenstein 3D uses it to scale the picture from a resolution of about 100x80 up to fullscreen.

How is it scaled, then, if it is using BG1 in Mode 3?
93143 wrote:
Sun Jul 26, 2020 10:58 pm
Also, the more scanlines are used for active display, the fewer are left for DMA to VRAM. Using 256x224 only barely works because of the mosaic trick and the VRAM HDMA trick, and the fact that the actual rendered area is only 192 lines high. The mosaic trick restricts it to double-wide pixels, and the VRAM HDMA trick requires the programmer to take special measures to avoid problems with sprites.
Understood. You can run a Mode 7 game at fullscreen because it only needs HDMA to operate, but if you want to use those multipliers outside of Mode 7, you need to not spend all of the frame rendering, so that there is time to communicate with PPU1, right? I mean, it's not simply a question of DMA time, but of PPU1 time.
93143 wrote:
Sun Jul 26, 2020 10:58 pm
That should be pretty easy. IIRC the SNES is the only source of actual timing information, so as long as the code doesn't implicitly assume something about how fast certain tasks get done, it shouldn't much matter how fast the GSU is.
So the code actually does assume how long everything takes to process, so the more MHz, the more speed you get, but not more frames per second. Am I wrong?

I imagine that changing the timing that way (fps rather than speed) would mean changes scattered throughout the code, because it should involve various areas.
93143 wrote:
Sun Jul 26, 2020 10:58 pm
I'm not a big fan of overclocking the GSU to get better performance, not as part of a "Doom revisited" project. (Maybe as an "aftermarket" mod to get even better performance...) An overclocked GSU wouldn't have been available at the time, so at that point it's no longer something that could have been a real SNES game.
But we have the GBA's CPU in a shogi game xD

Just kidding... I understand and agree; that's why I sometimes ask myself what the most capable hardware the SNES could plausibly have had would be.

The SA-1 has its own 16-bit bus to communicate with its own RAM (that could have been the WRAM inside the SNES's hardware)... the Super FX as a PPU3 (a programmable CPU to manipulate tiles directly in VRAM; we would have rubbed our eyes)... faster WRAM and DMA...

But that's another debate; I'm just daydreaming xD
93143 wrote:
Sun Jul 26, 2020 10:58 pm
One possibly arguable exception is the clock trick I mentioned upthread, because it doesn't actually increase the compute speed, just the memory access speed. One could argue that what is essentially a FastROM[+FastRAM?] option for the GSU should have been made available at some point anyway; it's not like anything would have had to change except how long it waits for the memory buses to respond, and the clock trick just fools the GSU into acting exactly like a FastROM GSU would have. However, I consider this trick a last resort even for my shmup port, and for Doom I'd rather not use it.
That was what I meant before, compute speed vs. memory access... I supposed it doesn't involve just a small part of the code.

But the point is in making it work at the standard 21 MHz, I suppose (if not, there would never be a limit on what kind of CPU to use).

93143
Posts: 1192
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 » Mon Jul 27, 2020 8:01 pm

Señor Ventura wrote:
Mon Jul 27, 2020 2:56 am
How is it scaled, then, if it is using BG1 in Mode 3?
It's not scaled. Each pixel is actually drawn twice. The framebuffer is 216x144, and that's exactly what's displayed. Plus a 32-line high status bar that isn't drawn by the Super FX, for a total of 216x176.

Each frame, including VBlank, is 262 scanlines high, so with 176 lines of active display there remain 86 lines during which DMA to VRAM can happen uninterrupted, resulting in a theoretical transfer size per frame of nearly 14 KB. Some of that will be taken up by overhead and additional tasks like sprite and palette updates. At 8bpp, the 216x144 framebuffer is a bit over 30 KB, so it takes a total of 3 VBlanks to load a frame into VRAM. Without getting fancy, that essentially results in a cap of 20 fps (but rendering the frame takes so long that this cap is never reached).
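
For anyone who wants to check the arithmetic above, here is a rough Python sketch. It is not taken from the port's source. The per-scanline DMA throughput is an assumption (1364 master clocks per line, minus 40 for DRAM refresh, at 8 clocks per byte), the function names are made up for illustration, and overhead such as sprite and palette updates is ignored.

```python
import math

LINES_PER_FRAME = 262               # NTSC, including VBlank
BYTES_PER_LINE = (1364 - 40) / 8    # ASSUMED DMA throughput per scanline (165.5)
FIELD_RATE = 60                     # approximate NTSC field rate


def dma_budget(active_lines):
    """Bytes that can be pushed to VRAM per frame while rendering is off."""
    return (LINES_PER_FRAME - active_lines) * BYTES_PER_LINE


def frames_to_upload(width, height, bpp, active_lines):
    """Number of frames' worth of blanking needed to upload one framebuffer."""
    fb_bytes = width * height * bpp // 8
    return math.ceil(fb_bytes / dma_budget(active_lines))


budget = dma_budget(176)                      # 86 blank lines
fb_bytes = 216 * 144 * 8 // 8                 # 8bpp framebuffer
frames = frames_to_upload(216, 144, 8, 176)
print(f"{budget:.0f} bytes/frame, framebuffer {fb_bytes} bytes, "
      f"{frames} frames per upload, cap {FIELD_RATE / frames:.0f} fps")
```

Under those assumptions the budget comes out a little under 14 KB per frame, and three frames of blanking leave enough slack that modest overhead does not push the upload to a fourth frame.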
93143 wrote:
Sun Jul 26, 2020 10:58 pm
Also, the more scanlines are used for active display, the fewer are left for DMA to VRAM. Using 256x224 only barely works because of the mosaic trick and the VRAM HDMA trick, and the fact that the actual rendered area is only 192 lines high. The mosaic trick restricts it to double-wide pixels, and the VRAM HDMA trick requires the programmer to take special measures to avoid problems with sprites.
You can run a Mode 7 game at fullscreen because it only needs HDMA to operate
Blowing up a Mode 7 image to fullscreen does not require HDMA. It can be accomplished with a constant transform, which means you can set the matrix and forget about it.
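
As a hedged illustration of what a constant transform looks like, the sketch below computes the four matrix values for a plain magnification. It assumes the usual description of the Mode 7 matrix as signed 8.8 fixed-point values that map screen coordinates to layer coordinates (so magnifying by 2 means writing 0.5); the helper names are hypothetical, and the output is just the numbers you would write once to M7A-M7D.

```python
def mode7_fixed(value):
    """Convert a real number to 16-bit signed 8.8 fixed point (two's complement)."""
    raw = round(value * 256)
    if not -0x8000 <= raw <= 0x7FFF:
        raise ValueError("value out of range for 8.8 fixed point")
    return raw & 0xFFFF


def constant_scale_matrix(scale_x, scale_y):
    """Return (A, B, C, D) for a pure magnification with no rotation.

    The matrix maps screen coordinates to layer coordinates, so each
    diagonal element is the reciprocal of the on-screen magnification.
    """
    return (mode7_fixed(1 / scale_x), 0, 0, mode7_fixed(1 / scale_y))


# A Wolf3D-style 2x2 blow-up: every layer pixel covers 2x2 screen pixels.
a, b, c, d = constant_scale_matrix(2, 2)
print(f"M7A=${a:04X} M7B=${b:04X} M7C=${c:04X} M7D=${d:04X}")
```

Since nothing here depends on the scanline or the frame, the values only need to be written once.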

Wolfenstein 3D did not use fullscreen. Its display area was 224x192, not 256x224. The framebuffer was 112x80 and was blown up to 224x160.

Doom has a higher resolution than Wolf3D. The display area is only 216x176, but the framebuffer is 216x144 with 2x1 pixels, as opposed to Wolf3D's 224x160 with 2x2 pixels. If this were attempted with Mode 7, it would require a 108x144 framebuffer stretched to 216x144, and since 108x144 is 243 tiles out of the 256 that Mode 7 allows, it would be impossible to display without tearing unless you used the VRAM HDMA trick. Mode 3 does not have this problem because it allows a pool of 1024 tiles, making it easier to double buffer large images.
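
The tile counts can be checked with a few lines of Python. This is only a sketch: it counts tile indices against the pool sizes quoted above and says nothing about how many bytes of VRAM are actually free, and the function names are made up for illustration.

```python
import math

TILE = 8            # background tiles are 8x8 pixels
MODE7_POOL = 256    # tiles available in Mode 7
MODE3_POOL = 1024   # tiles addressable in Mode 3


def tiles_exact(width, height):
    """Tile count with partial tiles counted as fractions."""
    return (width / TILE) * (height / TILE)


def tiles_whole(width, height):
    """Tile count with width and height rounded up to whole tiles."""
    return math.ceil(width / TILE) * math.ceil(height / TILE)


mode7_exact = tiles_exact(108, 144)     # the 243 figure
mode7_whole = tiles_whole(108, 144)     # 108 is not a multiple of 8
mode3_whole = tiles_whole(216, 144)

print(mode7_exact, mode7_whole, 2 * mode7_whole <= MODE7_POOL)
print(mode3_whole, 2 * mode3_whole <= MODE3_POOL)
```

A single Mode 7 buffer nearly fills the 256-tile pool, so a second copy cannot sit alongside it, while two copies of the 216x144 Mode 3 image still fit within 1024 tile indices.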

Note that as far as I am aware, the VRAM HDMA trick was only demonstrated to work earlier this year, and would probably not have been used by any developer back in the '90s. It also causes the sprite layer to glitch out, so sprites have to be turned off in areas of the screen where VRAM HDMA is occurring. This is a significant problem for Mode 7, because you only get the one BG layer, so if you don't want to have to draw the gun with the Super FX, it has to be made entirely of sprites.

The mosaic trick is a method of stretching a non-Mode 7 framebuffer horizontally (kinda - the input format is a bit weird), so as to reduce by half the number of pixels that need to be drawn and transferred to VRAM. It's less edgy than the VRAM HDMA trick, but Randy didn't think of it back in the day, and I am aware of no one who did.

It turns out that if you combine the mosaic trick with the VRAM HDMA trick, you can just barely manage a 256x224 display with a 256x192 Mode 3 rendered window with 2x1 pixels, using a 128x192 framebuffer, at 20 fps (well, except that I doubt the SNES+GSU could run the game that fast - I just mean that DMA bandwidth and VRAM space are sufficient). Without VRAM HDMA, you'd be restricted to 12 fps at that resolution and display size if you wanted to be able to update sprites and change the palette, which you would. The existing Doom port uses neither of these tricks, and its display is pretty close to being as large as it could feasibly get without them.
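
Here is a rough sketch of where the 12 fps figure can come from. The assumptions are mine, not measured: about 165.5 bytes of DMA per blank scanline, and a full OAM (544 bytes) plus full palette (512 bytes) upload every frame as the overhead. It also prints how many extra bytes per frame would have to arrive by some other route to reach 20 fps; it does not model how VRAM HDMA delivers them.

```python
import math

LINES_PER_FRAME = 262               # NTSC
BYTES_PER_LINE = (1364 - 40) / 8    # ASSUMED DMA throughput per scanline
FIELD_RATE = 60
OAM_BYTES = 544                     # ASSUMED: full sprite table every frame
CGRAM_BYTES = 512                   # ASSUMED: full 256-colour palette every frame


def fps_cap(fb_bytes, active_lines, overhead_bytes):
    """Frame-rate cap when the framebuffer is uploaded using blanking-time DMA only."""
    per_frame = (LINES_PER_FRAME - active_lines) * BYTES_PER_LINE - overhead_bytes
    frames = math.ceil(fb_bytes / per_frame)
    return FIELD_RATE / frames


fb_bytes = 128 * 192                # 8bpp, one byte per framebuffer pixel
overhead = OAM_BYTES + CGRAM_BYTES

print(fps_cap(fb_bytes, 224, overhead))

# Bytes per frame needed for a 3-frame upload, versus what VBlank alone supplies.
needed = fb_bytes / 3 + overhead
vblank = (LINES_PER_FRAME - 224) * BYTES_PER_LINE
print(round(needed - vblank))
```

With a 224-line display the blanking period only covers about two thirds of what a 20 fps upload needs, which is the gap the VRAM HDMA trick is meant to close.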

In Mode 7, the fact that you have a fixed-location pool of only 256 tiles and a fixed-location tilemap of 128x128 tiles means that in the context of a full-screen 256x224 display, the SNES can barely manage 128x96 without tearing, even using the VRAM HDMA trick. The mosaic trick is better than Mode 7 for this application.
but if you want to use those multipliers outside of Mode 7, you need to not spend all of the frame rendering, so that there is time to communicate with PPU1, right? I mean, it's not simply a question of DMA time, but of PPU1 time.
If Mode 7 is not operational, the PPU doesn't care what the CPU does with the multiplier. It can keep rendering the picture in Mode 0-6, and the CPU can use the multiplier in parallel. There's no need to force blank.

If you are using Mode 7, you don't letterbox the screen to get more time to use the multiplier. That's silly. At most, you'd have a substantial chunk of the screen that was guaranteed to not be Mode 7, so you could do your calculations during those scanlines, but you'd have to make sure that all calculations using the PPU multiplier would actually fit in the available time.

The only good reason to letterbox the screen is to get more DMA bandwidth. If rendering is turned off, you can write to VRAM. The longer the time between turning off rendering at the bottom of the picture and turning it on again at the top of the picture the next time around, the more data can be pushed through to VRAM before the PPU needs exclusive access again.
93143 wrote:
Sun Jul 26, 2020 10:58 pm
That should be pretty easy. IIRC the SNES is the only source of actual timing information, so as long as the code doesn't implicitly assume something about how fast certain tasks get done, it shouldn't much matter how fast the GSU is.
So the code actually does assume how long everything takes to process, so the more MHz, the more speed you get, but not more frames per second. Am I wrong?
What I meant was that since the GSU doesn't have timing information available to it, the SNES can keep track of the frame pace independent of it, and if necessary tell it how much time is passing. Speeding up the GSU should increase the frame rate but maintain the correct game speed. I don't know if SNES Doom was actually written this way, but it seems like it was - emulators that run the Super FX too fast seem to work as desired, with higher frame rate but correct speed, but I haven't done a rigorous A/B comparison...

The way the original Doom engine on PC worked was that after rendering a frame, it would check how much real-world time had passed since the last game world status update, and run that much time in-game before rendering the next frame. This would result in a consistent game speed regardless of the achievable frame rate on any specific PC (and the achievable frame rate varied wildly). If SNES Doom was written that way, it shouldn't matter what the GSU clock is unless some specific CPU/GSU interface function happens to assume something about the timing. Which, as I said, is possible, but a hack or re-port could remove any such assumptions if they exist.
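
A minimal sketch of that kind of loop, in Python rather than anything resembling the real source. The 35 Hz tic rate is PC Doom's; everything else (the function name, feeding in frame durations as a list) is made up for illustration, and the real engine's tic handling is more involved. Integer milliseconds are used so that the comparison between a slow and a fast renderer is exact.

```python
TICRATE = 35  # PC Doom advances game logic 35 times per second


def tics_after(frame_durations_ms):
    """Simulate render-then-catch-up: return how many game tics have run.

    Each entry is how long one frame took to render, in milliseconds.
    After every frame, the world is advanced by however many tics of
    real time have elapsed since the last update.
    """
    clock_ms = 0
    tics_run = 0
    for frame_ms in frame_durations_ms:
        clock_ms += frame_ms                  # rendering took this long
        tics_due = clock_ms * TICRATE // 1000
        while tics_run < tics_due:
            tics_run += 1                     # stand-in for one world update
    return tics_run


slow = tics_after([100] * 100)   # 10 fps for 10 seconds
fast = tics_after([50] * 200)    # 20 fps for 10 seconds
print(slow, fast)                # same amount of game time either way
```

On the SNES, the role of the clock would presumably be played by the CPU counting VBlanks and passing that count to the GSU, which is the arrangement described above.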

If any such thing does exist, it might not be a systematic thing like in Star Fox, where overclocking the GSU does actually speed up the game... it could very well just work as normal up to some limiting overclock, and then cause a crash or freakout.

Also, what do you mean "not frames per second"? If you overclock, you will get higher fps regardless of how the engine is written, as long as it's not already running as fast as the SNES can accept the data, which it isn't. The only question is whether the game speed increases together with the frame rate. In Star Fox, it does. In Doom, it probably doesn't (I'm not sure), and in a re-port it would be easy to ensure that it did not.
93143 wrote:
Sun Jul 26, 2020 10:58 pm
One possibly arguable exception is the clock trick I mentioned upthread, because it doesn't actually increase the compute speed, just the memory access speed. One could argue that what is essentially a FastROM[+FastRAM?] option for the GSU should have been made available at some point anyway; it's not like anything would have had to change except how long it waits for the memory buses to respond, and the clock trick just fools the GSU into acting exactly like a FastROM GSU would have. However, I consider this trick a last resort even for my shmup port, and for Doom I'd rather not use it.
That was what I meant before, compute speed vs. memory access...
Are you sure? Because I don't think we were discussing this specific thing before. I'm talking about a particular overclock/underclock trick that could improve a particular bottleneck in how the Super FX works internally, while still running the chip at 21 MHz.

The Super FX runs at 10.7 MHz or 21.4 MHz internally based on whether high-speed mode is selected. However, memory accesses (Game Pak ROM reads and Game Pak RAM reads/writes - nothing to do with DMA to SNES VRAM) are slower and can cause bottlenecks. In low-speed mode, the chip accesses memory at 3 cycles per byte. In high-speed mode, it accesses memory at 5 cycles per byte. Both of these values are consistent with 200 ns memory response time, which is Nintendo's SlowROM spec.

My idea was to use a 42.8 MHz oscillator, but leave the Super FX in slow mode. That way, it would think it was running at 10.7 MHz, and would thus use 3 cycles for memory access, but it would actually be running at 21.4 MHz. 3 cycles at 21.4 MHz will work fine with 120 ns memory response time, which is Nintendo's FastROM spec. Just pay for more expensive 120 ns ROM and RAM, and everything should be within spec, or so it seems to me (I have no hardware design experience).
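
The timing argument is easy to check numerically. This sketch only uses the figures given above (cycle counts, clock rates, and the 200 ns / 120 ns memory specs); the function name is made up, and it says nothing about whether real hardware would tolerate the change.

```python
SLOWROM_NS = 200   # memory response time quoted for SlowROM
FASTROM_NS = 120   # memory response time quoted for FastROM


def access_ns(clock_mhz, cycles):
    """Time the chip waits for one memory access, in nanoseconds."""
    return cycles / clock_mhz * 1000.0


cases = [
    ("slow mode, 10.7 MHz, 3 cycles", access_ns(10.7, 3), SLOWROM_NS),
    ("fast mode, 21.4 MHz, 5 cycles", access_ns(21.4, 5), SLOWROM_NS),
    ("clock trick, 21.4 MHz, 3 cycles", access_ns(21.4, 3), FASTROM_NS),
]

for label, wait, spec in cases:
    verdict = "ok" if wait >= spec else "too fast"
    print(f"{label}: {wait:.0f} ns wait vs {spec} ns memory -> {verdict}")
```

The third case is the interesting one: the wait drops well below 200 ns, so SlowROM parts would not keep up, but it still leaves some margin over 120 ns.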

It's almost the sort of thing I think a clever developer could have convinced Nintendo to let them do, and it's absolutely the sort of thing that Argonaut should have built an equivalent of into the chip, because there was no technical reason not to. But I still don't really like it for Doom, because it smells a bit like cheating... and anyway, most memory accesses (other than RAM reads) are buffered, and there's an instruction cache, so you can run code at full speed while the memory is being accessed, and if your code takes longer than the memory access, there's no advantage to speeding up the latter. Rendering pixels in Doom is just complicated enough that reducing memory access time may not speed things up much (at least once the renderers are rewritten to not PLOT directly in vertical columns, which is almost certainly heavily bottlenecked by memory access because of the way the Super FX handles SNES CHR format).
But the point is in making it work at the standard 21 MHz, I suppose (if not, there would never be a limit on what kind of CPU to use).
Exactly.

Nikku4211
Posts: 66
Joined: Sun Dec 15, 2019 1:28 pm
Location: Bronx, New York

Re: SNES Doom Source Released! Now What?

Post by Nikku4211 » Mon Jul 27, 2020 9:16 pm

93143 wrote:
Mon Jul 27, 2020 8:01 pm
Note that as far as I am aware, the VRAM HDMA trick was only demonstrated to work earlier this year, and would probably not have been used by any developer back in the '90s. It also causes the sprite layer to glitch out, so sprites have to be turned off in areas of the screen where VRAM HDMA is occurring. This is a significant problem for Mode 7, because you only get the one BG layer, so if you don't want to have to draw the gun with the Super FX, it has to be made entirely of sprites.
Doesn't mean we can't use it now. It's like figuring out a more optimal software technique that was never used back in the day but works just fine on stock hardware and then discarding it just because of the time it was discovered in.
93143 wrote:
Mon Jul 27, 2020 8:01 pm
The mosaic trick is a method of stretching a non-Mode 7 framebuffer horizontally (kinda - the input format is a bit weird), so as to reduce by half the number of pixels that need to be drawn and transferred to VRAM. It's less edgy than the VRAM HDMA trick, but Randy didn't think of it back in the day, and I am aware of no one who did.

It turns out that if you combine the mosaic trick with the VRAM HDMA trick, you can just barely manage a 256x224 display with a 256x192 Mode 3 rendered window with 2x1 pixels, using a 128x192 framebuffer, at 20 fps (well, except that I doubt the SNES+GSU could run the game that fast - I just mean that DMA bandwidth and VRAM space are sufficient). Without VRAM HDMA, you'd be restricted to 12 fps at that resolution and display size if you wanted to be able to update sprites and change the palette, which you would. The existing Doom port uses neither of these tricks, and its display is pretty close to being as large as it could feasibly get without them.
You probably would want to cut that down to 128x144 due to the HUD, as it's much bigger than 32 scanlines, and there's no way we're going to do a Boom-style fullscreen HUD on SNES.
93143 wrote:
Mon Jul 27, 2020 8:01 pm
What I meant was that since the GSU doesn't have timing information available to it, the SNES can keep track of the frame pace independent of it, and if necessary tell it how much time is passing. Speeding up the GSU should increase the frame rate but maintain the correct game speed. I don't know if SNES Doom was actually written this way, but it seems like it was - emulators that run the Super FX too fast seem to work as desired, with higher frame rate but correct speed, but I haven't done a rigorous A/B comparison...
Those emulators also seem to have random colourful glitchy pixels pop up everywhere. Don't worry, I've seen videos where on a real overclocked GSU, the game runs faster, but at the correct speed without the glitchy pixels.
93143 wrote:
Mon Jul 27, 2020 8:01 pm
It's almost the sort of thing I think a clever developer could have convinced Nintendo to let them do[...]
Lol, like Nintendo at the time would want to get involved with such a viowent game like Doom. They licenced it, but they probably wouldn't be interested in doing anything more.
I have an ASD, so empathy is not natural for me. If I hurt you, I apologise.

93143
Posts: 1192
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 » Mon Jul 27, 2020 11:44 pm

Nikku4211 wrote:
Mon Jul 27, 2020 9:16 pm
Doesn't mean we can't use it now.
I know. As far as I'm concerned, all software techniques that are reliable on all models of SNES (which, technically, I don't think has been demonstrated yet with this one) are fair game. I was pointing out that it would be unreasonable to expect the original port to use it.
93143 wrote:
Mon Jul 27, 2020 8:01 pm
It turns out that if you combine the mosaic trick with the VRAM HDMA trick, you can just barely manage a 256x224 display with a 256x192 Mode 3 rendered window with 2x1 pixels, using a 128x192 framebuffer, at 20 fps
You probably would want to cut that down to 128x144 due to the HUD, as it's much bigger than 32 scanlines
What do you mean? The HUD (I assume you mean the status bar) is exactly 32 scanlines, on both the PC version and the SNES version. Even if you scaled it up to account for the higher vertical resolution on the SNES (when not letterboxed) vs. PC, you'd still only get 36 lines for NTSC or 38 for PAL. And neither of those is a multiple of 8, so...

The existing port uses a 144-line framebuffer for VRAM DMA reasons - all that black stuff above and below the display is forced blank. I'm talking about getting rid of that.
I've seen videos where on a real overclocked GSU, the game runs faster, but at the correct speed without the glitchy pixels.
I've played through the whole game on no$sns. There are no glitch texels, and it's noticeably smoother than it should be.

There are, however, no save states as far as I can tell, meaning that if you want to play Inferno on Hurt Me Plenty, you have to start from the beginning of The Shores Of Hell and leave the emulator running the whole time...

Oh, and I just tried it in ZSNES. The audio is horrible, of course, but the action is at least as smooth as no$sns, and there are no glitch texels. I think it must be a Snes9X bug specifically...

Señor Ventura
Posts: 113
Joined: Sat Aug 20, 2016 3:58 am

Re: SNES Doom Source Released! Now What?

Post by Señor Ventura » Tue Jul 28, 2020 3:50 am

93143 wrote:
Mon Jul 27, 2020 8:01 pm
It's not scaled. Each pixel is actually drawn twice. The framebuffer is 216x144, and that's exactly what's displayed. Plus a 32-line high status bar that isn't drawn by the Super FX, for a total of 216x176.

Each frame, including VBlank, is 262 scanlines high, so with 176 lines of active display there remain 86 lines during which DMA to VRAM can happen uninterrupted, resulting in a theoretical transfer size per frame of nearly 14 KB. Some of that will be taken up by overhead and additional tasks like sprite and palette updates. At 8bpp, the 216x144 framebuffer is a bit over 30 KB, so it takes a total of 3 VBlanks to load a frame into VRAM. Without getting fancy, that essentially results in a cap of 20 fps (but rendering the frame takes so long that this cap is never reached).
Do you think it is necessary to use 8bpp tiles, bearing in mind the color depth actually needed?

Is 4bpp 32 bytes, or 24 bytes?
93143 wrote:
Mon Jul 27, 2020 8:01 pm
Blowing up a Mode 7 image to fullscreen does not require HDMA. It can be accomplished with a constant transform, which means you can set the matrix and forget about it.
Oh, I didn't know that; I thought it had to be set every frame.
93143 wrote:
Mon Jul 27, 2020 8:01 pm
Wolfenstein 3D did not use fullscreen. Its display area was 224x192, not 256x224. The framebuffer was 112x80 and was blown up to 224x160.
Right, I meant the scaling of the framebuffer using Mode 7. Someone told me some time ago that Wolfenstein 3D uses Mode 7 to fill more of the screen.
93143 wrote:
Mon Jul 27, 2020 8:01 pm
Doom has a higher resolution than Wolf3D. The display area is only 216x176, but the framebuffer is 216x144 with 2x1 pixels, as opposed to Wolf3D's 224x160 with 2x2 pixels. If this were attempted with Mode 7, it would require a 108x144 framebuffer stretched to 216x144, and since 108x144 is 243 tiles out of the 256 that Mode 7 allows, it would be impossible to display without tearing unless you used the VRAM HDMA trick. Mode 3 does not have this problem because it allows a pool of 1024 tiles, making it easier to double buffer large images.
So you don't get a clear advantage starting from similar framebuffers, and you possibly get tearing. There is no gain, then.
93143 wrote:
Mon Jul 27, 2020 8:01 pm
Note that as far as I am aware, the VRAM HDMA trick was only demonstrated to work earlier this year, and would probably not have been used by any developer back in the '90s. It also causes the sprite layer to glitch out, so sprites have to be turned off in areas of the screen where VRAM HDMA is occurring. This is a significant problem for Mode 7, because you only get the one BG layer, so if you don't want to have to draw the gun with the Super FX, it has to be made entirely of sprites.
How cumbersome would it be to not use VRAM HDMA on the scanlines that have sprites?

Like this:

[Image: screenshot]

P.S.: I have to interrupt the reply for now; I'll continue later (and thank you for all your dedication to answering every one of our questions)

Nikku4211
Posts: 66
Joined: Sun Dec 15, 2019 1:28 pm
Location: Bronx, New York

Re: SNES Doom Source Released! Now What?

Post by Nikku4211 » Tue Jul 28, 2020 8:22 am

93143 wrote:
Mon Jul 27, 2020 11:44 pm
What do you mean? The HUD (I assume you mean the status bar) is exactly 32 scanlines, on both the PC version and the SNES version. Even if you scaled it up to account for the higher vertical resolution on the SNES (when not letterboxed) vs. PC, you'd still only get 36 lines for NTSC or 38 for PAL. And neither of those is a multiple of 8, so...
On the PC version, with proper aspect ratio correction from 8:5 to 4:3, the HUD is 38 scanlines tall on a square pixel display. 38 isn't a multiple of 8, but 40 is, so you can extend the HUD's height a bit.

93143
Posts: 1192
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 » Tue Jul 28, 2020 4:43 pm

Señor Ventura wrote:
Tue Jul 28, 2020 3:50 am
Do you think it is necessary to use 8bpp tiles, bearing in mind the color depth actually needed?
Yes. Doom is a 256-colour game through and through. There's no way to get it looking decent at 4bpp, not without fancy tricks that would be impossible in real time on a Super FX.
Is 4bpp 32 bytes, or 24 bytes?
An 8x8 tile in 4bpp is 32 bytes, vs. 64 bytes for 8bpp. Is that what you meant? I'm not sure of the relevance; obviously 4bpp takes half as much data as 8bpp, but the only place you could reasonably use a 4bpp Super FX framebuffer is the automap. (Interestingly, the existing port does not do this; the automap is in 8bpp even though it doesn't need to be.)
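
For reference, the tile sizes follow directly from the bit depth. A quick sketch (the function name is just for illustration):

```python
def tile_bytes(bpp, size=8):
    """Bytes needed for one size x size tile at the given bits per pixel."""
    return size * size * bpp // 8


for bpp in (2, 4, 8):
    print(f"{bpp}bpp: {tile_bytes(bpp)} bytes per 8x8 tile")

# The same ratio applies to a whole framebuffer, e.g. 216x144:
tiles = (216 // 8) * (144 // 8)
print(tiles * tile_bytes(4), tiles * tile_bytes(8))
```

So halving the bit depth halves the upload, which is why it would only pay off somewhere like the automap that does not need 256 colours.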
93143 wrote:
Mon Jul 27, 2020 8:01 pm
Blowing up a Mode 7 image to fullscreen does not require HDMA. It can be accomplished with a constant transform, which means you can set the matrix and forget about it.
Oh, I didn't know that; I thought it had to be set every frame.
It must if you're using the PPU multiplier in the non-Mode 7 areas. Or at least, the two transform matrix elements that you write to in order to use the PPU multiplier will have to be restored because you overwrote them.

If you aren't using the PPU multiplier, there's nothing else that would change the matrix, so it's set-and-forget.

Unless you actually want to change it, of course. If you're zooming in on an image, for example, you have to change the matrix once per frame. If you're doing F-Zero-style perspective, you have to change the matrix once per scanline (HDMA is a good way to do that). But we aren't doing either of those things in this case. If the matrix doesn't need to change, you don't need to change it. It persists; it's not like the OAM on the NES, which will actually lose its data if you don't rewrite it regularly...
93143 wrote:
Mon Jul 27, 2020 8:01 pm
Doom has a higher resolution than Wolf3D. The display area is only 216x176, but the framebuffer is 216x144 with 2x1 pixels, as opposed to Wolf3D's 224x160 with 2x2 pixels. If this were attempted with Mode 7, it would require a 108x144 framebuffer stretched to 216x144, and since 108x144 is 243 tiles out of the 256 that Mode 7 allows, it would be impossible to display without tearing unless you used the VRAM HDMA trick. Mode 3 does not have this problem because it allows a pool of 1024 tiles, making it easier to double buffer large images.
So you don't get a clear advantage starting from similar framebuffers, and you possibly get tearing. There is no gain, then.
No, there's a gain. Mode 7 only requires half the pixels to be drawn and transferred. And if you use VRAM HDMA (carefully), it works without tearing.

But Mode 3 with the mosaic trick also only requires half the pixels to be drawn and transferred, which wipes out Mode 7's advantage.

(Also, I just realized that 108x144 isn't evenly divisible into tiles; a Mode 7 framebuffer would have to be 112x144 unless you used weird drawing tricks and a mid-screen matrix rewrite. The mosaic trick works differently, and can handle framebuffer widths that are multiples of 4.)

Now, there may be an argument that drawing textured walls in Mode 7 format would be easier than using the PLOT function, because of how the latter operates with the CHR format. However, Mode 7 does seriously restrict the size of the rendered window; 112x144 is basically as large as it can possibly get regardless of transfer speed and buffering issues, simply because of the data limits of the Mode 7 hardware VRAM mapping. Compare with Mode 3, which can support (if my calculations are correct) 224x160 with a reasonable margin of VRAM space for other elements, or 128x192 (or even 128x208 in PAL) with no tearing.
How cumbersome would it be to not use VRAM HDMA on the scanlines that have sprites?
Not very, I suppose. HDMA tables are extremely customizable in terms of which scanlines you want transfers on. But keep in mind that the gun isn't a constant height. Recoil on the shotgun, for example, can take up a lot more scanlines than the pistol does in that screenshot. In fact, just firing the pistol produces muzzle flash that goes higher up the screen than the top of the pistol.
Nikku4211 wrote:
Tue Jul 28, 2020 8:22 am
On the PC version, with proper aspect ratio correction from 8:5 to 4:3, the HUD is 38 scanlines tall on a square pixel display. 38 isn't a multiple of 8, but 40 is, so you can extend the HUD's height a bit.
That's somewhat apples-to-oranges. Your "square pixel display" is 240p, which is not any more "correct" than the 320x200 video mode Doom uses, which would be 4:3 on a properly adjusted CRT monitor. A ~4:3 NTSC SNES image is 224p. So the scaling factor is different, and my 36-line estimate is more correct.

On PAL, the display can be 239 lines high, and your estimate matches mine. But the thing is, PAL's display isn't 4:3. Even at full height, it's more like 22:15. And on PAL there are no more concerns with DMA bandwidth; you can do 128x208 at 3 rpf (~16.7 fps) without even bothering with VRAM HDMA. Which is more important: the aspect ratio of Doomguy's face, or the size (and shape) of the window into the game world?

That's my take right now, anyway. I'm not entrenched in my position.

Nikku4211
Posts: 66
Joined: Sun Dec 15, 2019 1:28 pm
Location: Bronx, New York

Re: SNES Doom Source Released! Now What?

Post by Nikku4211 » Wed Jul 29, 2020 8:35 am

93143 wrote:
Tue Jul 28, 2020 4:43 pm
That's somewhat apples-to-oranges. Your "square pixel display" is 240p, which is not any more "correct" than the 320x200 video mode Doom uses, which would be 4:3 on a properly adjusted CRT monitor. A ~4:3 NTSC SNES image is 224p. So the scaling factor is different, and my 36-line estimate is more correct.

On PAL, the display can be 239 lines high, and your estimate matches mine. But the thing is, PAL's display isn't 4:3. Even at full height, it's more like 22:15. And on PAL there are no more concerns with DMA bandwidth; you can do 128x208 at 3 rpf (~16.7 fps) without even bothering with VRAM HDMA. Which is more important: the aspect ratio of Doomguy's face, or the size (and shape) of the window into the game world?
I know the scaling factor is different, which is exactly why I used the square pixel scaling into 4:3 as a reference.

Also, I wasn't actually talking about PAL. I did the aspect ratio calculations on my own and got to 38.

The only reason I brought up the aspect ratio of the HUD is so that you can save 8 lines for the HUD so that you don't have to render as much, rounding the HUD up to a 40-line boundary, with the bonus that Doomguy's face on a TV is closer to its proportions on a CRT monitor of the time.
I have an ASD, so empathy is not natural for me. If I hurt you, I apologise.

93143
Posts: 1192
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 » Wed Jul 29, 2020 3:25 pm

Nikku4211 wrote:
Wed Jul 29, 2020 8:35 am
I did the aspect ratio calculations on my own and got to 38.
My point is that you did them wrong. Watch this:

PC
Graphics Mode: 13h
Resolution: 320x200
Pixel Aspect Ratio: 5:6
Screen Aspect Ratio: 4:3

SNES (NTSC)
Graphics Mode: 3
Resolution: 256x224
Pixel Aspect Ratio: 8:7
Screen Aspect Ratio: 64:49

status bar on PC: 320x32
status bar on SNES: 256xY (assuming fullscreen)

assuming equal fraction of vertical screen height, Y = (32/200)*224 = 35.84
assuming equal status bar aspect ratio, Y = (32/(320*5/6))*(256*8/7) = 35.11

Notice that at no point does "square pixels" come into the picture. Neither platform uses square pixels, so the concept is irrelevant.
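
The arithmetic above can be reproduced with a short sketch. The resolutions and pixel aspect ratios are the ones listed in the post; the dictionary layout and function names are only illustrative.

```python
from fractions import Fraction

# Figures taken from the post; PAR is pixel width divided by pixel height.
PC = {"width": 320, "height": 200, "par": Fraction(5, 6)}
SNES_NTSC = {"width": 256, "height": 224, "par": Fraction(8, 7)}
PC_STATUS_BAR_HEIGHT = 32


def screen_aspect(mode):
    """Displayed aspect ratio of the full screen."""
    return mode["width"] * mode["par"] / mode["height"]


def bar_height_equal_fraction(src, dst, src_bar):
    """Status bar height that keeps the same fraction of screen height."""
    return Fraction(src_bar, src["height"]) * dst["height"]


def bar_height_equal_aspect(src, dst, src_bar):
    """Status bar height that keeps the bar's displayed aspect ratio."""
    src_display_width = src["width"] * src["par"]
    dst_display_width = dst["width"] * dst["par"]
    return src_bar * dst_display_width / src_display_width


y_fraction = bar_height_equal_fraction(PC, SNES_NTSC, PC_STATUS_BAR_HEIGHT)
y_aspect = bar_height_equal_aspect(PC, SNES_NTSC, PC_STATUS_BAR_HEIGHT)
print(screen_aspect(PC), screen_aspect(SNES_NTSC))
print(round(float(y_fraction), 2), round(float(y_aspect), 2))
```

Both approaches land between 35 and 36 lines, and neither needs a square-pixel intermediate.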
you can save 8 lines for the HUD so that you don't have to render as much
If I were making or directing a re-port, I would want multiple screen size options. At 256x224, the idea is to get as much viewport area as possible (at the cost of having to use low-detail mode). Saving rendering time is for the smaller screen sizes.

Considering that the "correct" status bar height is less than 36 pixels any way you calculate it, I think 32 is better, at least for NTSC. For PAL, I'd consider bumping it to 40 lines, because it would result in a better status bar aspect ratio. While the actual viewport aspect ratio would be wider than on PC even with a 32-line status bar, that may not be a critical problem, and going to a 40-line status bar does mean only 4% more pixels to render per frame vs. NTSC's 192-line maxed-out viewport, rather than 8% more.

Also, you initially suggested a 144-line framebuffer, which without letterboxing would be more like an 80-line status bar on NTSC, or 95 on PAL...
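
For reference, the percentages and the implied status bar heights can be checked with a small sketch. It assumes the viewport width is the same on both systems, so only the line counts matter; the names are illustrative.

```python
NTSC_LINES = 224
PAL_LINES = 239
NTSC_VIEWPORT_LINES = NTSC_LINES - 32  # the 192-line maxed-out viewport


def pal_extra_percent(status_bar_lines):
    """Extra pixels per frame on PAL relative to the 192-line NTSC viewport."""
    pal_viewport = PAL_LINES - status_bar_lines
    return 100.0 * (pal_viewport - NTSC_VIEWPORT_LINES) / NTSC_VIEWPORT_LINES


def implied_status_bar(display_lines, framebuffer_lines=144):
    """Lines left for the status bar if the framebuffer is not letterboxed."""
    return display_lines - framebuffer_lines


print(round(pal_extra_percent(40), 1), round(pal_extra_percent(32), 1))
print(implied_status_bar(NTSC_LINES), implied_status_bar(PAL_LINES))
```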

...

Also, I've got an idea. What about an alternate "low-detail mode" with double-high pixels instead of double-wide?

On NTSC I think it could look bad, but on PAL the pixels are wider, and the effective PAR with 1x2 metapixels would be (nominally) about 83% of the PC's native PAR. It could look okay. And unlike the mosaic trick required for conventional 2x1 low-detail mode, the implementation is dead simple: just render a half-height framebuffer and decrement Y-scroll every other line.
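
A minimal sketch of the linescroll idea, under a simplified model where the row shown on a scanline is the scanline number plus the vertical scroll value (real hardware has offset details this ignores, and negative values would wrap in the scroll register). The PAL pixel aspect ratio of about 1.386 is an assumed nominal figure, not something taken from this thread.

```python
from fractions import Fraction


def linescroll_table(display_lines, base_scroll=0):
    """Per-scanline vertical scroll so scanline y shows framebuffer row y // 2."""
    return [base_scroll + (y // 2) - y for y in range(display_lines)]


table = linescroll_table(192)
# Row actually displayed on each scanline under the simplified model.
rows_shown = [y + scroll for y, scroll in enumerate(table)]

# Effective PAR of a 1x2 metapixel relative to the PC's native 5:6 PAR.
PC_PAR = Fraction(5, 6)
NTSC_PAR = Fraction(8, 7)
PAL_PAR_ASSUMED = 1.386  # assumption: nominal PAL pixel aspect ratio

pal_relative = (PAL_PAR_ASSUMED / 2) / float(PC_PAR)
ntsc_relative = float(NTSC_PAR / 2 / PC_PAR)

print(table[:6], rows_shown[:6], max(rows_shown) + 1)
print(round(pal_relative * 100), round(ntsc_relative * 100))
```

The scroll value drops by one on every other line, so a 96-row framebuffer fills 192 scanlines.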

Note: the largest BG-type framebuffer supported by the Super FX is 192 lines high. I believe that it is possible to draw in OBJ mode (which is 256x256) in 8bpp; however, this turns out to be unnecessary, as it happens that both horizontal (mosaic trick) and vertical (linescroll) low-detail modes would use a half-height framebuffer internally...

Nikku4211
Posts: 66
Joined: Sun Dec 15, 2019 1:28 pm
Location: Bronx, New York

Re: SNES Doom Source Released! Now What?

Post by Nikku4211 » Wed Jul 29, 2020 9:06 pm

93143 wrote:
Wed Jul 29, 2020 3:25 pm
Note: the largest BG-type framebuffer supported by the Super FX is 192 lines high. I believe that it is possible to draw in OBJ mode (which is 256x256) in 8bpp; however, this turns out to be unnecessary, as it happens that both horizontal (mosaic trick) and vertical (linescroll) low-detail modes would use a half-height framebuffer internally...
What is this OBJ mode? I've looked it up but I can't find many details about it... I almost thought you were talking about sprites at first.

93143
Posts: 1192
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 » Wed Jul 29, 2020 9:58 pm

It's a Super FX screen mode register setting designed for drawing sprite graphics. It gives you a framebuffer consisting of four 16x16-tile OBJ tables in a 2x2 block. So, 256x256 pixels. The big difference vs. the BG drawing modes is the tile layout - the BG modes (256x128, 256x160, 256x192) are in column-linear format, which is why Randy's code transfers 9 strips per VBlank (a 216-pixel-wide framebuffer is 27 tiles wide, and there is no 144-line mode so he's using 160 lines and every column has two blank tiles to skip), whereas the OBJ mode uses conventional SNES OBJ table format.
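
To make the layout difference concrete, here is a sketch of how a pixel coordinate maps to a tile number in each case. The formulas follow the tile mappings as commonly documented for the Super FX, so treat them as an assumption rather than something verified against hardware; the function names are illustrative.

```python
def bg_tile_number(x, y, height):
    """Column-linear BG layout: tiles are numbered down each column."""
    if height not in (128, 160, 192):
        raise ValueError("BG framebuffer height must be 128, 160 or 192")
    return (x // 8) * (height // 8) + (y // 8)


def obj_tile_number(x, y):
    """OBJ layout: four 16x16-tile tables in a 2x2 block, each row-major."""
    return ((y // 128) * 0x200 + (x // 128) * 0x100
            + ((y // 8) & 0xF) * 0x10 + ((x // 8) & 0xF))


def used_tiles_in_column(column, used_lines=144, height=160):
    """Tile numbers actually drawn in one column of a 160-line buffer."""
    first = column * (height // 8)
    return range(first, first + used_lines // 8)


columns = 216 // 8
print(columns, list(used_tiles_in_column(0))[-1], used_tiles_in_column(1)[0])
print(bg_tile_number(8, 0, 160), obj_tile_number(8, 0), obj_tile_number(0, 8))
```

With a 144-line image in the 160-line mode, each column holds 18 used tiles followed by 2 blank ones, which is the gap a transfer has to skip.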

(In case you weren't aware, the big thing with the Super FX's PLOT circuitry is that it draws into SNES tiled bitplane format automatically, so you don't have to do the complicated job of figuring out where all the bits in your pixel go. It uses a couple of the general-purpose registers as X and Y coordinates into the framebuffer, and the X-coordinate auto-increments when you use the PLOT opcode. A side effect of all this hardware assistance is that the framebuffer format can't be freely defined by the programmer; you have to pick from a few predetermined tile mappings.)
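
For a sense of the "complicated job" being avoided, here is a software sketch of writing one pixel into a single 8bpp SNES tile. It models only the tile format (bitplanes stored in pairs, leftmost pixel in bit 7), not how the PLOT circuitry is actually implemented; the function names are illustrative.

```python
def plot_8bpp(tile, x, y, color):
    """Write one pixel into a 64-byte 8bpp tile in SNES bitplane format."""
    mask = 0x80 >> x  # bit 7 is the leftmost pixel of the row
    for plane in range(8):
        # Planes come in pairs: 0/1 in bytes 0-15, 2/3 in 16-31, and so on.
        offset = (plane // 2) * 16 + y * 2 + (plane & 1)
        if color & (1 << plane):
            tile[offset] |= mask
        else:
            tile[offset] &= ~mask & 0xFF


def read_8bpp(tile, x, y):
    """Read one pixel back out of a 64-byte 8bpp tile."""
    mask = 0x80 >> x
    color = 0
    for plane in range(8):
        offset = (plane // 2) * 16 + y * 2 + (plane & 1)
        if tile[offset] & mask:
            color |= 1 << plane
    return color


demo_tile = bytearray(64)
plot_8bpp(demo_tile, 3, 5, 0xA7)
print(read_8bpp(demo_tile, 3, 5) == 0xA7)
```

A single pixel touches eight separate bytes, which is why having the hardware do it per PLOT is such a win.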

I believe all of the framebuffer layouts can be used in 2bpp, 4bpp, and 8bpp modes (although obviously it would be unwise to attempt to use 8bpp graphics for actual sprites on the SNES). BG tilemaps on SNES can be arranged any way you like, so using OBJ mode to render graphics intended for a BG layer is perfectly fine.

Señor Ventura
Posts: 113
Joined: Sat Aug 20, 2016 3:58 am

Re: SNES Doom Source Released! Now What?

Post by Señor Ventura » Thu Jul 30, 2020 4:54 am

How reliable is the mosaic trick compared with true 1x1 definition at 256 pixels wide?

When objects are far enough away there is a loss of definition, right?

93143
Posts: 1192
Joined: Fri Jul 04, 2014 9:31 pm

Re: SNES Doom Source Released! Now What?

Post by 93143 » Thu Jul 30, 2020 1:55 pm

Señor Ventura wrote:
Thu Jul 30, 2020 4:54 am
How reliable is the mosaic trick compared with true 1x1 definition at 256 pixels wide?
Reliable? It should work every time; it's very simple and doesn't rely on any mosaic restart oddities. It's built on the fact that mosaic is applied after scroll, and if that weren't true you'd get massive glitches in Super Mario World.
When objects are far enough away there is a loss of definition, right?
Right. Go play the original SNES Doom. You basically have to get good at identifying small groups of pixels that are changing every frame even when you stand still, because those are typically monsters. (Although later in the game there are torches and such that can look like monsters from a distance, but by that point you should have a sense of what colours to expect in hostile pixel clumps as opposed to harmless ones...)

It works fine up close. And even far away it wouldn't be quite as bad if the viewport were bigger. Hence my proposal for a 256x224 low-detail option - if you use the mosaic trick (or linescroll for vertical low-detail mode) the framebuffer is only 256x96, which is fewer pixels than Randy had to render, and the result should be easier to see. Although I should note that a 256x224 display with a 256x192 viewport requires VRAM HDMA in order to update faster than 12 fps, even with the mosaic trick (which is required to fit it in VRAM at all without tearing). VRAM HDMA has not yet been demonstrated to work on every model of SNES, although I do expect it to.
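
The pixel counts and frame rates mentioned in this thread can be compared with a short sketch. It uses nominal 60 Hz and 50 Hz refresh rates, reads "12 fps" as five refreshes per frame on NTSC (my interpretation), and takes Randy's framebuffer as the 216x144 area described earlier; the names are illustrative.

```python
def framebuffer_bytes(width, height, bits_per_pixel=8):
    """Size of a framebuffer in bytes at the given colour depth."""
    return width * height * bits_per_pixel // 8


def frames_per_second(refresh_hz, refreshes_per_frame):
    """Update rate when each rendered frame spans several display refreshes."""
    return refresh_hz / refreshes_per_frame


original = framebuffer_bytes(216, 144)    # area rendered by the original port
low_detail = framebuffer_bytes(256, 96)   # half-height low-detail buffer

print(original, low_detail, round(low_detail / original, 2))
print(frames_per_second(60, 5), round(frames_per_second(50, 3), 1))
```

So the low-detail buffer is roughly four fifths the size of the original port's, while covering a much larger part of the screen.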

Nikku4211
Posts: 66
Joined: Sun Dec 15, 2019 1:28 pm
Location: Bronx, New York

Re: SNES Doom Source Released! Now What?

Post by Nikku4211 » Thu Jul 30, 2020 3:30 pm

93143 wrote:
Thu Jul 30, 2020 1:55 pm
When objects are far enough away there is a loss of definition, right?
Right. Go play the original SNES Doom. You basically have to get good at identifying small groups of pixels that are changing every frame even when you stand still, because those are typically monsters.
Go play the original DOS Doom. Even at high-detail mode, you'd still have to get good at identifying small pixel groups, especially if you're sniping.
93143 wrote:
Thu Jul 30, 2020 1:55 pm
It works fine up close. And even far away it wouldn't be quite as bad if the viewport were bigger. Hence my proposal for a 256x224 low-detail option - if you use the mosaic trick (or linescroll for vertical low-detail mode) the framebuffer is only 256x96, which is fewer pixels than Randy had to render, and the result should be easier to see. Although I should note that a 256x224 display with a 256x192 viewport requires VRAM HDMA in order to update faster than 12 fps, even with the mosaic trick (which is required to fit it in VRAM at all without tearing). VRAM HDMA has not yet been demonstrated to work on every model of SNES, although I do expect it to.
Yeah, and we need more SD2SNES/FXPak real console testers. This VRAM HDMA does work in my silver RetroDuo clone, so that's a start. I do not have any other models of the SNES, official or unofficial.
