Señor Ventura wrote:
So, at the beginning of the active display the VRAM send the scanlines to the TV
The sending happens continuously during active display, because it literally is active display. The PPU is taking the raw data (register settings, OAM, tiles and maps in VRAM, and colours in CGRAM) and using it to generate a video signal. That is, the PPU is in essence controlling the TV's electron beam in real time. This is why the sprites and backgrounds work the way they do - the PPU has to generate each pixel just in time for the TV to illuminate the corresponding phosphors, so everything it does has to be constant-load.
This is also why VRAM is locked during active display - the PPU has to read it continuously in order to generate pixels on time. Any delays would show up as visual garbage or black areas on the screen. You can in fact turn off rendering (force blank) so as to write to VRAM/OAM/CGRAM during what would normally be active display; this produces black output in the area of the screen that the electron beam is passing over during the forced blanking time, and it also prevents sprites from preloading and can thus glitch the OBJ layer for up to a scanline after rendering is turned back on.
The SNES (along with the NES, Mega Drive, etc.) is very different from a framebuffer-based system.
Quote:
I mean... if you deactivate horizontal scanlines you gain bandwidth, but, What if the scanline is shorter like in the image?
I would think you could use forced blank in an H-position-timed IRQ to do that, if forced blank frees VRAM quickly enough (I think it does, but I haven't tried it). But since interrupts on the SNES can't happen with pixel-perfect positioning, mostly because individual CPU instructions take several pixels to execute and an IRQ won't start until the current instruction is finished, you'd probably want to also use windowing as tepples describes to straighten the edges, or else just map black tiles outside the desired display area. And, obviously, you wouldn't be able to count on having the entire black area for DMA, because the timing inaccuracy of the interrupt would consume up to 16 pixels or so (depending on the code being interrupted and on whether mitigation measures were present in the IRQ code).
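For a rough feel of where a figure like "16 pixels or so" can come from, here is a back-of-envelope sketch. It assumes one pixel lasts 4 master clocks and that a CPU cycle takes 6, 8 or 12 master clocks depending on the memory region; the function name and the 8-cycle instruction are illustrative choices of mine, not measured values.

```python
MASTER_CLOCKS_PER_PIXEL = 4  # assumption: one dot lasts 4 master clocks


def irq_jitter_pixels(instr_cpu_cycles, master_clocks_per_cpu_cycle):
    """Pixels the beam travels while the interrupted instruction finishes.

    Worst case: the IRQ fires just as the instruction begins, so the whole
    instruction has to complete before the handler can start.
    """
    master_clocks = instr_cpu_cycles * master_clocks_per_cpu_cycle
    return master_clocks / MASTER_CLOCKS_PER_PIXEL


# Hypothetical slow instruction: 8 CPU cycles at 8 master clocks each.
print(irq_jitter_pixels(8, 8))  # 16.0
```

A short 2-cycle instruction in fast memory gives only 3 pixels by the same arithmetic, which is why the jitter depends so much on the code being interrupted.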
And as I just mentioned, you would probably end up killing sprites entirely by doing this, since they don't load during forced blanking.
Not to mention that H-IRQs eat CPU time for breakfast, especially if they contain any position stabilization code, because they happen 200+ times per frame...
In short, it's probably possible, but it doesn't work nearly as well as trimming scanlines off the top and bottom.
93143 wrote:
Why the cpu has extra load if the HDMA is drive with its own bus?.
Because unless the ROM contains ready-made coefficient lists for all possible viewing angles, the CPU has to calculate the transform matrix coefficients for every scanline in order to compile the HDMA tables for the next frame.
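To give an idea of the per-scanline work involved, here is a minimal sketch of building such a coefficient table in Python. It is a generic flat-floor approximation, not the method of any particular game: the function names, the `cam_height / (y - horizon)` scale formula and the sign convention for B and C are all my own assumptions, and a real implementation would use fixed-point math and lookup tables on the 65816 rather than floating point.

```python
import math


def to_fixed_8_8(value):
    """Convert to signed 8.8 fixed point, clamped to the 16-bit range."""
    return max(-32768, min(32767, round(value * 256)))


def build_matrix_table(angle, horizon, cam_height, first_line, last_line):
    """Return one (A, B, C, D) tuple per scanline below the horizon.

    angle is the view heading in radians; horizon is the scanline of the
    vanishing line; cam_height is a hypothetical tuning constant.
    """
    if first_line <= horizon:
        raise ValueError("first_line must be below the horizon")
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    table = []
    for y in range(first_line, last_line + 1):
        # Texels per pixel: large near the horizon, small near the viewer.
        scale = cam_height / (y - horizon)
        table.append((
            to_fixed_8_8(cos_a * scale),
            to_fixed_8_8(sin_a * scale),
            to_fixed_8_8(-sin_a * scale),
            to_fixed_8_8(cos_a * scale),
        ))
    return table


table = build_matrix_table(0.0, horizon=32, cam_height=64.0,
                           first_line=33, last_line=223)
print(len(table))  # 191
```

Every entry needs a divide (or a reciprocal lookup) and four multiplies, and the whole table has to be redone whenever the heading changes, which is where the CPU time goes.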
Also, as tepples points out, the HDMA is not "drive with its own bus"; it's part of the CPU and hogs the main system bus entirely when operating. It's much quicker than manual writes with the CPU, but just writing the transform matrix with HDMA still takes about 7% of a scanline, and if you need to write scroll or origin every line as well that goes up to about 10%.
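Those percentages can be reproduced with a little arithmetic. The sketch below assumes the commonly cited timing figures (1364 master clocks per scanline, 18 clocks of fixed HDMA overhead per line, 8 per active channel and 8 per byte transferred) and assumes the matrix is split over two channels of 4 bytes each, with a third 4-byte channel for scroll or origin. It ignores table bookkeeping such as line-counter reloads and indirect address loads, so treat it as an estimate only.

```python
MASTER_CLOCKS_PER_LINE = 1364  # assumption: typical NTSC scanline length
HDMA_LINE_OVERHEAD = 18        # assumption: fixed cost on a line with HDMA active
HDMA_CHANNEL_OVERHEAD = 8      # assumption: per active channel, per line
HDMA_BYTE_COST = 8             # assumption: per byte transferred


def hdma_line_fraction(bytes_per_channel):
    """Fraction of one scanline spent on HDMA for the given channel sizes."""
    clocks = HDMA_LINE_OVERHEAD
    for num_bytes in bytes_per_channel:
        clocks += HDMA_CHANNEL_OVERHEAD + HDMA_BYTE_COST * num_bytes
    return clocks / MASTER_CLOCKS_PER_LINE


matrix_only = hdma_line_fraction([4, 4])     # A/B on one channel, C/D on another
with_scroll = hdma_line_fraction([4, 4, 4])  # plus scroll or origin
print(f"{matrix_only:.1%} {with_scroll:.1%}")  # 7.2% 10.1%
```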
Quote:
Then, the key is that only the number of scanline layers influences in the data volume that the cpu has to send through the HDMA for the transformation, right?
If by "scanline layers" you mean scanlines during which the PPU is set to Mode 7 and displaying the perspective playfield, yes.
Quote:
So, at 4 players, the number of scanlines corresponding to an "mode 7" layer is bigger than during a normal play...
Okay, perhaps I should have checked what single-player looked like in that game. I was thinking of F-Zero, where the Mode 7 layer is most of the screen even in single-player mode...
Yes, 4-player in Street Racer would take more CPU time to handle, even with pre-baked HDMA (which would have eaten quite a lot of ROM, so I doubt they did that).
Quote:
But, what if i do this per software... At 4 players the only thing that matters is the traffic data?.
https://www.youtube.com/watch?v=Tl3gKAobaTE
I have no idea what you mean by "traffic data", but...
The bulk of the CPU load in that case is rendering the playfield. As you can see, it's only ~30 fps with big blocky pixels, and it doesn't cover nearly as much of the screen as F-Zero's playfield does. I don't think a 4-player version of that would look good.
Though ultimately it's pretty much the same situation as with real Mode 7, in that the number of players as such is largely irrelevant to the question of rendering load. A flat, texture-mapped perspective layer is mathematically simple enough that the computational load is mostly proportional to the area of the screen that has to be rendered, rather than whether that area is divided into one, two, or four such layers - as long as there's only one layer per line, since computing the transform for a line is nontrivial and could be significant (I haven't done the math). (Obviously running the rest of the game engine for four players is going to be more expensive than for just one.)
Doing that in software on SNES seems like a dubious proposition. The CPU is somewhat weaker than the Mega Drive's (though not as much as the clock speed difference would seem to suggest), and the PPU can do it in hardware anyway. The advantages, I suppose, would be the ability to do corner 4-player rather than pancake 4-player (you can't change Mode 7 parameters arbitrarily in the middle of a scanline without glitching, but if you're rendering to a framebuffer in software you can do whatever you want) and the ability to use maps larger than 1024x1024 (Mode 7 only allows one map, and you can't change where it is in VRAM; this is why Super Mario Kart was a go-kart game instead of a multiplayer F-Zero sequel). I'm not sure it'd be worth it given the resolution and framerate you'd have to put up with; pancake 4-player doesn't look that horrible in comparison, and if you need a bigger map it might be better to try to pull off something like my quarter-map scheme (though for 4-player it needs 8 KB updates, which implies a reduced active display height)... Plus, in accordance with what I said above, corner 4-player might take noticeably more CPU than pancake 4-player at the same resolution because it needs twice as many transform matrices...
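The "twice as many transform matrices" point is just counting. Here is a tiny sketch, with a hypothetical function name and a 224-line playfield picked purely for illustration: each scanline needs one transform per viewport it crosses, while the pixel area stays the same either way.

```python
def transforms_per_frame(viewports_per_line, playfield_lines):
    """Per-line transform computations needed for one frame.

    viewports_per_line is how many side-by-side views each scanline crosses:
    1 for pancake (stacked strips), 2 for corner (2x2 quadrants).
    """
    return viewports_per_line * playfield_lines


pancake = transforms_per_frame(1, 224)
corner = transforms_per_frame(2, 224)
print(pancake, corner)  # 224 448
```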
Quote:
Or do 30 frames per second... is a solution too...
I think Star Fox tops out at 20 fps. Mind you, a lot of the frame rate issues in that game were due to software rendering on the Super FX taking a long time, but the frame data was also too big to transfer in one VBlank, even though they extended VBlank with forced blank. The catch here is that you need to double buffer some of the data in VRAM so you don't get tearing or glitching.
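To see how quickly transfer size runs into the blanking budget, here is a rough estimate in Python. The constants are assumptions based on commonly cited timings (1364 master clocks per scanline, about 40 of them lost to DRAM refresh, 8 master clocks per DMA byte), and the function names are mine; DMA setup overhead and any other VBlank work are ignored.

```python
import math

MASTER_CLOCKS_PER_LINE = 1364  # assumption: typical NTSC scanline length
REFRESH_CLOCKS_PER_LINE = 40   # assumption: DRAM refresh pause each line
DMA_CLOCKS_PER_BYTE = 8        # assumption: general-purpose DMA speed


def dma_bytes_per_line():
    """Whole bytes DMA can move during one blanked scanline."""
    usable = MASTER_CLOCKS_PER_LINE - REFRESH_CLOCKS_PER_LINE
    return usable // DMA_CLOCKS_PER_BYTE


def blank_lines_needed(num_bytes):
    """Blanked scanlines required to DMA num_bytes."""
    return math.ceil(num_bytes / dma_bytes_per_line())


print(dma_bytes_per_line())      # 165
print(blank_lines_needed(8192))  # 50
```

An NTSC frame with 224 active lines leaves only about 38 blank lines (assuming 262 lines in total), so by this estimate an 8 KB update per frame already means giving up roughly a dozen lines of active display, before anything else has to happen in VBlank.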
...
Please excuse my huge posts. I like to be precise, but I'm not very good at explaining stuff, and I get drawn off onto tangents very easily.