I didn't even know that the SNES port started as an unofficial one.
I have read a little about the SuperFX, and its architecture is a very interesting one (at least for me).
It has 16 general registers R0..R15 (there are also a lot of control registers, but they're not important here), where R15 is the instruction pointer, R11 is the link register, and some others have specialized roles.
R0 is the accumulator by default. But in fact there are two internal registers, Sreg and Dreg, which hold the numbers of the source register and the destination register.
After reset, and at the end of every instruction that uses the source/destination registers, they are zeroed, that is, they point to R0 (the accumulator).
But there are three instructions to change them:
Code:
FROM Rn ; opcode 0xBn - sets Sreg to n
TO Rn   ; opcode 0x1n - sets Dreg to n
WITH Rn ; opcode 0x2n - sets Sreg and Dreg to n
Code:
ADD Rn ; opcode 0x5n
That is, ADD Rn adds Rn to R0 by default.
But this sequence of instructions adds R5 to R3 and puts the result in R4:
Code:
FROM R3
TO R4
ADD R5
Well, this is interesting.
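To make the prefix mechanism concrete, here is a toy Python model of it. It is only a sketch of what is described above, not an emulator: just FROM/TO/WITH/ADD are handled, flags and timing are ignored, and the function name `run` is made up for this example.

```python
# Toy model of the Sreg/Dreg prefix mechanism.
# Only FROM/TO/WITH/ADD are modelled; flags, the MOVE special case
# and timing are deliberately left out.

def run(code, regs):
    """Execute a byte sequence of FROM/TO/WITH/ADD opcodes on a 16-entry register list."""
    sreg = dreg = 0                    # both select R0 after reset
    for byte in code:
        op, n = byte & 0xF0, byte & 0x0F
        if op == 0xB0:                 # FROM Rn: choose the source register
            sreg = n
        elif op == 0x10:               # TO Rn: choose the destination register
            dreg = n
        elif op == 0x20:               # WITH Rn: choose both at once
            sreg = dreg = n
        elif op == 0x50:               # ADD Rn: R[dreg] = R[sreg] + R[n], 16-bit wrap
            regs[dreg] = (regs[sreg] + regs[n]) & 0xFFFF
            sreg = dreg = 0            # selectors fall back to R0 afterwards
        else:
            raise ValueError(f"opcode {byte:#04x} not modelled")
    return regs

regs = [0] * 16
regs[0], regs[3], regs[5] = 100, 7, 35
run(bytes([0x55]), regs)               # ADD R5                 -> R0 = R0 + R5
run(bytes([0xB3, 0x14, 0x55]), regs)   # FROM R3, TO R4, ADD R5 -> R4 = R3 + R5
print(regs[0], regs[4])                # 135 42
```

So the same one-byte ADD opcode becomes a three-operand add just by putting two one-byte prefixes in front of it.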
Moreover - there is no dedicated "MOVE Rn, Rm" opcode. Instead, the prefix instruction "TO Rn" works as a move of Rm into Rn if it is preceded by a "WITH Rm" instruction. That is, "MOVE" is a two-byte instruction that reuses the 1-byte WITH/TO prefixes.
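As a small illustration, the two bytes of such a move can be built directly from the opcodes listed earlier (WITH = 0x2n, TO = 0x1n). The helper name `encode_move` is hypothetical, just for this sketch:

```python
def encode_move(dst, src):
    """Return the two bytes 'WITH Rsrc', 'TO Rdst' that together act as a move."""
    if not (0 <= dst <= 15 and 0 <= src <= 15):
        raise ValueError("register numbers must be 0..15")
    return bytes([0x20 | src, 0x10 | dst])

print(encode_move(dst=4, src=3).hex())   # 2314  (WITH R3, TO R4)
```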
Interesting instruction set, and I think it's not RISC at all. It's full of prefixes, immediate data and different instruction formats.
However, some things resemble RISC. For example, there is no CALL instruction. Instead it has four instructions, LINK 1...LINK 4 (opcodes 0x91-0x94), which copy R15+imm to R11 (the link register). Usually this instruction is followed by an immediate word transfer to R15 (PC), "IWT R15, #proc_addr16", and later R11 is used as the return address. The different LINK offsets are for the different ways of loading the R15 instruction pointer.
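A rough sketch of how such a call sequence could be assembled is below. Only the LINK opcodes come from the text above; the rest are assumptions from memory and should be checked against a real opcode table: IWT Rn being encoded as 0xF0|n followed by a little-endian 16-bit immediate, a one-byte NOP (0x01) sitting after the jump, and R15 already pointing at the byte after LINK when LINK executes. The helper names are made up.

```python
def encode_call(target, link_offset=4):
    """Hypothetical helper: bytes for LINK #n; IWT R15,#target; NOP."""
    if not 1 <= link_offset <= 4:
        raise ValueError("LINK only exists with offsets 1..4")
    link = 0x90 | link_offset      # 0x91..0x94, as given in the post
    iwt_r15 = 0xFF                 # assumption: IWT Rn is 0xF0 | n
    nop = 0x01                     # assumption: NOP fills the slot after the jump
    return bytes([link, iwt_r15, target & 0xFF, (target >> 8) & 0xFF, nop])

def return_address(link_addr, link_offset=4):
    """R11 after LINK, assuming R15 already points at the byte following LINK."""
    return (link_addr + 1 + link_offset) & 0xFFFF

code = encode_call(0x8123)
print(code.hex())                      # 94ff238101
print(hex(return_address(0x8000)))     # 0x8005, the first byte after the 5-byte sequence
```

Under those assumptions LINK 4 matches a 3-byte IWT plus one more byte, which would explain why shorter ways of loading R15 get smaller LINK offsets.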
So there are too many prefixes, extended opcodes, immediate data, specialised registers and asymmetries in operations for me to honestly claim that the SuperFX architecture is RISC.
(There's also the fact that it was somewhat specialized for drawing graphics. The merge opcode doesn't make much sense until you realize it's ideal for texture mapping...)
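For anyone wondering why merge fits texture mapping: as far as I remember (this is my recollection of the opcode, not something stated above), it builds a 16-bit result from the high byte of R7 and the high byte of R8. If those two registers hold 8.8 fixed-point texture coordinates, the result is directly a texel offset into a 256-wide texture. A sketch under that assumption:

```python
def merge(r7, r8):
    """Assumed MERGE result: high byte of R7 placed over the high byte of R8."""
    return (r7 & 0xFF00) | ((r8 >> 8) & 0x00FF)

# 8.8 fixed-point texture coordinates: integer part lives in the high byte.
v = 0x12C0   # row 0x12, fraction 0xC0
u = 0x3440   # column 0x34, fraction 0x40
print(hex(merge(v, u)))   # 0x1234, i.e. row 0x12, column 0x34
```

Both fractions are dropped in one instruction, with no shifting or masking in the inner loop.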
I wonder how much more expensive a Super FX with full 16-bit external data busing would have been. Not only would memory access have been faster, but the instruction set would have had room to breathe. I imagine the instruction cache might have had to be a KB instead of 512 bytes... Memory compatibility with the S-CPU wouldn't have been an issue because it's easy to use the bottom address bit as a half-word selector...
(There's also some getb; color; plot; from R4; color; loop; plot, which by how the color sources are handled looks like it might be dither code (you can't autodither at 8bpp unless I'm greatly mistaken). And some plot; plot; plot; plot; plot; plot; plot; plot, which was amusing but I'm not sure what it could be for. Maybe it's for clearing the map screen, but why does it show up more than once? None of these methods appears to be using a secondary buffer or any way to run multiple columns at a time.)
Looks like Randy didn't try to get fancy with the renderer. Unless I missed something, or the debugger misinterpreted code as data somewhere, most things seem to be drawn directly with plot in individual columns of double pixels. Based on my understanding of the Super FX, I believe it is possible (at least with some CPU-ROM added; this game is pretty squished I'm told) to speed things up significantly.
Take with a grain of salt; I haven't done a thorough code review or anything...