Why did Super Mario RPG and Kirby Super Star use an SA-1?

psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

Yes, but I would be able to have multijointed bosses that take up the entire screen, and I wouldn't have to reuse the same sprites for arms and legs.
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Post by Oziphantom »

lidnariq wrote:We know they provided a set of sample routines ("library"?) for using the mouse, super scope, and multitap: Fullsnes § detecting controller support by searching for magic strings inside ROM images
There is the "How to read a Joypad, mouse, scope etc." and "How to Init the SNES" code in the 2 books, which were the Rolls-Royce of information about the machine and didn't come out for a while. Most devs I know were in the "they gave us a couple of pieces of paper and didn't even mention Mode 7, then they released Mario Kart and we were HOW HOW HOW" camp. The programmers mostly would just poke random addresses to see what they did, to work the machine out :roll: That being said, the PS2 came with about the same, although it at least had 6 books to tell you how things worked ;)
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Post by Drew Sebastino »

psycopathicteen wrote:I would be able to have multijointed bosses that take up the entire screen
Not if you hit the sprite-tiles-per-line limit first, though. How much RAM is used with the different rotations anyway?
Pokun
Posts: 2681
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Post by Pokun »

MottZilla wrote:
Espozo wrote:
MottZilla wrote:Isn't that what most people complain about with the SNES? The lack of processing power.
Most people see how Super R-Type and Gradius III run like shit, and then look just at 3.58 and compare it to 7.68 without taking into account memory accesses per cycle, or bus width, or ISA, or whatever. :?
There are other examples too, but I think those are infamous for slowdown. I think you are saying that people compare the SNES CPU clock against the Genesis without understanding they are different designs. But what about comparing the SNES to the TurboGrafx-16/PCE, which does share CPU design? And its CPU really does run at over 7 MHz. And it came out prior to the SFC.
NEC was one of the pioneers promoting the CD format in Japan in the eighties, so they thought it would make sense to make their own home console support it as well. For this reason they used very fast RAM and made other design choices that were necessary for a CD unit to work well with it. This led to it being very good for arcade ports, and the CD games were able to compete in the 16-bit era for a while. The PC Engine is probably even faster than the Mega Drive, but it was also very expensive when it was released, and was sold below manufacturing cost.

I guess one reason the SNES is so slow is because they initially designed it with Famicom compatibility in mind, and used an 8-bit data bus. I guess the 65816 was expensive to make run on a faster clock at the time.

Oziphantom wrote:
lidnariq wrote:We know they provided a set of sample routines ("library"?) for using the mouse, super scope, and multitap: Fullsnes § detecting controller support by searching for magic strings inside ROM images
There is the "How to read a Joypad, mouse, scope etc." and "How to Init the SNES" code in the 2 books, which were the Rolls-Royce of information about the machine and didn't come out for a while. Most devs I know were in the "they gave us a couple of pieces of paper and didn't even mention Mode 7, then they released Mario Kart and we were HOW HOW HOW" camp. The programmers mostly would just poke random addresses to see what they did, to work the machine out :roll: That being said, the PS2 came with about the same, although it at least had 6 books to tell you how things worked ;)
Heh, it was even harder during the NES era, when western developers were given only partly translated dev docs. Japanese third-party developers also had a hard time; from what I've heard from David Siller, Nintendo didn't want any third-party support at all at first. I guess they finally wrote a proper dev doc and even translated it into English in the end, though. It explains all hardware features, including Mode 7.
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

Espozo wrote:
psycopathicteen wrote:I would be able to have multijointed bosses that take up the entire screen
Not if you hit the sprite-tiles-per-line limit first, though. How much RAM is used with the different rotations anyway?
The Plasma Grinch takes 72kB:
Head: 32 kB for 64 32x32 frames
Body: 16 kB for 32 32x32 frames
Limbs: 16 kB for 32 32x32 frames
Joints: 8 kB for 64 16x16 frames
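
Those figures are consistent with uncompressed 4bpp SNES sprite tiles at 32 bytes per 8x8 tile, which is the assumption made here. A quick Python sketch of the arithmetic (the function name and the part table are made up for illustration, they are not from any actual game code):

```python
BYTES_PER_TILE = 32  # assumption: one 8x8 tile at 4 bits per pixel


def frames_size_kib(frame_count, width, height):
    """Size in KiB of frame_count uncompressed 4bpp frames of width x height pixels."""
    tiles_per_frame = (width // 8) * (height // 8)
    return frame_count * tiles_per_frame * BYTES_PER_TILE // 1024


# (frame count, width, height) per body part, as listed in the post
parts = {
    "Head": (64, 32, 32),
    "Body": (32, 32, 32),
    "Limbs": (32, 32, 32),
    "Joints": (64, 16, 16),
}

sizes = {name: frames_size_kib(*spec) for name, spec in parts.items()}
total_kib = sum(sizes.values())
print(sizes, total_kib)
```

A 32x32 frame is 16 tiles (512 bytes) and a 16x16 frame is 4 tiles (128 bytes), so the per-part sizes and the 72 kB total fall straight out of the frame counts.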
MottZilla
Posts: 2837
Joined: Wed Dec 06, 2006 8:18 pm

Post by MottZilla »

93143 wrote:
But what about comparing the SNES to the TurboGrafx-16/PCE, which does share CPU design? And its CPU really does run at over 7 MHz. And it came out prior to the SFC.
But it doesn't share the same CPU design. The HuC6280 is 8-bit. It's a slightly souped-up NES CPU.

All that demonstrates is that Nintendo wasn't absolutely locked to low clock speeds once they picked the architecture - they could have customized the CPU to get rid of the phi1/phi2 nonsense and doubled the clock, but they didn't. It does not mean the PC Engine CPU was twice as powerful.
I didn't say the "same CPU design". I said they share CPU design. They are both related to the 6502. So you can compare them better than you can to the Genesis and its 68000. I also didn't say the PCE's CPU was twice as powerful. Although I'd imagine it has an advantage over the SNES. But you're right that what NEC did does illustrate that Nintendo could have had a faster CPU if they'd wanted. We can only speculate but I imagine it is the way it is due to cost savings.
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

Did SNES carts typically have more data in them than PCE carts?

Off topic, but does writing slow code really save time? What if code becomes so sloppy that it slows down development, and the only way to clean it up is by optimizing it?
niconii
Posts: 219
Joined: Sun Mar 27, 2016 7:56 pm

Post by niconii »

It seems HuCards (the cartridge format for the PC Engine) generally ranged from 256 KB to 1 MB. Apparently there was only a single PC Engine game on HuCard larger than 1 MB, namely Street Fighter II' clocking in at 2.5 MB.

As for writing slow code, it depends. Of course there are things that, for instance, happen every frame, and those generally need to be well-optimized. However, sometimes you'll have code that you know won't need to be that fast. For instance, level decompression/loading doesn't have to be that fast, because there are ways to hide the loading time, such as behind screen transition effects. If you're smart about optimizing only the stuff that really has to be fast, it can certainly save development time.
93143
Posts: 1718
Joined: Fri Jul 04, 2014 9:31 pm

Post by 93143 »

MottZilla wrote:They are both related to the 6502. So you can compare them better than you can to the Genesis and its 68000.
Yeah, but what Espozo was talking about was people blindly comparing clock speeds. Going from 8-bit to 16-bit is way too big a difference for that sort of comparison to be anywhere near right. Not to mention that the instruction sets went in different directions from the 6502 baseline.
I imagine it is the way it is due to cost savings.
Probably.

It's also possible that some of this extra stuff would have fit, but they didn't think of it, or have time to implement it, or something like that. If the current technical information re: the Switch is accurate, my understanding is that Nintendo could have literally doubled CPU power, RAM bandwidth and maybe even GPU power without appreciably harming battery life, simply by going to 16 nm FinFET (less power consumption at a given clock speed) and customizing the chip more (A72/73 with 128-bit bus and 3 or 4 SMs). Then again, that could be cost savings too; 20 nm is supposedly more expensive than 16 nm FinFET, but that may not apply when the vendor is trying to dump an unwanted wafer contract... and of course adding SMs to the GPU would eat die space, even if the impact on battery life were minor at such low clocks... naturally the design process itself wouldn't have been free...

(That's the impression I got by lurking on NeoGAF, anyway. Anyone here know better? Or am I too far off topic?)
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Post by Oziphantom »

psycopathicteen wrote:Did SNES carts typically have more data in them than PCE carts?

Off topic, but does writing slow code really save time? What if code becomes so sloppy that it slows down development, and the only way to clean it up is by optimizing it?
Pretty much, in most cases LOADS of time. Let's take a really simple example.
Say I have 4 sprites and I want to set their x, y and image pointer. Let's imagine that they make a square box, like a dialog or something. The dumb way of doing this would be something like

Code:

lda #136
sta $d000
lda #120
sta $d001
lda #160
sta $d002
lda #120
sta $d003
lda #136
sta $d004
lda #141
sta $d005
lda #160
sta $d006
lda #141
sta $d007
lda #80
sta 2040
lda #81
sta 2041
lda #82
sta 2042
lda #83
sta 2043
Which is code you can write straight off the top of your head; you don't really need to think about it. However, it is really bad code, so now I have to sit and think about a couple of options. First, all of the addresses are in a line, so I could just use ,x, but that eats more CPU time. Do I need this code to be clock fast? Am I in a VBlank, for example? Maybe, since there are shared values, it would not take too much room to just unroll the loop with the values... Well, let's have a stab at the ,x method. Looking at the code, there are two groups, the D0XX and the 204X, so let's make a couple of loops. First let's make sure that X doesn't hold an important value I want to preserve, or find a better place where I don't care about X before I trash it.

Code:

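    ldx #7              ; 8 bytes: x,y pairs for sprites 0-3
-   lda SpriteXYs,x
    sta $d000,x
    dex
    bpl -
    ldx #3              ; 4 sprite image pointers
-   lda SpriteData,x
    sta 2040,x
    dex
    bpl -

; data tables, parked somewhere execution won't run into them
SpriteXYs .byte 136,120,160,120,136,141,160,141
SpriteData .byte 80,81,82,83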
Much nicer, less to type, but I had to scroll through the code to find a place to park the data. Still, two loops is a bit odd, right? I mean, it could be 3x4, so let's fix the code

Code:

    ldx #3
-   lda SpriteXYs,x
    sta $d000,x
    lda SpriteXYs+4,x
    sta $d004,x
    lda SpriteData,x
    sta 2040,x
    dex
    bpl -

SpriteXYs .byte 136,120,160,120,136,141,160,141
SpriteData .byte 80,81,82,83
Well, if I don't need Y, I can get rid of the SpriteData. Nah, for 4 it's not worth it; maybe if I was setting all 8 it would be...

This is all well and good, but we are in a production house and we use macros because they are awesome, and setting a sprite is something we do a lot, so we have a

Code:

setSprite .macro
  lda #\2
  sta $d000+(\1*2)
  lda #\3
  sta $d001+(\1*2)
  lda #\4
  sta 2040+\1
.endm
so the first code becomes

Code:

#setSprite(0,136,120,80)
#setSprite(1,160,120,81)
#setSprite(2,136,141,82)
#setSprite(3,160,141,83)
which was a lot faster to type than anything above. It also doesn't leave a giant wall of code for you to scroll past and go "hmmm, that is a lot of code, I should fix it"...

Now let's imagine for a second that the artists decided to put a small flourish in the bottom right corner of the dialogue box. The flourish goes above the sprite's top pixel, but as there was room left below the normal line, they put the flourish all in one sprite as it was easier for them to visualise it. So now the bottom right sprite needs to be moved up 4 pixels so everything lines back up again. Go through the examples above, make the modification you need to make in your head, and then tell me: in which one was it fastest for you to understand, see and modify what needed to be done?

Now imagine you moved the DB (data bank) register on the 65816, and then when something else fired it didn't handle having the DB moved on it, and it randomly crashed, and the lead programmer had to spend 2 days and $1000 in wages trying to hunt down the bug and fix it, for 20 clocks ;) But if you had used the slower absolute long addressing, it wouldn't have happened.
calima
Posts: 1745
Joined: Tue Oct 06, 2015 10:16 am

Post by calima »

93143 wrote:It's also possible that some of this extra stuff would have fit, but they didn't think of it, or have time to implement it, or something like that. If the current technical information re: the Switch is accurate, my understanding is that Nintendo could have literally doubled CPU power, RAM bandwidth and maybe even GPU power without appreciably harming battery life, simply by going to 16 nm FinFET (less power consumption at a given clock speed) and customizing the chip more (A72/73 with 128-bit bus and 3 or 4 SMs). Then again, that could be cost savings too; 20 nm is supposedly more expensive than 16 nm FinFET, but that may not apply when the vendor is trying to dump an unwanted wafer contract... and of course adding SMs to the GPU would eat die space, even if the impact on battery life were minor at such low clocks... naturally the design process itself wouldn't have been free...
Usually smaller processes are far more expensive, often by 1.5-2x. Customizing a chip is also expensive, up to several million per shot, and there will be multiple since mistakes happen.

Disclaimer: I lurk semiaccurate, which I consider a higher quality source than a gaming forum, but I'm no factory pro.
Pokun
Posts: 2681
Joined: Tue May 28, 2013 5:49 am
Location: Hokkaido, Japan

Post by Pokun »

Nicole wrote:It seems HuCards (the cartridge format for the PC Engine) generally ranged from 256 KB to 1 MB. Apparently there was only a single PC Engine game on HuCard larger than 1 MB, namely Street Fighter II' clocking in at 2.5 MB.
Yes, though most PC Engine games used the CD format, so they naturally had lots more data. Later games in the PC Engine's life especially were for CD.
93143
Posts: 1718
Joined: Fri Jul 04, 2014 9:31 pm

Post by 93143 »

calima wrote:Usually smaller processes are far more expensive, often by 1.5-2x.
I had acquired the impression that in this specific case, all else being equal, the newer process would be cheaper, either because 20 nm was just that bad or because nobody is using it any more.

But there were also rumours that Nintendo had gotten a ridiculously good deal from NVidia, and rumours on top of that suggesting that this may have been because NVidia was trying to dump their 20 nm commitments without paying contract termination penalties.

Of course, it may well be that none of this is true, and that there simply wasn't a lot of low-hanging fruit to be had...
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

Nicole wrote:If you're smart about optimizing the stuff that really has to be, it can certainly save development time.
That's what I've always figured, it just bugs me when people think you can't write code that is both optimized and maintainable under time constraints. I've seen programmers who thought this:

Code:

sep #$20
ror $01
ror $00
ror $01
ror $00
rep #$20
lda $00
was more maintainable than this:

Code:

rep #$20
lda $00
ror
ror
sta $00
simply because optimizations are "risky".
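
For what it's worth, the two snippets do compute the same value: rotating the high byte ($01) and then the low byte ($00) through carry is the same operation as one 16-bit rotate through carry. Here is a brute-force check in Python, as a rough sketch that models only the rotated word and the carry flag (not N/Z flags or cycle counts); all the function names are made up for illustration:

```python
def ror8(value, carry):
    """8-bit rotate right through carry; returns (result, new_carry)."""
    return ((carry << 7) | (value >> 1)) & 0xFF, value & 1


def ror16(value, carry):
    """16-bit rotate right through carry; returns (result, new_carry)."""
    return ((carry << 15) | (value >> 1)) & 0xFFFF, value & 1


def bytewise(word, carry):
    """Models the first snippet: ror $01 / ror $00, done twice in 8-bit mode."""
    lo, hi = word & 0xFF, word >> 8  # $00 holds the low byte, $01 the high byte
    for _ in range(2):
        hi, carry = ror8(hi, carry)
        lo, carry = ror8(lo, carry)
    return (hi << 8) | lo, carry


def wordwise(word, carry):
    """Models the second snippet: two 16-bit accumulator rotates."""
    for _ in range(2):
        word, carry = ror16(word, carry)
    return word, carry


def equivalent():
    """Compare both versions for every 16-bit input and both carry states."""
    return all(
        bytewise(w, c) == wordwise(w, c)
        for w in range(0x10000)
        for c in (0, 1)
    )


print(equivalent())
```

So the shorter version is not trading correctness for speed here, it just does in two instructions what the other does in four.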
MottZilla
Posts: 2837
Joined: Wed Dec 06, 2006 8:18 pm

Post by MottZilla »

psycopathicteen wrote:Did SNES carts typically have more data in them than PCE carts?

Off topic, but does writing slow code really save time? What if code becomes so sloppy that it slows down development, and the only way to clean it up is by optimizing it?
Because the SNES only used cartridges and outlasted the PC Engine, it has many more large ROMs. The PC Engine being out in 1987 also meant early games were limited by what ROM sizes were practical at the time. If the SNES had been out earlier, you would have seen more early games with smaller ROM sizes. And if the PC Engine's HuCard format had stayed popular until 1996 or '97 like the SNES, you would have seen more large games, and larger ones than were actually released.

Street Fighter II' was mid-1993 at 20 megabits. Parodius Da! in Feb 1992 was 8 megabits. Ghouls 'n Ghosts and 1941 for the SuperGrafx (which used the same HuCard format) are 8 megabits and were released in '90 and '91. So it seems the PC Engine could have kept up if needed. But as Pokun pointed out, there was a big shift toward the CD-ROM, Super CD-ROM, and later Arcade CD-ROM formats. The Arcade CD format allowed for 16 megabits of RAM in addition to the Super CD-ROM's 2 megabits. That's a whole lot of memory, both for storage and random access, considering it was available in March 1994. Plenty of SNES and Genesis games released in the same time period were smaller than that. I think Ninja Gaiden Trilogy, released on SNES in 1995, was only 12 megabits.
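
To line these megabit figures up against the byte sizes quoted earlier in the thread (2.5 MB for Street Fighter II', 256 KB to 1 MB for typical HuCards), the conversion is just 8 bits per byte, assuming the usual binary prefixes. A small Python sketch, with a helper name that is purely illustrative:

```python
def megabits_to_kib(megabits):
    """Convert a size in megabits to KiB, assuming 1 megabit = 1024 * 1024 bits."""
    return megabits * 1024 * 1024 // 8 // 1024


# sizes mentioned in the post above
figures = [
    ("Street Fighter II' HuCard", 20),
    ("Parodius Da! HuCard", 8),
    ("Arcade CD-ROM extra RAM", 16),
    ("Super CD-ROM RAM", 2),
]

for label, mbit in figures:
    kib = megabits_to_kib(mbit)
    print(f"{label}: {mbit} Mbit = {kib} KiB = {kib / 1024} MiB")
```

Under that assumption, 20 megabits is the same 2.5 MB figure given for Street Fighter II' earlier, and 8 megabits is the 1 MB upper end of the typical HuCard range.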