All times are UTC - 7 hours



Posted: Sun May 01, 2016 1:14 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2420
http://atariage.com/forums/topic/197977 ... s-vs-tg16/

I know this is beating a dead horse, but come on. More self-proclaimed experts complaining about the SNES's CPU who know diddly squat about it. These are the kinds of people who make me feel like I'm only doing SNES programming to win a stupid debate. If these people really did have performance issues with the SNES, it's because they purposefully avoided making any optimizations: actually trying to get good performance out of the SNES would go against their argument. It just really makes me cringe reading this.


Posted: Sun May 01, 2016 1:24 pm

Joined: Fri May 08, 2015 7:17 pm
Posts: 1865
Location: DIGDUG
The SNES CPU was a bit slow, comparatively: 1/3 the speed of the Genesis, and only twice as fast as the NES, I think (not really an apples-to-apples comparison).

_________________
nesdoug.com -- blog/tutorial on programming for the NES


Posted: Sun May 01, 2016 1:31 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2420
The Genesis's CPU is only 1.9 MHz because it divides the clock by 4.


Last edited by psycopathicteen on Sun May 01, 2016 1:43 pm, edited 1 time in total.

Posted: Sun May 01, 2016 1:35 pm

Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7312
Location: Chexbres, VD, Switzerland
Well, everything written in that thread is completely false, and what else can I say? This was written 4 years ago; what can you do about it today? What's the point of even bringing it here? We're in no way responsible for these people, Arkhan in particular, inventing completely false things about the SNES CPU being slower than the NES's.

I'll also add that the general level of those forums seems remarkably low.


Posted: Sun May 01, 2016 1:55 pm

Joined: Tue Apr 05, 2016 5:25 pm
Posts: 159
Gradius 3 was a launch title running on SlowROM and a home port of an arcade title that already had plenty of slowdown, so it's a pretty unfair example to use. Hell, it's actually quite performant compared to, say, Metal Slug 2. I've been doing a lot of planning on my own game and I think I could definitely come up with something on the level of Treasure's games. The CPU gives me enough cycles to work with that I think it is certainly attainable.

I'll be honest though, I'm also getting tired of comparisons between systems regarding which is "more powerful". A lot of such discussion is largely unproductive and borderline manipulative and often devolves into comparisons with the Neo Geo (which I find frankly bizarre). But I also feel like I have to prove a point, so I end up feeling torn. I try not to think about it too much given the level of discourse.

The SNES is a... weird machine, but that's half of why I want to make a game for it. After I'm done I'd like to write up a big technical postmortem about it. Discussion about system quirks and such is fascinating as long as it isn't about competition.

_________________
SNES NTSC 2/1/3 1CHIP | serial number UN318588627


Posted: Sun May 01, 2016 2:06 pm

Joined: Mon Jul 01, 2013 11:25 am
Posts: 228
Well, I think this has already been discussed quite a bit...
I'm the first to admit you just can't compare the SNES 65816 to the MD 68000 on clock speed, as these CPUs use completely different architectures. The SNES CPU runs synchronously with RAM and so can access it on every cycle, while the 68000's external frequency is only 1/4 of its internal speed. Given this information, you have:
- SNES CPU internal / external speed = ~3.1 MHz with fast ROM and 2.68 MHz with slow ROM.
- MD 68000 internal speed = 7.67 MHz / external speed = 1.92 MHz
And the external speed is *very* important, as it determines how much data you can exchange and, more or less, how fast you can fetch instructions...
So even if the MD 68000 has a faster internal speed, the SNES 65816 does have a faster memory cycle / external speed.
But then you have to consider that the 65816 only uses an 8-bit data bus while the 68000 uses a 16-bit bus, so despite its slow 1.92 MHz external clock you can still do faster transfers with the 68000. But if you're doing many 8-bit operations, the 16-bit bus can be wasteful...
Given all these observations you may think these CPUs end up close in performance: the 65816 better for 8-bit operations, while the 68000 does a better job for >= 16-bit operations. Well, in fact that's partly true, but not exactly.
The *huge* advantage of the 68000 is its far more advanced architecture. The 65816 is a very simple CPU in comparison; after all, it's just a 6502 extended to 16 bits. Even the Z80 looks quite advanced compared to the 6502.
The 68000 has a 16/32-bit architecture and supports multiplication, division and other advanced instructions (such as dynamic shifts) that the 65816 doesn't have. More importantly, it has 8+8 32-bit registers and a very efficient instruction set compared to the 65816.
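
As a rough sanity check on the bus figures above, peak transfer rate is just external clock times bus width. This is a minimal sketch in Python using only the numbers quoted in this post; the function name is made up for illustration, and it ignores instruction fetches, DMA, refresh and wait states, so treat the results as upper bounds rather than measurements.

```python
def peak_bus_bandwidth(external_mhz, bus_bits):
    """Upper bound in millions of bytes per second, assuming one bus access per external cycle."""
    return external_mhz * bus_bits / 8


# Figures quoted in the post above (not measured here).
snes_fast = peak_bus_bandwidth(3.1, 8)    # 65816, fast ROM
snes_slow = peak_bus_bandwidth(2.68, 8)   # 65816, slow ROM
md_68000 = peak_bus_bandwidth(1.92, 16)   # 68000, 16-bit bus

print(f"65816 fast ROM: {snes_fast:.2f} MB/s")
print(f"65816 slow ROM: {snes_slow:.2f} MB/s")
print(f"68000:          {md_68000:.2f} MB/s")
print(f"68000 / 65816 (fast ROM): {md_68000 / snes_fast:.2f}x")
```

On these figures the wider bus gives the 68000 only a modest edge in raw transfer rate, so any larger gap would have to come from the instruction set and register file rather than from the bus alone.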

Still, of course you can optimize your code on the 65816 and get decent results... but you can do the same on the 68000; optimizations work on any CPU. And if the subject is what you can do with each CPU, you can definitely do a lot more with the 68000. I roughly estimate the 7.67 MHz 68000 to be almost twice as fast as the 3.1 MHz 65816 (so with fast ROM).

A good example can be found in the topic where we discussed LZ4 unpacking. LZ4 compression is well suited to 8-bit CPUs, and there is indeed a very fast unpacking implementation for the 65816. Keeping the classic 8-bit format, I was able to obtain an implementation only about 15-20% faster on the 68000, but as soon as I modified the compression algorithm to take advantage of 16 bits, I obtained unpacking code that was 120% faster. The thing is that the further you go in the optimization process, the more you will tend to use 16/32-bit instructions to process more data at once and keep as much data as possible in registers to reduce memory accesses, and in that case it will be really faster than the 65816.


Posted: Sun May 01, 2016 2:56 pm

Joined: Thu Aug 12, 2010 3:43 am
Posts: 1589
psycopathicteen wrote:
The Genesis's CPU is only 1.9 MHz because it divides the clock by 4.

More like 1 MIPS, as you'll be doing a lot of 8-cycle operations... but the speed is completely useless without also figuring out how many instructions you actually need (that's the thing about the 68000: it's notoriously slow but generally allows getting stuff done in fewer instructions, which isn't much of a gain for simple things but can be pretty important for complex stuff).
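
The "1 MIPS" figure is easy to reproduce: divide the clock by the cycles an instruction takes. A minimal sketch in Python, assuming the 7.67 MHz clock quoted earlier in the thread and the 8-cycle figure from this post; the 4-cycle case is the shortest 68000 instruction timing and is included only as an upper bound.

```python
CLOCK_MHZ = 7.67  # Mega Drive 68000 clock, as quoted earlier in this thread


def mips(clock_mhz, cycles_per_instruction):
    """Millions of instructions per second for a fixed cycles-per-instruction cost."""
    return clock_mhz / cycles_per_instruction


typical = mips(CLOCK_MHZ, 8)    # "a lot of 8 cycle operations"
best_case = mips(CLOCK_MHZ, 4)  # shortest 68000 instructions take 4 cycles

print(f"8-cycle mix: {typical:.2f} MIPS")
print(f"4-cycle mix: {best_case:.2f} MIPS")
```

As the post says, this number means little by itself; what matters is the product of instruction count and cycles per instruction for the routine you actually need.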


Posted: Sun May 01, 2016 4:36 pm

Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3152
Location: Nacogdoches, Texas
Stef wrote:
I roughly estimate the 7.67 MHz 68000 to be almost twice as fast as the 3.1 MHz 65816 (so with fast ROM).

I mean, I guess I'm not one to talk as I don't know enough about the 68000, but that seems a little exaggerated? At least in the case of an actual game from the time period, as opposed to software 3D rendering, which makes use of 32-bit operations and multiplication/division. (Even the video hardware on the SNES is less suited for 3D due to the graphics format.)

And yes, I did read your reasoning. To me, it seems like both processors are even, taking these into account:

Stef wrote:
- SNES CPU internal / external speed = ~3.1 MHz with fast ROM and 2.68 MHz with slow ROM.
- MD 68000 internal speed = 7.67 MHz / external speed = 1.92 MHz

Stef wrote:
The *huge* advantage of the 68000 is its far more advanced architecture. The 65816 is a very simple CPU in comparison; after all, it's just a 6502 extended to 16 bits. Even the Z80 looks quite advanced compared to the 6502.

Even now...

Stef wrote:
But then you have to consider that the 65816 only uses an 8-bit data bus while the 68000 uses a 16-bit bus, so despite its slow 1.92 MHz external clock you can still do faster transfers with the 68000. But if you're doing many 8-bit operations, the 16-bit bus can be wasteful...

Now uneven. But as I said, it's not always needed, so in the case where you're only doing 32-bit moves, it's definitely faster. However, you could also make a program that only does rts's :lol:. I honestly just want to know why you feel the 68000 is 2x as fast as a 65816 at half the clock frequency, as you've said you've worked with both (and I've seen you code for the 68000).

HihiDanni wrote:
compared to, say, Metal Slug 2.
HihiDanni wrote:
often devolves into comparisons with the Neo Geo (which I find frankly bizarre).

Hmm... :lol:

Actually though, the performance of Gradius 3 is probably better than that of Metal Slug 2. Metal Slug 2 already runs at 30fps (despite not having 2x the action) and slows down with as little as two enemies onscreen, and not even Gradius 3 does that. In effect, it's a 20fps game (although really a 15 due to a bug that deals with slowdown).

I'm not entirely sure why Gradius 3 is always the one to blame for terrible slowdown. If I recall correctly, Super R-Type is actually a little worse in that it starts to slow down with less action onscreen, although one thing I've never previously thought of is that there are more collisions to be checked. (Options in Gradius don't stop enemy bullets, while the force and bits in R-Type do.)


Posted: Sun May 01, 2016 5:28 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Re: Gradius 3: I've already discussed this before. It's not about the CPU. I'll use the Apple IIGS as a reference point: it runs at 2.8MHz, with a 1MHz bus (needed for classic Apple II/II+/IIC/IIE compatibility), and has virtually none of the graphical capability of the SNES (i.e. almost everything graphical has to be done CPU-side -- there is no "PPU" in a sense). The things you can accomplish with a SNES at stock 2.68MHz are mindblowing in comparison; use of 3.58MHz (for high-speed) isn't magically going to decrease CPU cycle times. So really, the SNES is a pretty amazing system.

TL;DR -- My opinion/view mirrors that of Bregalad. The AtariAge forums have a remarkably low signal-to-noise ratio, and are plagued with misinformation to boot. The times I've seen good/established information posted there I can count on one hand. I've always classified it as more of a "fan of system X/Y/Z" place, not a place of any technical merit.


Last edited by koitsu on Sun May 01, 2016 5:33 pm, edited 1 time in total.

Posted: Sun May 01, 2016 5:32 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2420
Even if the SNES is literally half the speed of the Genesis, if it's easy to get 80 sprites on the Genesis, then it should be pretty easy to get 40 sprites on the SNES (and it is), but for some reason tons of programmers have problems moving more than 4 or 5 sprites. It's like there's a book or something teaching people to program the SNES in a very discreet method that limits the programmer to 4 or 5 sprites.

BTW, why do people mention the fact that "some areas of memory slow the CPU down to 1.79 MHz," as if it is anything significant? The only things at 1.79 MHz are the joypad registers, which only get read once per frame. How is this any more significant than the Genesis's 68000 getting cycles stolen by the Z80 fetching stuff from ROM?


Posted: Sun May 01, 2016 6:46 pm

Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3152
Location: Nacogdoches, Texas
Yeah, I agree with koitsu in that I'd say overall, video hardware matters much more than the CPU in the case of old 2D hardware where there's no software rendering being done. (Or at least in something that's not a tech demo, which is a different story.) I just feel like a very powerful CPU for 2D hardware is more about bragging rights than anything, or it makes it good for higher-level programming languages. Well, I mean, there's a point you want to be at, and then everything past it is purely overkill. I'd say the question is whether the 65816 at least gets to that point, but I'd imagine it does, considering that there are a couple (as in 2) shooters on the SNES that display a bunch of enemies, so I'd say it's at that point, even if just barely. Of course, not every game genre even demands that much CPU power.

Of course, I haven't even finished programming Pong, so who am I to say anything. :lol:


Posted: Sun May 01, 2016 7:22 pm

Joined: Tue Apr 05, 2016 5:25 pm
Posts: 159
A few more thoughts I have:

I think arcade spec is kinda overrated because of how easy it is to make a powerful arcade system. Cost is largely a non-factor since you're just selling to arcade operators and not home users. Designing affordable hardware for home use is arguably far more challenging. If you want to make something people can buy, you're going to have to make some compromises. Making the most out of a $200 budget is far more interesting IMHO.

As far as slowdown in shmups, I'd imagine that the biggest factor here is the player's bullets. When you're testing collisions between A number of bullets and B number of enemies, the brute force method involves A * B comparisons which can add up quickly (16 bullets * 8 enemies = 128 comparisons, sheesh), far quicker than testing multiple enemy bullets against a single player-controlled object. There are ways to speed up collision handling, a topic that is still relevant today in the realm of physics engines - you can eliminate possibilities to reduce the number of tests, and you'll get a performance boost as long as the incurred overhead isn't greater than the savings. My current idea involves putting player bullets into a spatial list (two, actually), so that enemies only need to check the bullets within a given subregion. A full grid would likely be too slow so I'm going to have two 1D lists, each along one axis - one for horizontal/diagonal bullets and one for vertical bullets.
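
To make the comparison count concrete, here is a minimal sketch in Python of brute force versus a single 1D spatial list along the x axis. It is only one reading of the idea above, not a finished design: the strip width `BUCKET`, the function names, and the choice to treat bullets as points are all assumptions made for illustration, and coordinates are assumed to be non-negative integers.

```python
from collections import defaultdict

BUCKET = 32  # hypothetical strip width in pixels


def overlaps(bullet, enemy):
    """bullet = (x, y) point; enemy = (left, top, right, bottom) box."""
    x, y = bullet
    left, top, right, bottom = enemy
    return left <= x < right and top <= y < bottom


def brute_force(bullets, enemies):
    """Test every bullet against every enemy: len(bullets) * len(enemies) tests."""
    hits, tests = set(), 0
    for ei, enemy in enumerate(enemies):
        for bi, bullet in enumerate(bullets):
            tests += 1
            if overlaps(bullet, enemy):
                hits.add((bi, ei))
    return hits, tests


def bucketed(bullets, enemies):
    """Sort bullets into vertical strips, then test each enemy only against nearby strips."""
    strips = defaultdict(list)
    for bi, (x, _) in enumerate(bullets):
        strips[x // BUCKET].append(bi)
    hits, tests = set(), 0
    for ei, enemy in enumerate(enemies):
        left, _, right, _ = enemy
        for s in range(left // BUCKET, (right - 1) // BUCKET + 1):
            for bi in strips.get(s, ()):
                tests += 1
                if overlaps(bullets[bi], enemy):
                    hits.add((bi, ei))
    return hits, tests


# 16 bullets and 8 enemies, matching the example count in the post.
bullets = [(i * 16, 100) for i in range(16)]
enemies = [(j * 32, 90, j * 32 + 24, 110) for j in range(8)]

brute_hits, brute_tests = brute_force(bullets, enemies)
bucket_hits, bucket_tests = bucketed(bullets, enemies)
print(brute_tests, bucket_tests, brute_hits == bucket_hits)
```

With this evenly spread layout the bucketed version finds the same hits with far fewer tests, but filling the strips is itself overhead, so the saving only holds while that cost stays below the tests it avoids.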

I have to wonder though, just how many shmups take the brute force approach to doing collision tests?

_________________
SNES NTSC 2/1/3 1CHIP | serial number UN318588627


Posted: Sun May 01, 2016 9:01 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2420
Here is some code for collision detection. It is for sprites with center-based coordinates, so it is slower than it would be with a corner-based coordinate system. It takes approximately 100 cycles, which should allow about 600 collision tests in the worst-case scenario.

Code:
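object_collision:
// Returns A = 1 if the current object overlaps the object at index X, else A = 0.
// Trailing numbers: cycles for the instruction, then the running total.

// temp = sum of both objects' {width} values
lda {width}            //4
clc                    //2 6
adc.w {width},x        //6 12
sta {temp}             //5 17

// A = x distance between centers; the x ranges overlap if -temp <= A < temp
lda {x_position}       //4 21
sec                    //2 23
sbc.w {x_position},x   //6 29
cmp {temp}             //5 34
bcc +                  //2 36   0 <= A < temp: overlap, go test y
clc                    //2 38
adc {temp}             //5 43
bcc no_collision       //2 45   no carry: A was not in -temp..-1
+;

// same test on the y axis, using {height}
lda {height}           //4 49
clc                    //2 51
adc.w {height},x       //6 57
sta {temp}             //5 62

lda {y_position}       //4 66
sec                    //2 68
sbc.w {y_position},x   //6 74
cmp {temp}             //5 79
bcc +                  //2 81
clc                    //2 83
adc {temp}             //5 88
bcc no_collision       //2 90
+;
lda #$0001             //3 93   both axes overlap
rts                    //6 99


no_collision:
lda #$0000
rts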


Posted: Sun May 01, 2016 11:31 pm

Joined: Thu Aug 12, 2010 3:43 am
Posts: 1589
HihiDanni wrote:
I have to wonder though, just how many shmups take the brute force approach to doing collision tests?

Probably many, but even then there are some simple optimizations. Obvious one: keep bullets in their own list so collisions can be done quickly against just bullets rather than every other object. Not so obvious one: treat bullets as 1px large, since checking for a point in a box is faster than checking overlap between two boxes (this will work when bullets are small enough and if not then you can compensate by just making the box larger).
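
The compensation trick can be checked with a short Python sketch. It assumes axis-aligned boxes stored as (left, top, right, bottom) and a bullet described by its center and half-sizes; all names here are made up for illustration. The point it demonstrates is that testing the bullet's center against an enemy box grown by the bullet's half-size gives the same answer as a full box-versus-box test.

```python
def boxes_overlap(a, b):
    """Full overlap test between two (left, top, right, bottom) boxes."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def point_in_box(px, py, box):
    """Point test against a (left, top, right, bottom) box."""
    return box[0] < px < box[2] and box[1] < py < box[3]


def inflate(box, half_w, half_h):
    """Grow a box by the bullet's half-size so the bullet can be treated as a point."""
    return (box[0] - half_w, box[1] - half_h, box[2] + half_w, box[3] + half_h)


enemy = (40, 40, 72, 64)
half_w, half_h = 2, 2                      # hypothetical 4x4 bullet
grown = inflate(enemy, half_w, half_h)     # can be computed once per enemy

# Compare both tests for every bullet center in a small region.
mismatches = 0
for px in range(20, 100):
    for py in range(20, 90):
        bullet_box = (px - half_w, py - half_h, px + half_w, py + half_h)
        if boxes_overlap(bullet_box, enemy) != point_in_box(px, py, grown):
            mismatches += 1
print(mismatches)
```

The grown box only has to be computed once per enemy rather than once per bullet/enemy pair, which is presumably where the saving comes from on real hardware.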


Posted: Sun May 01, 2016 11:37 pm

Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3152
Location: Nacogdoches, Texas
Sik wrote:
Probably many

I hadn't even considered any other way... :lol:

Sik wrote:
Not so obvious one: treat bullets as 1px large

I've always thought this was one of the more obvious optimizations. It's funny, because the R-Type games actually do this backward, in that the ship's hitbox is 1 pixel large and everything else is regular and often slightly larger than the visual representation of the objects. Frankly though, using bullets makes more sense.

