PostPosted: Tue May 01, 2018 5:28 am 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20428
Location: NE Indiana, USA (NTSC)
If cycle count is your primary factor here, then ARMv4 as implemented in the ARM7TDMI is not RISC, because loads take 3 cycles and multiplies can take a while depending on the size of the operands. Nor would most modern CPUs qualify, as L1 and especially L2 cache misses incur a huge pile of wait states.


PostPosted: Tue May 01, 2018 9:13 am 

Joined: Tue Feb 07, 2017 2:03 am
Posts: 518
Espozo wrote:
I've always wondered but never asked: is the CISC/RISC distinction pretty much arbitrary? The problem is that there doesn't seem to be a real cutoff (or at least I'm not aware of one); people don't seem to agree on where the 65XX family falls.

I've got to say, seeing all these old, no-longer-used processor architectures is a bit depressing. :lol: People often like to say that the Saturn/PS1/N64 generation was the last one where the hardware was meaningfully different, and while the graphics are much more similar, I'd probably give that distinction to the Dreamcast/PS2/GameCube/Xbox; it's cool how every single system used a different processor architecture, and it kind of shows how different the landscape was back then, I think. ARM went up through the ranks on its own merit, but the one thing that's always bothered me is x86; we're still using the same architecture as a now 40-year-old processor, although that might not be too fair a comparison given how much has been added since. I don't know whether it's lasted over the years because it's truly a great design, or rather because of Intel's massive market share and the desire to keep backwards compatibility.


x86 is seen as a really bad architecture, the textbook example of what not to do (although 68K gets thrown around a bit as well), given its 8008->8080->8086->186->286->386... lineage. Even back then it was not seen as good; Intel was just able to keep the backwards compatibility that is the main selling point of an IBM compatible, and they were able to make the chips with the most bang. Those chips cost a lot, but business doesn't really care about cost the way we do: a $5000 computer is still cheaper than a person, and a computer that can do twice as much as a $3000 one is still a plus on paper. Intel was also able to deliver chips in small, predictable increments, with new epochs every few years, and one could mostly drop a new chip into an old machine to get more life/oomph out of it. Putting a P233 into my P133 computer will actually get me more performance; putting a 16 MHz 6502 into my Commodore 64 won't make an iota of difference.


PostPosted: Tue May 01, 2018 9:18 am 

Joined: Tue Feb 07, 2017 2:03 am
Posts: 518
I think the 65XX range is a CISC design: it's register-memory (R-M) rather than register-register (R-R). The Z80 is also CISC even though it is R-R, as it has microcode, and not having microcode is seen as a big part of being RISC. From memory there is a lot of debate over whether the 6502's decode ROM counts as microcode. The different instruction lengths mean the chip has logic to handle different-length fetches (1/2/3-byte instructions), while RISC has fixed-length fetches, which puts the instruction decode directly into the opcode.
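To make the fetch-length point concrete, here is a minimal Python sketch; the opcode/length table is a tiny illustrative subset I picked, not the full 6502 set:

Code:
# Variable-length (6502-style) fetch: the decoder must look up how long
# the instruction is before it even knows where the next one starts.
LENGTHS = {0xA9: 2, 0xAD: 3, 0xEA: 1}   # LDA #imm, LDA abs, NOP

def fetch_variable(mem, pc):
    size = LENGTHS[mem[pc]]             # extra decode logic per opcode
    return mem[pc:pc + size], pc + size

# Fixed-length (RISC-style) fetch: always one 32-bit word, no lookup.
def fetch_fixed(mem, pc):
    return mem[pc:pc + 4], pc + 4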


PostPosted: Tue May 01, 2018 10:20 am 

Joined: Sun Apr 13, 2008 11:12 am
Posts: 7400
Location: Seattle
RISC very approximately means:
  • no microcode / micro-ops (unlike x86; note that the NEC V20/V30 x86 clones had none)
  • register-register architecture
    ** What memory addressing modes are present are often very simple, rarely anything fancier than the 6502's ADDRESS,X
  • instructions are constant length. Any operation too big to fit into a single instruction gets broken into multiple real instructions.
    ** What do I mean by multiple "real" instructions? All PICs are constant-length (12/14/16/24 bit) and have two pipeline stages (fetch / execute). There are several 16-bit-instruction PIC ops that use the contents of the fetch stage as an additional parameter to the execute stage. The second op is explicitly encoded as a NOP that has lots of "don't care" bits in it. In contrast, MIPS has separate real instructions for "load the upper half" and "load the lower half" of a word.
  • instructions are, in the absence of cache stalls, constant duration. Multiply and divide instructions require so much logic that they frequently break this. A single-stage unsigned 32-bit multiplier uses drastically more transistors than the entirety of the 6502. A rolled-up one only requires an adder, a barrel shifter, and a bunch of AND gates, but then you need 32 cycles for it (see the sketch after this list).
  • all registers are equally capable (even if some are reserved for the stack pointer and/or program counter). SIMD instructions almost always break this; a 32-bit register is really rather too small for that, and even a 64-bit register is a little cramped (see SSE vs AVX).
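The sketch promised above: a rolled-up shift-and-add multiply in Python, where each loop iteration stands in for one hardware cycle (one AND-gated add plus a shift). Purely illustrative, not any particular chip's multiplier.

Code:
def mul32_rolled(a, b):
    # One adder, one barrel shifter, and AND gates: each of the 32
    # iterations is one hardware cycle, hence the 32-cycle cost.
    acc = 0
    for i in range(32):
        if (b >> i) & 1:                          # the AND gates: gate 'a' by bit i of 'b'
            acc = (acc + (a << i)) & 0xFFFFFFFF   # adder + shifter, 32-bit wrap
    return acc

assert mul32_rolled(123456, 7890) == (123456 * 7890) & 0xFFFFFFFF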

Accumulator-based architectures don't really fit into the RISC vs CISC paradigm.


PostPosted: Tue May 01, 2018 2:10 pm 

Joined: Wed May 19, 2010 6:12 pm
Posts: 2731
Oziphantom wrote:
x86 is seen as a really bad architecture, the textbook example of what not to do (although 68K gets thrown around a bit as well), given its 8008->8080->8086->186->286->386... lineage.


68000, really? I don't think it's a bad architecture; it's just that its performance is overrated because 68K nuts always go the extra mile optimizing the shit out of everything.


PostPosted: Tue May 01, 2018 4:18 pm 

Joined: Sun Apr 13, 2008 11:12 am
Posts: 7400
Location: Seattle
Natural evolution of ISAs is deemed 'ugly' because it wasn't designed to be The Right Thing In The First Place.

Entirely aside from the question of What The Right Thing Is, this is as absurd as complaining about many other failures to predict the future. It's the hardware version of perfect is the enemy of good [enough].

Backwards compatibility does cause suboptimal things and complexity. CISC is deemed worse than RISC because it (theoretically) "wastes" the transistors that aren't doing work all the time ... but with early RISC architectures, for example, that just meant they failed to identify the other constraints (such as memory bandwidth, cache size, and heat dissipation).


PostPosted: Tue May 01, 2018 11:12 pm 

Joined: Tue Feb 07, 2017 2:03 am
Posts: 518
psycopathicteen wrote:
Oziphantom wrote:
x86 is seen as a really bad architecture, the textbook example of what not to do (although 68K gets thrown around a bit as well), given its 8008->8080->8086->186->286->386... lineage.


68000, really? I don't think it's a bad architecture; it's just that its performance is overrated because 68K nuts always go the extra mile optimizing the shit out of everything.

I forget the actual number, but something like 60-80% of the die went into making the perfectly balanced, orthogonal instruction set. This made yields low and the number of chips per wafer low, and hence made the chip large, expensive, power hungry and hot. It's a dream to program for, and you can optimise till the cows come home, but when one looks at the actual amount of silicon doing work, it's not that good. To which the argument is: work out which instructions you actually need, then reduce the die to only do those things, and get cheaper and faster CPUs as a result. Having A7+ is nice, but do you really need it, or would A0+ and A1+ have done for the most part?
To be fair, these CPUs were designed in a different era with different problems. RAM was very expensive, so more complex instructions that saved RAM were a huge bonus. Assembly was hand-generated (i.e. assembled by hand) and there were no macros, so microcode is basically a hardware macro. This saved on programming time.


PostPosted: Wed May 02, 2018 7:58 am 

Joined: Wed May 19, 2010 6:12 pm
Posts: 2731
I wonder if the 68000 could've used a PLA instead of a ROM, had the developers had time to figure out the patterns. Say you had "add.w $10(d0.w, a2), d5"; it could've been broken down like this:

1) Fetch "$10(d0.w, a2)"
2) then do "add.w operand, d5"

The register selection bits would work independently of the opcode, as in the sketch below.
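A toy Python sketch of that split (speculative; the helper names are mine, not the real 68000 microarchitecture). Phase 1 resolves the effective address, phase 2 does the register operation, and the register numbers arrive as plain parameters, independent of the opcode:

Code:
def sign16(x):
    # Interpret the low word as signed, for the ".w" index register
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x

def ea_indexed(a_regs, d_regs, an, dn, disp8):
    # 1) Fetch "$10(d0.w, a2)": address register + sign-extended
    #    data-register index + 8-bit displacement, on a 24-bit bus
    return (a_regs[an] + sign16(d_regs[dn]) + disp8) & 0xFFFFFF

def add_w(d_regs, mem, addr, dest):
    # 2) "add.w operand, d5": the ALU step never cares how the
    #    effective address was formed
    d_regs[dest] = (d_regs[dest] + mem[addr]) & 0xFFFF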


PostPosted: Wed May 02, 2018 8:12 am 

Joined: Tue Feb 07, 2017 2:03 am
Posts: 518
Sure, you could probably make logic gates to handle the cases; it's just a lot harder to make changes and test that way, as each time you add an instruction or fix a thing, you need to change the logic structure and re-solve it. These days you just put the logic into VHDL and let the computers crunch it for a few hours; back in '79, not so much. A ROM, on the other hand, is pretty compact, since it's a nicely formatted grid: compare an 8K ROM to a PLS100N, and the 8K holds a lot more info for probably a similar die size. A ROM also has a fixed propagation delay, while the optimised gates would have different delays, so you would have to gate them, which then needs another clock...


PostPosted: Wed May 02, 2018 9:08 am 

Joined: Wed May 19, 2010 6:12 pm
Posts: 2731
I'm thinking there must be some patterns that the 68000 engineers took advantage of. The transistor count was estimated at 60-70k transistors, and since instruction words are 16 bits (with a few exceptions), there are about 65,536 possible encodings, which would leave just one transistor per instruction word if each were decoded separately. Maybe the register selection fields are ignored by the microcode ROM address generator, and maybe there are microcode instructions that specifically pull register fields from the instruction register.
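A hypothetical Python sketch of that masking idea (the field positions follow the common 68000 encoding, but the decoder behaviour here is pure speculation on my part, not the actual chip):

Code:
REG_FIELD    = 0b0000_1110_0000_0000   # bits 11-9: register number
EA_REG_FIELD = 0b0000_0000_0000_0111   # bits 2-0: EA register number

def microcode_address(iw):
    # Zero the register selects so all eight registers share one routine
    return iw & ~(REG_FIELD | EA_REG_FIELD)

def register_fields(iw):
    # Pulled straight from the instruction register when the routine asks
    return (iw >> 9) & 7, iw & 7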


PostPosted: Wed May 02, 2018 9:27 am 

Joined: Tue Feb 07, 2017 2:03 am
Posts: 518
One transistor per instruction? You clearly have no idea how a CPU works. I mean, if you want to OR all 16 bits you might be able to get away with one transistor, but that is a very, very rare case. The CPU has T-states, so it needs to take the instruction, decode what it needs, set up the T-state counter, and then expand that out into the hundred or so internal signals it needs to generate from the instruction word. Then it steps through the counter, activating the right internal control lines per T-state, and exits the T-state sequence at the right count.

For example, the 6502 feeds its 8-bit instruction code into a 21x130 decode ROM: it takes 21 bits in (instruction, inverted instruction, clock, T counter) and generates 130 control signals for parts of the CPU to "do the instruction".
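As a rough model of what such a decode array does, here's a minimal Python sketch, assuming a mask/match pair per row; this is a big simplification of the real 6502 PLA, and the example row is made up:

Code:
def decode(inputs, rows):
    # inputs: the 21 decode bits (opcode, inverted opcode, clock, T counter)
    # rows:   (mask, value, control_line) per row; a row "fires" when the
    #         masked inputs match, asserting its control line
    return [line for mask, value, line in rows if (inputs & mask) == value]

# Made-up example row: assert one control line for one particular
# combination of opcode bits and T-state.
ROWS = [(0b1_1111_0000_0000_0000_0000,
         0b1_0101_0000_0000_0000_0000,
         "ALU_B_LOAD")]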


PostPosted: Wed May 02, 2018 9:46 am 

Joined: Wed May 19, 2010 6:12 pm
Posts: 2731
I meant that having an individual microcode routine for every single variation of every instruction would've been impossible.


PostPosted: Thu May 03, 2018 2:58 am 

Joined: Mon Mar 30, 2015 10:14 am
Posts: 275
Quote:
RAM was very expensive, so more complex instructions that saved RAM were a huge bonus.

I don't think it was about saving RAM; 68k-generated code is not known for being compact.
In fact, a main feature of the 68k was that it could live with slow RAM, but that was only true at the start of its life; the argument quickly became void once other chips needed faster access to main RAM, like DMA or a video chip (if it has no dedicated RAM). The ST and Amiga use far faster RAM than the 68k needs.

The 68k was a good CPU at the time for workstations, as opposed to its competitors, which were faster but also really way more expensive.


PostPosted: Mon May 07, 2018 1:03 pm 

Joined: Wed May 19, 2010 6:12 pm
Posts: 2731
Was there any real reason why the 8086 and 68000 needed 4 cycles to access memory, other than claiming a higher MHz rating than competitors?


PostPosted: Mon May 07, 2018 9:56 pm 
Site Admin

Joined: Mon Sep 20, 2004 6:04 am
Posts: 3541
Location: Indianapolis
psycopathicteen wrote:
Was there any real reason why the 8086 and 68000 needed 4 cycles to access memory, other than claiming a higher MHz rating than competitors?


Someone who knows the 68K might have a better answer, but my guess is that it was limited by the speed of the most affordable memory of its time (which seems to be 1979), in proportion to what seemed practical for their CPU logic. About all you can do at that point is make the data bus wider.

