AWJ wrote:
Basically, the 8080 family is a bit more like a RISC CPU in that calculating effective addresses is a distinct operation rather than something that gets folded into an "addressing mode". You could even say that the fundamental insight of RISC was "hey, instead of having all these addressing modes, let's just go back to the 8080 but add more and wider registers, because as long as you don't spill registers, doing address calculations explicitly ends up just as fast as doing them implicitly".
Yet practical RISC architectures ended up having a 68000-style pointer + short displacement as an available addressing mode for practical struct field access, rather than requiring an explicit addition every time to seek to the element of an array holding the value for a particular actor (in structure-of-arrays) or to the field of an actor's struct (in array-of-structures). In architectures where the load/store stage of the pipeline sits after the ALU stage, such as classical MIPS, address generation like this is essentially free.
In MIPS, the most "by-the-book" RISC design, lw $rt, 4($rs) reads from address rs + 4. ARM is even more flexible:
- ldr r0, [r1, #4] reads from address r1 + 4
- ldr r0, [r1, #4]! (pre-increment) adds 4 to r1 and reads from the new address
- ldr r0, [r1], #4 (post-increment) reads from address r1, then adds 4 to r1
- ldr r0, [r1, r2, lsl #2] (register indexed scaled) reads from address r1 + (r2 << 2)
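To make the difference between these four forms concrete, here is a rough C model of what each one does to the base register. The Regs struct, the byte-array memory, and the function names are all made up for illustration (they are not from any ARM header), and it assumes little-endian word loads.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file: r1 holds a byte address, r2 holds an index. */
typedef struct {
    uint32_t r1;
    uint32_t r2;
} Regs;

/* Read a 32-bit little-endian word from a byte-addressed memory image. */
static uint32_t load_word(const uint8_t *mem, uint32_t addr) {
    return (uint32_t)mem[addr]
         | ((uint32_t)mem[addr + 1] << 8)
         | ((uint32_t)mem[addr + 2] << 16)
         | ((uint32_t)mem[addr + 3] << 24);
}

/* ldr r0, [r1, #imm]: read from r1 + imm, r1 unchanged. */
uint32_t ldr_offset(const uint8_t *mem, Regs *r, uint32_t imm) {
    return load_word(mem, r->r1 + imm);
}

/* ldr r0, [r1, #imm]!: add imm to r1 first, then read from the new r1. */
uint32_t ldr_pre(const uint8_t *mem, Regs *r, uint32_t imm) {
    r->r1 += imm;
    return load_word(mem, r->r1);
}

/* ldr r0, [r1], #imm: read from r1, then add imm to r1. */
uint32_t ldr_post(const uint8_t *mem, Regs *r, uint32_t imm) {
    uint32_t value = load_word(mem, r->r1);
    r->r1 += imm;
    return value;
}

/* ldr r0, [r1, r2, lsl #shift]: read from r1 + (r2 << shift), r1 unchanged. */
uint32_t ldr_scaled(const uint8_t *mem, Regs *r, unsigned shift) {
    assert(shift < 32);
    return load_word(mem, r->r1 + (r->r2 << shift));
}
```

In every case the addition happens as part of the load itself, which is the point: no separate add instruction and no scratch register is spent on forming the address.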
Quote:
Register pressure makes arrays-of-structures painful on the 8080.
And register pressure is one of the first things the RISC designs set out to relieve, by providing far more general-purpose registers.
Quote:
You do okay if you're only processing one array at a time, but if you've got two (even if only one is an array-of-structures) you need a register pair for one pointer, a register pair for the other pointer, a register pair to hold the strides, and a loop coun... uh oh, you're already out of registers.
The addition of the postincrement instructions on the GB only makes structures-of-arrays even more of a winner.
Say I store 16 bytes' worth of properties for each actor (player or active enemy) in a side-scrolling game across 16 parallel arrays. Position and velocity are already 10 bytes: X (screen, pixel, subpixel), X velocity (pixel/frame, subpixel/frame), Y (pixel, subpixel), Y velocity (pixel/frame, subpixel/frame), and facing direction. On top of that are actor class, state/animation frame, state transition time, and a couple more bytes related to loading the frame's tiles into VRAM. Would I then need to add the actor ID (in one register) to the base of each array to form HL for every single access?
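For reference, this is the address arithmetic I mean, sketched in C rather than assembly. The actor limit of 16, the example base address, and the function names are assumptions for illustration only; the 16 fields per actor come from the layout described above.

```c
#include <assert.h>
#include <stdint.h>

enum {
    NUM_FIELDS = 16,  /* bytes of properties per actor, as described above */
    MAX_ACTORS = 16   /* assumed actor limit, not stated in the post */
};

/* Structure-of-arrays: field arrays are laid end to end, each MAX_ACTORS
   bytes long. base + field * MAX_ACTORS is a constant once the field is
   known, so the only run-time addition is the actor ID. */
uint16_t soa_address(uint16_t base, uint8_t field, uint8_t actor) {
    assert(field < NUM_FIELDS && actor < MAX_ACTORS);
    return (uint16_t)(base + field * MAX_ACTORS + actor);
}

/* Array-of-structures: each actor owns NUM_FIELDS consecutive bytes, so the
   actor ID has to be scaled by the stride before the field offset is added. */
uint16_t aos_address(uint16_t base, uint8_t field, uint8_t actor) {
    assert(field < NUM_FIELDS && actor < MAX_ACTORS);
    return (uint16_t)(base + actor * NUM_FIELDS + field);
}
```

With a hypothetical base of 0xC000, field 3 of actor 5 lands at 0xC035 in the structure-of-arrays layout and at 0xC053 in the array-of-structures layout. Either way, on a CPU without an indexed addressing mode this sum has to end up in HL before each access.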