Look at this big list of homebrew games...
http://www.vintageisthenewold.com/the-v ... rum-games/
and I bet that's not close to all of them.
Is there a preferred emulator? I would like to play some of these.
I wish I had more motivation to go after European platforms; the ZX sounds like a cool platform to experiment on. Having to deal with monochromatic color clash is pretty unique.
Punch wrote: ZX sounds like a cool platform to experiment on.
If an NES or C64 veteran were to experiment with the ZX Spectrum for the first time, he would first have to spend time wrapping his head around the Z80.
In subroutines in 6502 games that move an object of a particular type, I've often needed random access to the fields of an object. Say field 5 of a particular object is its health.
- Read field 5 of the object with array index X in struct-of-arrays (6502, C64)
lda object_health,x: 4 cycles, 3.9 μs
- Read field 5 of the object pointed to by A0 in array-of-structs (68000, Amiga)
move.w 10(a0),d0: 12 cycles, 1.7 μs
- Read field 5 of the object pointed to by IX in array-of-structs (Z80, ZX Spectrum)
ld a,(ix+5): 19 cycles, 5.4 μs
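The microsecond figures in this list follow from dividing each cycle count by the CPU clock. Here is a minimal Python sketch of that arithmetic. The clock values are my assumption (nominal NTSC C64, PAL Amiga, and 48K Spectrum clocks); the post itself gives only the results.

```python
# Assumed nominal CPU clocks in MHz. These are not stated in the post.
CLOCK_MHZ = {
    "6502 (C64, NTSC)": 1.022727,
    "68000 (Amiga, PAL)": 7.09379,
    "Z80 (ZX Spectrum 48K)": 3.5,
}

# Cycle counts quoted in the list above.
CYCLES = {
    "6502 (C64, NTSC)": 4,        # lda object_health,x (no page crossing)
    "68000 (Amiga, PAL)": 12,     # move.w 10(a0),d0
    "Z80 (ZX Spectrum 48K)": 19,  # ld a,(ix+5)
}

def access_time_us(cpu):
    """Cycles divided by clock in MHz gives the time in microseconds."""
    return CYCLES[cpu] / CLOCK_MHZ[cpu]

for cpu in CYCLES:
    print(f"{cpu}: {access_time_us(cpu):.1f} us")
```

With these clocks, the Z80 read takes roughly three times as long as the 68000 one, even though the cycle counts differ by less than a factor of two.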
Some 8080-family coders suggest organizing data for a single-instruction, multiple-data (SIMD) approach that steps through two arrays at a time using DE and HL. For example, a program would update object positions in four passes: add horizontal acceleration to horizontal velocity of all objects, then add horizontal velocity to horizontal position of all objects, then add vertical acceleration to vertical velocity of all objects, then add vertical velocity to vertical position of all objects. Pixel shaders on modern graphics programs work similarly. But that doesn't work so well when a particular operation needs to be performed only on particular types of object or only in certain circumstances, which is often the case for enemy AI code that has to determine the accelerations for each object based on collision response, displacement to the player object, and other factors.
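To make the four-pass idea concrete, here is a rough Python model of it, not Z80 code. The array names (`acc_x`, `vel_x`, and so on) are hypothetical, and each value is treated as a single byte that wraps on overflow, which is an assumption on my part.

```python
def add_arrays(dst, src):
    """One pass: dst[i] += src[i] for every object, wrapping to 8 bits.

    On a Z80 this would be the loop that walks two arrays with DE and HL.
    """
    for i in range(len(dst)):
        dst[i] = (dst[i] + src[i]) & 0xFF

def update_positions(objs):
    """Four passes over struct-of-arrays data, in the order described above."""
    add_arrays(objs["vel_x"], objs["acc_x"])  # horizontal acceleration -> velocity
    add_arrays(objs["pos_x"], objs["vel_x"])  # horizontal velocity -> position
    add_arrays(objs["vel_y"], objs["acc_y"])  # vertical acceleration -> velocity
    add_arrays(objs["pos_y"], objs["vel_y"])  # vertical velocity -> position

# Hypothetical data for two objects.
objs = {
    "acc_x": [1, 2], "vel_x": [0, 255], "pos_x": [10, 20],
    "acc_y": [0, 0], "vel_y": [3, 4],   "pos_y": [5, 250],
}
update_positions(objs)
print(objs["pos_x"], objs["pos_y"])
```

Every pass does the same thing to every object, which is why this shape is awkward for per-type or conditional logic such as enemy AI.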
If I have asked before, and I am asking again, it's because when I did get an answer, it was one that I couldn't understand how to apply. Hence "wrapping my head around the Z80". Would basic questions like this be better received in a forum organized around a particular platform using an 8080 family CPU, such as Spectrum Computing (or, in the case of other platforms, SMS Power or gbdev.gg8.se)?
When comparing the C64 with Z80 systems like the Spectrum or CPC, I would say that the Z80 is a good bit faster: partly because it has more registers and requires fewer memory accesses, and partly because it has 16-bit load/store/add/inc/dec instructions. The drawback is that the Spectrum and CPC need to do sprite rendering in software.
Yeah, I've spotted that 2-dimensional array indexing trick via H and L in several CPC games. It's somewhat cool, but it's also a bit messy, and INC H isn't that much faster than using ADD HL for switching to the next array entry, so, in most cases I've "cleaned up" the code and removed that technique after disassembling the game.
Btw, I always believed that it was a 6502 technique that crept into the Z80 world when people ported C64 games to the CPC. I haven't checked whether the original C64 code used the same method, but it would make more sense there (since the 6502 can't easily do something like ADD HL for incrementing 16-bit array indices).
There are a few more cases where one can "see" that many Z80 games came from the C64, like cases where the C64 needed a handful of opcodes to do some operation, and the programmer blindly reproduced that code, although the Z80 could do the same thing using only a single opcode.
nocash wrote: LD A,(IX+5) isn't the best example. Those IX/IY instructions look nice at first glance, but they are so slow that one should usually avoid using them, or use them only in not so timing-critical higher functions, or, in rare cases, where they are faster than other opcode constructions. But in general, using LD A,(HL) is fastest. If you want to read from HL+1, HL+2, etc., just increment HL. Or increment L, which is a bit faster, as long as no 100h-byte page crossing is needed.
I mentioned specifically random access to an object's fields, not sequential access to its fields. I doubt that LD A,(IX+5) is slower than five INC L instructions, even if the structure is 32-byte aligned so that it does not cross pages.
Code:
; Pre- and post-condition: HL points to the start of the
; 32-byte-aligned structure representing one enemy's state.
  inc l        ; 4
  inc l        ; 4
  inc l        ; 4
  inc l        ; 4
  inc l        ; 4
  ld a,(hl)    ; 7
  dec l        ; 4
  dec l        ; 4
  dec l        ; 4
  dec l        ; 4
  dec l        ; 4
  ; Total: 47 cycles

; Pre- and post-condition: HL points to the most recently accessed
; field of the 32-byte-aligned structure representing one enemy's
; state. Which field happens to have been most recently accessed
; is not known to this fragment.
  ld a,0E0h    ; 7
  and a,l      ; 4: HA points to the start of the structure
  or a,5       ; 7: HA points to the desired field
  ld l,a       ; 4: HL points to the desired field
  ld a,(hl)    ; 7: A contains contents of desired field
  ; Total: 29 cycles

; Pre- and post-condition: Zilog CPU, and IX points to the start
; of the 32-byte-aligned structure representing one enemy's state.
  ld a,(ix+5)  ; 19
  ; Total: 19 cycles
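As a sanity check on the second fragment, here is a small Python model of the mask-and-OR address calculation, together with the cycle sums for all three fragments. It only mirrors the arithmetic and is not a Z80 emulator. The example address 0xC04B is hypothetical.

```python
BASE_MASK = 0xE0  # keeps the top 3 bits of L, clearing the 5-bit field offset

def field_address(hl, field):
    """Model of: ld a,0E0h / and a,l / or a,field / ld l,a."""
    assert 0 <= field < 32          # the field must fit inside a 32-byte structure
    h, l = hl >> 8, hl & 0xFF
    a = BASE_MASK & l               # low byte of the structure's start
    a |= field                      # low byte of the desired field
    return (h << 8) | a

# Per-instruction cycle counts copied from the comments in the fragments.
CYCLE_TOTALS = {
    "inc/dec walk": 5 * 4 + 7 + 5 * 4,
    "mask and or": 7 + 4 + 7 + 4 + 7,
    "ix indexed": 19,
}

# HL currently points at field 11 of the structure starting at 0xC040.
print(hex(field_address(0xC04B, 5)), CYCLE_TOTALS)
```

The mask approach works from any field of the structure, which is what makes it usable when the fragment does not know which field was accessed last.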
The Game Boy doesn't have a real Z80 (it doesn't have any IX/IY registers), so one would actually need to use HL in the above fashion.
On the CPC and Spectrum it might be best/easiest/fastest to use (IX+n) for the enemy logic in some places. But on those computers the main CPU load goes to software rendering, so the speed of the game logic doesn't matter too much, and it's more important to use fast opcodes in the rendering functions.
If you're accessing multiple fields of the same object in the same basic block, you can rely on the previous values of the low bits of L. This makes pointer arithmetic with XOR almost as fast as IX/IY indexing.
Code:
; Pre-condition: HL points to field 3 of an object
; Post-condition: HL points to field 9 of the same object
  ld a, 3 ^ 9  ; 7
  xor a,l      ; 4: HA points to the desired field
  ld l,a       ; 4: HL points to the desired field
  ld a,(hl)    ; 7: A contains contents of desired field
  ; Total: 22 cycles
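Why the XOR works: with an aligned structure, L is the base (low bits zero) OR'd with the current field number, so XORing with `current ^ wanted` cancels the current field and leaves the wanted one. Here is a short Python model of that step, assuming a 32-byte-aligned structure at the hypothetical address 0xC040.

```python
def xor_step(hl, current_field, wanted_field):
    """Model of: ld a,current^wanted / xor a,l / ld l,a."""
    h, l = hl >> 8, hl & 0xFF
    a = current_field ^ wanted_field  # assemble-time constant, e.g. 3 ^ 9
    a ^= l                            # cancels the current field bits in L
    return (h << 8) | a

base = 0xC040                         # hypothetical 32-byte-aligned structure
at_field_3 = base | 3
at_field_9 = xor_step(at_field_3, 3, 9)
print(hex(at_field_9))
```

The same constant moves the pointer back again, since XOR is its own inverse. It only works when the code knows which field HL currently points to.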
The game is a port of the same title for GameBoy.
Article: http://www.indieretronews.com/2017/12/n ... meboy.html
7 cycles for a directly specified element isn't bad if you cannot reach your goal with the next INC or DEC... it still beats the IX+x modes by several cycles (14 vs 19). You'll still have to put your stuff in 256-byte pages, but that's not a limitation; crossing a page would require an update to the top part, which can just be INC HL (6 cycles), similar to the page-crossing penalty on 65x.
TmEE wrote: For random accesses within the space HL has already set up, what's wrong with directly specifying values to L (or H for that matter) if the next element is further away than the nearest neighbor so single INC/DEC isn't enough?
If you have (say) 16 structures each 32 bytes in size, and structures are stored in consecutive addresses aligned to 32-byte boundaries, there are eight different L values that correspond to a particular H value. Or would you recommend interleaving several different unrelated arrays, as I mentioned earlier?
tepples wrote: Would I have to consider the RAM as an "atlas" analogous to a texture atlas, with RAM being a two-dimensional array divided into rectangular sub-arrays, each with object index as one dimension, field index as the other dimension, and a stride of 256 bytes?
In the 2D approach, you might have actor properties in L=$00-$1F of each page and other unrelated things in L=$20-$FF of the same pages. (This wouldn't work for ColecoVision and SG-1000 because only four pages exist.)
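One way to picture the 2D layout is as an address formula where H picks the actor and L picks the field. The Python sketch below assumes one actor per 256-byte page, with the actor table starting at a hypothetical page $C0. Neither the base page nor the function names come from the thread.

```python
BASE_PAGE = 0xC0        # hypothetical: H value of the first actor's page
FIELD_COLUMN = 0x00     # actor fields occupy L = $00-$1F of each page
FIELDS_PER_ACTOR = 32

def atlas_address(actor, field):
    """H selects the actor (one page each), L selects the field within it."""
    assert 0 <= field < FIELDS_PER_ACTOR
    h = BASE_PAGE + actor
    l = FIELD_COLUMN + field
    return (h << 8) | l

def next_actor(hl):
    """Model of INC H: same field, next actor, 256 bytes further on."""
    return (hl + 0x100) & 0xFFFF

print(hex(atlas_address(3, 5)), hex(next_actor(atlas_address(0, 5))))
```

In this model, picking a field is a single load into L and stepping to the next actor is a single INC H, while L=$20-$FF of each page stays free for unrelated data.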