The bus traces I've been quoting are here:
http://baltazarstudios.com/zilog-z80-un ... -behavior/
byuu wrote:
LD (I,A;A,I;A,R;R,A)
The manual states the T-cycles for these are T(4,5). I really would have expected this one to be T(4,4).
Is there really an extra cycle here?
Yes, there really is an extra T-state for all I/R moves. The pipeline stall probably has nothing to do with flags; it's more likely because R has to be incremented on every M1 cycle (opcode fetch), so the next opcode fetch can't start until the register move is complete (unlike normal register-register moves, which can be overlapped with the next opcode fetch).
The reason moves to/from I incur the stall (and not just moves to/from R) is that internally IR is a single 16-bit register, with I in the upper half and R in the lower half.
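To make the register layout concrete, here's a rough Python model of IR as one 16-bit value, with R's low 7 bits bumped on every M1 cycle (bit 7 of R is left alone by the refresh increment). The class and method names are made up for illustration, and this only models the layout, not the actual timing or the stall itself:

```python
class IRPair:
    """Hypothetical model: I is the upper byte and R the lower byte of one 16-bit register."""

    def __init__(self, ir=0x0000):
        self.ir = ir & 0xFFFF

    @property
    def i(self):
        return self.ir >> 8

    @property
    def r(self):
        return self.ir & 0xFF

    def m1_refresh(self):
        # On each M1 (opcode fetch) the low 7 bits of R increment; bit 7 is preserved.
        r = (self.r & 0x80) | ((self.r + 1) & 0x7F)
        self.ir = (self.ir & 0xFF00) | r


pair = IRPair(0x127F)
pair.m1_refresh()
print(hex(pair.ir))  # 0x1200: R wrapped from 0x7F to 0x00, I untouched
```

Since both halves live in the same register, anything that reads or writes either half has to contend with that per-M1 update, which fits the explanation above.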
Quote:
DAA
These algorithms are always brutally difficult to get correct. I remember VBA using a lookup table for the Game Boy DAA instruction, and still getting the wrong results! (the table itself had bad values in it.)
Wasn't able to use my LR35902 implementation (known to be correct per blargg), because it's missing a lot of the flag values and said CPU doesn't have the N flag that affects the computation.
DAA is different between the LR35902 and the Z80 even apart from the Z80's additional flags. One difference (not the only one) is that on the LR35902 the upper and lower nybbles of A only affect the result of an adjust-after-add (NF == 0), but on the Z80 they affect both adjust-after-add and adjust-after-subtract.
In fact, I believe the problem with the table VBA used was that it was correct for the Z80 rather than the LR35902.
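For comparison, here's a sketch in Python of the LR35902 DAA as it's commonly documented. Treat it as an illustration rather than a verified reference; the function name and the return tuple layout are my own. Note how the subtract path looks only at the carry and half-carry flags and never at the nybbles of A:

```python
def lr35902_daa(a, cf, hf, nf):
    """Sketch of the commonly documented Game Boy DAA. Returns (A, CF, HF, ZF)."""
    if not nf:
        # Adjust after add: the flags and the value of A both matter.
        if cf or a > 0x99:
            a = (a + 0x60) & 0xFF
            cf = True
        if hf or (a & 0x0F) > 9:
            a = (a + 0x06) & 0xFF
    else:
        # Adjust after subtract: only the flags matter.
        if cf:
            a = (a - 0x60) & 0xFF
        if hf:
            a = (a - 0x06) & 0xFF
    # HF is always cleared on the LR35902; NF is left as it was.
    return a, cf, False, a == 0


print([hex(v) if isinstance(v, int) and not isinstance(v, bool) else v
       for v in lr35902_daa(0x9A, False, False, True)])
# ['0x9a', False, False, False]: A is left alone after a subtract with no flags set
```

On the Z80, the same inputs (A = 0x9A, NF set, CF and HF clear) do get adjusted, because A > 0x99 and the lower nybble is > 9.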
Here's my algorithm for Z80 DAA, which should be equivalent to the one you quoted, and a fair bit simpler and easier to understand:
(Edit: Simplified by doing the adjusts directly on A, taking advantage of the fact that the upper nybble adjust doesn't affect the test for the lower nybble adjust. Note that the converse is not true; the lower nybble adjust would mess up the upper nybble test if you did it first, so don't change the order of the first two lines!)
Code:
uint oldA = A; // save the previous value of A to calculate HF with
if (CF || (A > 0x99)) { A += (NF ? -0x60 : 0x60); CF = 1; } // if carry set or A > BCD 99, adjust upper nybble and set carry
if (HF || (A.bits(0,3) > 9)) { A += (NF ? -6 : 6); } // if half-carry set or lower nybble > 9, adjust lower nybble
HF = (A ^ oldA).bit(4); // half-carry is set if bit 4 changed, otherwise cleared
// the rest of the flags are set the usual way for an ALU operation (except that PF is parity, rather than overflow like you'd expect...)
// note that unlike the LR35902, NF is preserved
PF = parity(A);
XF = A.bit(3);
YF = A.bit(5);
ZF = A == 0;
SF = A.bit(7);
Here's a side-by-side tester I whipped up in Python, omitting flag calculations that depend solely on the resulting value of A. It passes, but someone should double-check my translation of byuu's algorithm (and mine) from nalled C++ to Python:
Code:
#!/usr/bin/python3

def mydaa(A, CF, HF, NF):
    oldA = A
    if CF or A > 0x99:
        A += -0x60 if NF else 0x60
        CF = True
    if HF or A & 0xf > 9:
        A += -6 if NF else 6
    HF = bool((A ^ oldA) & 0x10)
    return A, CF, HF

def byuudaa(A, CF, HF, NF):
    lo, hi = A & 0xf, A >> 4
    if CF:
        diff = 0x66 if HF or lo > 9 else 0x60
    elif lo >= 10:
        diff = 0x66 if hi > 8 else 0x06
    elif hi >= 10:
        diff = 0x66 if HF else 0x60
    else:
        diff = 0x06 if HF else 0
    A = A - diff if NF else A + diff
    CF = CF or (hi >= (10 if lo <= 9 else 9))
    HF = (HF and lo <= 5) if NF else (lo >= 10)
    return A, CF, HF

myresult = [(A, CF, HF, NF, mydaa(A, CF, HF, NF))
            for A in range(256)
            for CF in (False, True)
            for HF in (False, True)
            for NF in (False, True)]
byuuresult = [(A, CF, HF, NF, byuudaa(A, CF, HF, NF))
              for A in range(256)
              for CF in (False, True)
              for HF in (False, True)
              for NF in (False, True)]

passed = True
for mine, byuu in zip(myresult, byuuresult):
    if mine != byuu:
        print("mine: " + repr(mine))
        print("byuu: " + repr(byuu))
        passed = False
if passed:
    print("Passed")
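To see the adjust working on actual BCD operands, here's a small worked example. It's a sketch: z80_daa is the algorithm from this post with A masked to 8 bits, and bcd_add / bcd_sub are hypothetical helpers that fake the CF and HF an 8-bit ADD or SUB would produce before DAA runs:

```python
def z80_daa(a, cf, hf, nf):
    """The DAA from this post, with A wrapped to 8 bits. Returns (A, CF, HF)."""
    old = a
    if cf or a > 0x99:
        a = (a + (-0x60 if nf else 0x60)) & 0xFF
        cf = True
    if hf or (a & 0x0F) > 9:
        a = (a + (-6 if nf else 6)) & 0xFF
    hf = bool((a ^ old) & 0x10)
    return a, cf, hf


def bcd_add(x, y):
    # Emulate ADD A,y: carry out of bit 7 and half-carry out of bit 3.
    raw = x + y
    cf = raw > 0xFF
    hf = ((x & 0x0F) + (y & 0x0F)) > 0x0F
    return z80_daa(raw & 0xFF, cf, hf, False)


def bcd_sub(x, y):
    # Emulate SUB y: borrow into bit 7 and half-borrow into bit 3.
    raw = x - y
    cf = raw < 0
    hf = (x & 0x0F) < (y & 0x0F)
    return z80_daa(raw & 0xFF, cf, hf, True)


print(hex(bcd_add(0x15, 0x27)[0]))  # 0x42  (15 + 27 = 42)
print(hex(bcd_sub(0x42, 0x15)[0]))  # 0x27  (42 - 15 = 27)
print(bcd_add(0x99, 0x01))          # (0, True, True): 99 + 1 = 00 with carry out
```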
Quote:
RLD / RRD
It was easy enough to see what these were doing via CZ80, but ... what the hell are these useful for? >_>
Just like the LR35902 SWAP instruction, they're useful for working with nybble-sized data (e.g. BCD numbers).
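As a concrete illustration, here's a Python sketch of the nybble rotation each instruction performs between the low nybble of A and the byte at (HL), plus a hypothetical helper (bcd_shift_left, my own name) that uses RLD to shift a little-endian packed BCD number left by one digit. Flag effects are left out:

```python
def rld(a, m):
    """RLD: (HL) low <- A low, (HL) high <- (HL) low, A low <- (HL) high."""
    new_m = ((m << 4) | (a & 0x0F)) & 0xFF
    new_a = (a & 0xF0) | (m >> 4)
    return new_a, new_m


def rrd(a, m):
    """RRD: (HL) high <- A low, (HL) low <- (HL) high, A low <- (HL) low."""
    new_m = ((a & 0x0F) << 4) | (m >> 4)
    new_a = (a & 0xF0) | (m & 0x0F)
    return new_a, new_m


def bcd_shift_left(packed):
    """Shift packed BCD (least significant byte first) left one digit, i.e. multiply by 10."""
    a = 0
    out = []
    for m in packed:
        a, m = rld(a, m)  # the digit pushed out of this byte rides along in A
        out.append(m)
    return out, a & 0x0F  # the last digit pushed out is left in A


shifted, spilled = bcd_shift_left([0x34, 0x12])  # 1234
print([hex(b) for b in shifted], spilled)        # ['0x40', '0x23'] 1  -> 2340, with the 1 shifted out
```

One RLD per byte with A carrying the digit between bytes is the whole loop, which is much shorter than doing the same thing with shifts and masks.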