DMA Transfer - 160 microseconds?

Post by zerowalker »

I don't really understand why all the documentation says that DMA takes 160 microseconds,
and that the default wait loop waits about that long.

But if I am not mistaken, the loop takes 40*16 cycles: 4 for decrementing A and 12 for JR NZ.
And that's 640 cycles, which is not 160 microseconds, as that would be about 671 cycles, right?
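
For reference, the two figures can be reproduced with a quick sketch. It assumes the master clock is 4 * 1024 * 1024 Hz and uses the per-instruction costs quoted above; the names are made up for illustration.

```python
# Rough check of the two figures above. CLOCK_HZ is assumed to be the
# Game Boy master clock (4 * 1024 * 1024 Hz); names are illustrative only.
CLOCK_HZ = 4 * 1024 * 1024


def cycles_in(microseconds: float) -> float:
    """Clock cycles that elapse in the given number of microseconds."""
    return microseconds * CLOCK_HZ / 1_000_000


# DEC A (4 cycles) + JR NZ (12 cycles), repeated 40 times.
loop_cycles = 40 * (4 + 12)

print(loop_cycles)            # 640
print(round(cycles_in(160)))  # 671
```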

Am I missing something?

Post by adam_smasher »

In fact it's 39 * 16 + 36 = 660 cycles in HRAM, isn't it? Because the final JR isn't taken (- 4), but there's the LD A (+ 8) and a RET (+ 16). That still comes in under 160 microseconds though.
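
A small tally of that count, as a sketch only: it assumes the usual routine shape (LD A, 40 / DEC A / JR NZ / RET) and the cycle costs mentioned in this thread, with a not-taken JR costing 8 cycles. The constant and function names are hypothetical.

```python
# Tally of clock cycles for the wait routine as described in this post:
#   LD A, 40 / loop: DEC A / JR NZ, loop / RET
# Costs are the ones quoted in the thread; names are illustrative only.
LD_A_N = 8
DEC_A = 4
JR_TAKEN = 12
JR_NOT_TAKEN = 8
RET = 16
CLOCK_HZ = 4 * 1024 * 1024  # assumed master clock


def wait_routine_cycles(iterations: int = 40) -> int:
    """Clock cycles from the LD A through the RET (iterations >= 1)."""
    total = LD_A_N
    a = iterations
    while True:
        a -= 1
        total += DEC_A
        if a != 0:
            total += JR_TAKEN      # branch back to the DEC A
        else:
            total += JR_NOT_TAKEN  # fall through on the last pass
            break
    return total + RET


cycles = wait_routine_cycles()
print(cycles)                                   # 660
print(round(cycles * 1_000_000 / CLOCK_HZ, 1))  # 157.4
```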

So what you're saying makes sense - the Pan Docs seem to miscount the cycles for the JR as justification, but the official docs use the same code and give the correct cycle count, glossing over the fact that the routine wouldn't take 160 micros.

But at any rate, that code does seem to work, and it would fail hard if the delay wasn't long enough - the CPU would read garbage once you RET'd to ROM and the program would almost certainly crash. But I've used that code myself on real hardware and it's worked fine. Since it's the official Nintendo code, and since Pan & co. probably got it by disassembling commercial ROMs, I assume real games use it too and they obviously work.

So if I had to guess, the 160 micros number that Nintendo gave (and that Pan et al copied?) is imprecise - it probably takes 160 * 4 cycles rather than 160 micros.

Maybe looking through gambatte's source code would reveal more? I tried to myself but it got a little thorny.

Or if you want to be sure, you could write a test that lets you adjust the wait time and reduce it til the CPU crashes to find the correct cut-off.

Post by nitro2k01 »

You are confusing clock cycles and instruction cycles. The master clock runs at ~4 MHz (Actually 4 MiHz, or 4*1024*1024 Hz to be precise, but that's not really important for the argument.) However, instructions always take a multiple of 4 clock cycles, so instruction timing is often counted in machine cycles, where one nop is said to take 1 machine cycle and so on. 1 machine cycle takes ~1 us, whereas 1 clock cycle obviously takes ~0.25 us.

In short, you have to divide your 640 figure by 4, which indeed gives 160 us.

Post by tepples »

zerowalker wrote: And that's 640 cycles, which is not 160 microseconds, as that would be about 671 right?
nitro2k01 wrote: The master clock runs at ~4 MHz (Actually 4 MiHz, or 4*1024*1024 Hz to be precise, but that's not really important for the argument.)

I saw that as exactly the argument, as 640 * 1.024 * 1.024 = 671. So I'm inclined to believe adam_smasher's explanation of imprecision between the 4 MHz nominal and 4.194 MHz actual clock rates, where a "micro" is 10^9/2^20 = 954 ns.
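
Put as numbers, a rough sketch (the variable names are mine, and the clock value is the 4*1024*1024 Hz mentioned above):

```python
# How long one machine cycle really is, and what 160 of them add up to.
# Assumes a 4 * 1024 * 1024 Hz master clock and 4 clock cycles per
# machine cycle; names are illustrative only.
CLOCK_HZ = 4 * 1024 * 1024
CLOCKS_PER_MACHINE_CYCLE = 4

machine_cycle_ns = CLOCKS_PER_MACHINE_CYCLE * 1_000_000_000 / CLOCK_HZ
dma_real_us = 160 * machine_cycle_ns / 1000

print(round(machine_cycle_ns))            # 954
print(round(640 * 1.024 * 1.024))         # 671
print(round(dma_real_us, 1))              # 152.6
```

So 160 of these "micros" come out a little shorter than 160 real microseconds.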

Post by nitro2k01 »

Yes, I misread the post. And yes, Martin seems to round 1 instruction cycle to 1 us.

Post by zerowalker »

Wait this got a bit confusing hehe.

I always count in cycles (not instruction cycles), so 4 being the minimum.

So is 640 cycles correct then?

Cause that's how much mine takes until it's out of the wait loop.

It's really hard when the documentation is so fluffy at times; I usually read all over the place and try to figure out which one is correct xd.
I'm not a fan of checking other people's source code, as I would like to learn to write the code myself and how the system works, though I do at times.

Post by gekkio »

It's even more confusing when you consider the fact that the total duration is actually 161 machine cycles if you count from the DMA register write ;) (at least on DMG/SGB/MGB/SGB2).

When OAM DMA is started, there is a one machine cycle delay before the actual transfer starts. So, let's say you start OAM DMA with the LDH ($46), A instruction. These are the machine cycles that happen:

------------------------------------------------------------------
 -3: opcode read and decoding of LDH (n), a (= $E0 is read)
 -2: memory read of the DMA register address (= $46 is read)
 -1: memory write to the DMA register (= value of A is written)
------------------------------------------------------------------
  0: CPU continues to the next instruction. OAM DMA has not really started yet so the OAM area is still accessible during this one cycle.
  1: first cycle of OAM DMA. OAM area is inaccessible
  2: second cycle of OAM DMA. OAM area is inaccessible
...
...
160: 160th cycle of OAM DMA. OAM area is inaccessible
------------------------------------------------------------------
161: OAM DMA is no longer running so the OAM area is now accessible
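
The same timeline as a tiny sketch, in case it helps. It only models the DMG/SGB/MGB/SGB2 case described here and assumes no earlier DMA is still running; the function name is made up for illustration.

```python
# Sketch of the timeline above. Cycle numbers are machine cycles relative
# to the first OAM DMA cycle; names are illustrative only.
DMA_LENGTH = 160


def oam_accessible(m_cycle: int) -> bool:
    """True if the CPU can access OAM during this machine cycle.

    Cycles -3..-1 are the LDH ($46), A instruction, cycle 0 is the
    one-cycle startup delay, cycles 1..160 are the transfer itself.
    """
    return not (1 <= m_cycle <= DMA_LENGTH)


blocked = [c for c in range(-3, 162) if not oam_accessible(c)]
print(blocked[0], blocked[-1], len(blocked))    # 1 160 160
print(oam_accessible(0), oam_accessible(161))   # True True
```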

Post by zerowalker »

Is -3 to 0 one instruction?

Aren't the cycles bundled together in the entire instruction?

Post by adam_smasher »

Not sure if this is what you're asking, but -3 to -1 inclusive are LDH [$46], A.

They're not (usually) externally visible, but internal state changes happen during each constituent cycle of the instruction - the CPU isn't just sitting around waiting N cycles and then instantly executing the instruction.

Post by zerowalker »

Yeah, that's what I am asking.

And that part I get, but I was wondering about the timer and interrupts.
Those are updated on a per-instruction basis, right?

At least the effect they have?

Post by gekkio »

I just chose the numbers to reflect the relative machine cycle position compared to the "first OAM DMA cycle".
Here's a real hardware trace of a case almost identical to the one posted earlier:

[Image: screenshot of the hardware trace]

The CPU is running these instructions:

$0150: LD A, $40
$0152: LDH ($46), A
$0154: NOP
You can see the OAM DMA accessing $4000 and $4001 in the end...note the one cycle delay during which the CPU executes a NOP in this example.

(if you're curious about what the CPU executes after the NOP, the answer is that the last two machine cycles actually involve an OAM DMA conflict in this case, but that's a story for another day...)

Post by zerowalker »

So for the first instruction, the CPU actually reads (FF) in 1 cycle, then it reads the remaining in the other 3 cycles.
After that it does the FF thing again, then does the load for 3 cycles?

How do the TIMA and interrupts work in these cases, aren't those essentially pseudo-async to the CPU?
I mean the CPU must somehow check the data there, but if it does things internally like this, when does that check occur: every cycle, or after every complete instruction?

Post by gekkio »

zerowalker wrote: And that part i get, but was wondering about the Timer and Interruption. Those are updated on a per instruction basis right?

Timer (TIMA) works at M-cycle granularity (not at instruction granularity!).
The real interrupt sources in the system work at various granularities all the way down to half T-cycles. However, the CPU checks interrupts only at the start of instructions so the CPU never dispatches to an interrupt handler in the middle of an instruction.
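
A deliberately simplified model of that last point, as a sketch only. It assumes the interrupt is enabled, ignores HALT and the cost of the dispatch itself, and uses hypothetical instruction lengths and names.

```python
# Simplified model: an interrupt request can be raised on any machine
# cycle, but the CPU only looks at pending requests when it is about to
# start an instruction. Lengths and names here are hypothetical.
def dispatch_cycle(instruction_lengths, request_cycle):
    """Return the machine cycle at which the CPU notices the request.

    instruction_lengths: machine-cycle counts of consecutive instructions,
                         the first one starting at machine cycle 0.
    request_cycle:       machine cycle on which the request is raised.
    """
    start = 0
    for length in instruction_lengths:
        if request_cycle <= start:
            return start
        start += length
    return start  # noticed at the boundary after the last instruction


# A 4-cycle instruction starts at 0 and the request arrives at cycle 2:
# nothing happens until the instruction has finished.
print(dispatch_cycle([4, 1, 1], 2))  # 4
# With 1-cycle instructions the same request is seen right away.
print(dispatch_cycle([1, 1, 1], 2))  # 2
```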

Post by gekkio »

zerowalker wrote: So the first instruction, the CPU actually reads (FF) in 1 cycle, then it reads the remaning in the other 3 cycles. After that it does the FF thing again, then do the load for 3 cycles?

I think you're mixing T-cycles and M-cycles. In this screenshot CLK = T-cycles, PHI = M-cycles. None of the instructions here contain the byte $FF.
During M-cycle labeled -5 the CPU reads $3E. During M-cycle -4 the CPU reads $40.
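
Spelled out per M-cycle, as a sketch: this assumes LD A, $40 sits at $0150-$0151 (it is a two-byte instruction), so LDH ($46), A starts at $0152. The numbering matches the earlier timeline, and the helper name is made up.

```python
# CPU bus activity per M-cycle for the three instructions in the trace,
# numbered so that cycle 1 would be the first OAM DMA cycle.
# Opcode bytes: LD A,n = $3E, LDH (n),A = $E0, NOP = $00.
program = {
    0x0150: 0x3E, 0x0151: 0x40,  # LD A, $40
    0x0152: 0xE0, 0x0153: 0x46,  # LDH ($46), A
    0x0154: 0x00,                # NOP
}


def cpu_bus_trace():
    """List (m_cycle, kind, address, value) for M-cycles -5..0."""
    trace = []
    cycle = -5
    for addr in (0x0150, 0x0151, 0x0152, 0x0153):
        trace.append((cycle, "read", addr, program[addr]))
        cycle += 1
    a = program[0x0151]  # A now holds $40
    trace.append((cycle, "write", 0xFF00 + program[0x0153], a))
    cycle += 1
    trace.append((cycle, "read", 0x0154, program[0x0154]))
    return trace


for m, kind, addr, value in cpu_bus_trace():
    print(f"{m:>3}: {kind:<5} ${addr:04X} = ${value:02X}")
```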

Post by zerowalker »

gekkio wrote: Timer (TIMA) works at M-cycle granularity (not at instruction granularity!).

By that you mean it has its own clock, right?

So, say that in the next 8 cycles the TIMA will overflow and trigger an interrupt.

The CPU's next instruction happens to be 16 cycles.

So it would first check the state before doing the instruction.
Then do it (+16 cycles).

Then repeat, and this time an interrupt occurs; it then calls a jump to the RST (if enabled), which takes 20 cycles.
Then it does the stuff, then it gets back to the instruction that was after those 16 cycles before?

//EDIT:

My bad, I read the "FF" on the image, below 80/81.

But wait, T-cycles are the "real cycles", right?
And an M-cycle is 4 T-cycles, because everything is divisible by 4, right?