
All times are UTC - 7 hours





Posted: Wed Aug 30, 2017 5:48 am

Joined: Tue May 17, 2016 10:15 pm
Posts: 52
I don't really understand why all the documentation says that OAM DMA takes 160 microseconds,
and that the default wait loop waits about that long.

But if I am not mistaken, it's doing 40*16 cycles, 4 for decrementing A and 12 for JR NZ.
And that's 640 cycles, which is not 160 microseconds, as that would be about 671 cycles, right?

Am I missing something?


Posted: Wed Aug 30, 2017 8:56 am

Joined: Sun Mar 27, 2011 10:49 am
Posts: 192
In fact it's 39 * 16 + 36 = 660 cycles in HRAM, isn't it? Because the final JR isn't taken (- 4), but there's the LD A (+ 8) and a RET (+ 16). That still comes in under 160 microseconds though.
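
To double-check that tally, here is a minimal sketch in Python that adds up the T-cycles of the wait routine as described in this thread (LD A to set the counter, 40 rounds of DEC A / JR NZ, then RET). The per-instruction costs are the ones quoted by the posters; the constant and function names are made up for illustration.

```python
# T-cycle costs as quoted in the thread (4 T-cycles = 1 machine cycle).
LD_A_IMM = 8          # LD A, n
DEC_A = 4             # DEC A
JR_NZ_TAKEN = 12      # JR NZ when the branch is taken
JR_NZ_NOT_TAKEN = 8   # JR NZ when it falls through on the last iteration
RET = 16              # RET

CLOCK_HZ = 4 * 1024 * 1024  # 4194304 Hz master clock


def wait_routine_t_cycles(iterations: int = 40) -> int:
    """T-cycles from the LD A that sets the counter through the final RET."""
    taken = (iterations - 1) * (DEC_A + JR_NZ_TAKEN)
    last = DEC_A + JR_NZ_NOT_TAKEN
    return LD_A_IMM + taken + last + RET


def t_cycles_to_us(t_cycles: int) -> float:
    """Convert T-cycles to microseconds at the real master clock rate."""
    return t_cycles * 1_000_000 / CLOCK_HZ


total = wait_routine_t_cycles()
print(total, "T-cycles =", total // 4, "machine cycles")
print(f"{t_cycles_to_us(total):.1f} us")
```

Under these assumptions the routine comes out at 660 T-cycles, which is a little over 157 microseconds of wall-clock time.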

So what you're saying makes sense - the pandocs seem to miscount the cycles for the JR as justification, but the official docs use the same code and give the correct cycle count, glossing over the fact that the routine wouldn't take 160 microseconds.

But at any rate, that code does seem to work, and it would fail hard if the delay wasn't long enough - the CPU would read garbage once you RET'd to ROM and the program would almost certainly crash. But I've used that code myself on real hardware and it's worked fine. Since it's the official Nintendo code, and since Pan & co. probably got it by disassembling commercial ROMs, I assume real games use it too and they obviously work.

So if I had to guess, the 160 microseconds number that Nintendo gave (and that Pan et al copied?) is imprecise - it probably takes 160 * 4 = 640 cycles rather than 160 microseconds.

Maybe looking through gambatte's source code would reveal more? I tried to myself but it got a little thorny.

Or if you want to be sure, you could write a test that lets you adjust the wait time and reduce it until the CPU crashes to find the correct cut-off.


Posted: Wed Aug 30, 2017 9:21 am

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
You are confusing clock cycles and instruction cycles. The master clock runs at ~4 MHz (Actually 4 MiHz, or 4*1024*1024 Hz to be precise, but that's not really important for the argument.) However, instructions always take a multiple of 4 clock cycles, so instruction timing is often counted in machine cycles, where one nop is said to take 1 machine cycle and so on. 1 machine cycle takes ~1 us, whereas 1 clock cycle obviously takes ~0.25 us.

In short, you have to divide your 640 figure by 4, which gives 160 machine cycles, or roughly 160 us.

_________________
Gameboy Genius (Blog) - Gameboy development forum (+wiki and file area)


Posted: Wed Aug 30, 2017 9:26 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
nitro2k01 wrote:
zerowalker wrote:
And that's 640 cycles, which is not 160 microseconds, as that would be about 671 cycles, right?

The master clock runs at ~4 MHz (Actually 4 MiHz, or 4*1024*1024 Hz to be precise, but that's not really important for the argument.)

I saw that as exactly the argument, as 640 * 1.024 * 1.024 = 671. So I'm inclined to believe adam_smasher's explanation of imprecision between the 4 MHz nominal and 4.194 MHz actual clock rates, where a "micro" is 10^9/2^20 = 954 ns.
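
The two figures in that argument can be reproduced with a few lines of Python. This is only a sketch of the arithmetic above; the names `ACTUAL_HZ`, `NOMINAL_HZ`, `cycles_in` and `m_cycle_ns` are invented for the example.

```python
ACTUAL_HZ = 4 * 1024 * 1024   # 4194304 Hz, the real master clock
NOMINAL_HZ = 4_000_000        # the rounded "4 MHz" figure


def cycles_in(microseconds: float, clock_hz: int) -> float:
    """Number of clock cycles that fit into the given time span."""
    return microseconds * clock_hz / 1e6


def m_cycle_ns(clock_hz: int) -> float:
    """Duration of one machine cycle (4 clock cycles) in nanoseconds."""
    return 4 * 1e9 / clock_hz


print(cycles_in(160, NOMINAL_HZ))   # cycles in 160 us at a nominal 4 MHz
print(cycles_in(160, ACTUAL_HZ))    # cycles in 160 us at the real clock
print(m_cycle_ns(ACTUAL_HZ))        # length of one machine cycle in ns
```

At the nominal rate 160 microseconds is exactly 640 cycles, while at the real rate it is about 671, and one machine cycle lasts about 954 ns rather than a full microsecond.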


Posted: Wed Aug 30, 2017 9:27 am

Joined: Sat Aug 28, 2010 9:01 am
Posts: 190
Yes, I misread the post. And yes, Martin seems to round 1 instruction cycle to 1 us.



Posted: Wed Aug 30, 2017 9:33 am

Joined: Tue May 17, 2016 10:15 pm
Posts: 52
Wait, this got a bit confusing hehe.

I always count in cycles (not instruction cycles), so 4 is the minimum.

So is 640 cycles correct then?

Because that's how many mine takes until it's out of the wait loop.

It's really hard when the documentation is so fluffy at times. I usually read all over the place and try to figure out which one is correct xd.
I'm not a fan of checking other people's source code, as I would like to learn to write the code myself and learn how the system works, though I do at times.


Posted: Wed Aug 30, 2017 10:06 am

Joined: Fri Oct 16, 2015 6:18 am
Posts: 39
It's even more confusing when you consider the fact that the total duration is actually 161 machine cycles if you count from the DMA register write ;) (at least on DMG/SGB/MGB/SGB2).

When OAM DMA is started, there is a delay of one machine cycle before the actual transfer starts. So, let's say you start OAM DMA with the LDH ($46), A instruction. These are the machine cycles that happen:

Code:
------------------------------------------------------------------
 -3: opcode read and decoding of LDH (n), a (= $E0 is read)
 -2: memory read of the DMA register address (= $46 is read)
 -1: memory write to the DMA register (= value of A is written)
------------------------------------------------------------------
  0: CPU continues to the next instruction. OAM DMA has not really started yet so the OAM area is still accessible during this one cycle.
  1: first cycle of OAM DMA. OAM area is inaccessible
  2: second cycle of OAM DMA. OAM area is inaccessible
...
...
160: 160th cycle of OAM DMA. OAM area is inaccessible
------------------------------------------------------------------
161: OAM DMA is no longer running so the OAM area is now accessible
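
The timeline above can be written down as a tiny model. This is a sketch only, using the cycle numbering of the trace (0 is the first machine cycle after the write to $FF46) and assuming no other DMA is in flight; the names are hypothetical, and per the post it applies to DMG/SGB/MGB/SGB2.

```python
DMA_START_DELAY = 1   # machine cycles between the register write and the first transfer cycle
DMA_LENGTH = 160      # machine cycles during which the transfer runs


def oam_accessible(m_cycle: int) -> bool:
    """True if the CPU can access OAM at the given machine cycle.

    m_cycle follows the trace numbering: 0 is the first cycle after
    the write to the DMA register.
    """
    first_busy = DMA_START_DELAY
    last_busy = DMA_START_DELAY + DMA_LENGTH - 1
    return not (first_busy <= m_cycle <= last_busy)


def cycles_until_accessible() -> int:
    """Machine cycles from the register write until OAM is usable again."""
    return DMA_START_DELAY + DMA_LENGTH


for cycle in (0, 1, 160, 161):
    print(cycle, oam_accessible(cycle))
print(cycles_until_accessible())
```

The one-cycle start delay is what turns the 160-cycle transfer into a 161-cycle total when counted from the register write.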


Posted: Wed Aug 30, 2017 11:18 am

Joined: Tue May 17, 2016 10:15 pm
Posts: 52
Is -3 to 0 one instruction?

Aren't the cycles bundled together for the entire instruction?


Posted: Wed Aug 30, 2017 11:31 am

Joined: Sun Mar 27, 2011 10:49 am
Posts: 192
Not sure if this is what you're asking, but -3 to -1 inclusive are LDH [$46], A.

They're not (usually) externally visible, but internal state changes happen during each constituent cycle of the instruction - the CPU isn't just sitting around waiting N cycles and then instantly executing the instruction.


Posted: Wed Aug 30, 2017 11:53 am

Joined: Tue May 17, 2016 10:15 pm
Posts: 52
Yeah, that's what I am asking.

And that part I get, but I was wondering about the timer and interrupts.
Those are updated on a per-instruction basis, right?

At least the effects they have?


Posted: Wed Aug 30, 2017 12:27 pm

Joined: Fri Oct 16, 2015 6:18 am
Posts: 39
I just chose the numbers to reflect the relative machine cycle position compared to the "first OAM DMA cycle".
Here's a real hardware trace of a case almost identical to the one posted earlier:

[Image: real hardware trace]

The CPU is running these instructions:

Code:
$0150: LD A, $40
$0152: LDH ($46), A
$0154: NOP


You can see the OAM DMA accessing $4000 and $4001 in the end...note the one cycle delay during which the CPU executes a NOP in this example.

(if you're curious about what the CPU executes after the NOP, the answer is that the last two machine cycles actually involve an OAM DMA conflict in this case, but that's a story for another day...)


Posted: Wed Aug 30, 2017 12:43 pm

Joined: Tue May 17, 2016 10:15 pm
Posts: 52
So for the first instruction, the CPU actually reads (FF) in 1 cycle, then it reads the remaining in the other 3 cycles.
After that it does the FF thing again, then does the load for 3 cycles?

How do TIMA and interrupts work in these cases? Aren't those essentially pseudo-async to the CPU?
I mean, the CPU must somehow check the data there, but if it does things internally like this, when does that check occur: every cycle, or after every complete instruction?


Posted: Wed Aug 30, 2017 12:44 pm

Joined: Fri Oct 16, 2015 6:18 am
Posts: 39
Quote:
And that part I get, but I was wondering about the timer and interrupts.
Those are updated on a per-instruction basis, right?


Timer (TIMA) works at M-cycle granularity (not at instruction granularity!).
The real interrupt sources in the system work at various granularities all the way down to half T-cycles. However, the CPU checks interrupts only at the start of instructions so the CPU never dispatches to an interrupt handler in the middle of an instruction.
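
A rough way to picture this: the interrupt request can be raised at any point, but the CPU only notices it at the next instruction boundary. The Python sketch below models just that one rule. It assumes interrupts are enabled, ignores the cost of the dispatch itself, and simplifies the exact edge timing; `dispatch_start` and its parameters are hypothetical names.

```python
from typing import List, Optional


def dispatch_start(request_at: int, instruction_lengths: List[int]) -> Optional[int]:
    """Return the T-cycle at which interrupt dispatch would begin.

    request_at is the T-cycle at which the interrupt is requested.
    instruction_lengths lists the T-cycle cost of each upcoming instruction.
    The request is only checked at the start of an instruction, never in
    the middle of one. Returns None if the request comes after the last
    instruction boundary covered by the list.
    """
    now = 0
    for length in instruction_lengths:
        if request_at <= now:      # check happens only at an instruction boundary
            return now
        now += length              # the whole instruction runs to completion
    return now if request_at <= now else None


# A request raised 8 T-cycles into a 16-cycle instruction is seen at cycle 16.
print(dispatch_start(8, [16, 4]))
```

The timer itself keeps counting every machine cycle while the instruction runs; only the CPU's reaction to the resulting interrupt waits for the boundary.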


Posted: Wed Aug 30, 2017 12:48 pm

Joined: Fri Oct 16, 2015 6:18 am
Posts: 39
zerowalker wrote:
So for the first instruction, the CPU actually reads (FF) in 1 cycle, then it reads the remaining in the other 3 cycles.
After that it does the FF thing again, then does the load for 3 cycles?


I think you're mixing T-cycles and M-cycles. In this screenshot CLK = T-cycles, PHI = M-cycles. None of the instructions here contain the byte $FF.
During M-cycle labeled -5 the CPU reads $3E. During M-cycle -4 the CPU reads $40.


Posted: Wed Aug 30, 2017 12:50 pm

Joined: Tue May 17, 2016 10:15 pm
Posts: 52
Quote:
Timer (TIMA) works at M-cycle granularity (not at instruction granularity!).


By that you mean it has its own clock, right?

So, say that in the next 8 cycles TIMA will overflow and trigger an interrupt.

The CPU's next instruction happens to take 16 cycles.

So it would first check the state before doing the instruction.
Then do it (+16 cycles).

Then repeat, and this time an interrupt occurs, so it then jumps to the RST vector (if enabled), which takes 20 cycles.
Then it does the stuff, and then it gets back to the instruction that came after that 16-cycle one?

//EDIT:

My bad, I read the "FF" on the image, below 80/81.

But wait, T-cycles are the "real cycles", right?
And an M-cycle is 4 T-cycles, because everything is divisible by 4, right?

