Top and bottom halves of 6502 clock cycle

tepples
Posts: 21751
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Sun Sep 17, 2006 3:26 pm

In this post, Fx3 wondered why a CPU tester was displaying +1 cycle or -1 cycle.
Fx3 wrote: Too bad here, unfortunately. I call "CPU clock" one PPU access, rendering 3 pixels. My CPU core is simple regarding the instruction set, of how each opcode is emulated. For some obscure reason, your test ROM is giving +1 cycle error for opcodes $01 and $04 (right now). Opcode $04 is odd... if I take out 1 PPU access, it displays -1 cycle; else, +1 cycle. Go figure... :( Any help?
Every 6502 cycle has a top half (clock = HI) and a bottom half (clock = LO). A read or write may take effect during only one of the two halves.
Last edited by tepples on Tue Sep 19, 2006 7:02 am, edited 1 time in total.

Zepper
Formerly Fx3
Posts: 3190
Joined: Fri Nov 12, 2004 4:59 pm
Location: Brazil

Post by Zepper » Mon Sep 18, 2006 9:24 am

What do you mean? Could you give me an example of this?

kyuusaku
Posts: 1665
Joined: Mon Sep 27, 2004 2:13 pm

Post by kyuusaku » Mon Sep 18, 2006 11:43 am

Something can happen at the rising edge of the clock and/or at the falling edge of a clock. For a D flip-flop triggered by a rising edge, you can invert the clock to trigger it on the falling edge.

Post by Zepper » Mon Sep 18, 2006 3:17 pm

No.

I mean in emulation terms. If an instruction takes 1 cycle to read the opcode, 1 cycle to read the argument (next byte, address), 1 cycle to read from address XXh and 1 cycle to do the operation... what am I missing here? How should I think about the rising/falling edges of CPU cycles? Looks like that weird MMC3 IRQ clocking! :(

mattmatteh
Posts: 345
Joined: Fri Jul 29, 2005 3:40 pm
Location: near chicago

Post by mattmatteh » Mon Sep 18, 2006 4:48 pm

The read or write doesn't happen for the whole cycle. Only one half will have the read or write asserted, and changes only happen on the rising or falling edge.

matt

Post by Zepper » Mon Sep 18, 2006 4:51 pm

No++.
I need an example. I don't understand the meaning of rising/falling edge of CPU cycles.

Post by tepples » Mon Sep 18, 2006 5:02 pm

The CPU clock signal is roughly a square wave:

Code:

 high clock state                   rising edge
      vvvvv                              v
      _____       _____       _____       _____
     |     |     |     |     |     |     |     |
_____|     |_____|     |_____|     |_____|     |_ ...

            ^^^^^                  ^
       low clock state        falling edge

Each cycle consists of a rising edge, a high state, a falling edge, and a low state. Different things can be defined to happen on the rising or falling edge of the clock signal. For example, when performing a memory read, a CPU may put an address on the address bus on a rising edge and then read the data bus on the next falling edge.

Now if you have two processors running at different speeds:

Code:

CPU
      _____       _____       _____       _____
     |     |     |     |     |     |     |     |
_____|     |_____|     |_____|     |_____|     |_ ...

PPU
  _   _   _   _   _   _   _   _   _   _   _   _ 
 | | | | | | | | | | | | | | | | | | | | | | | |
_| |_| |_| |_| |_| |_| |_| |_| |_| |_| |_| |_| |_ ...
Then the rising and falling edges of a single CPU cycle may occur a PPU cycle or two apart, which may cause a read or write to appear to be delayed by one or two PPU cycles (a fraction of a CPU cycle).
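
To put rough numbers on that, here is a minimal Python sketch. It assumes the NTSC dividers (CPU = master / 12, PPU = master / 4) and, purely for illustration, a 50% duty cycle, so the falling edge comes 6 master clocks after the rising edge; real hardware may differ. The function name and the `phase` and `high_len` parameters are made up for this example.

```python
MASTER_PER_CPU = 12  # assumed NTSC divider: CPU clock = master / 12
MASTER_PER_PPU = 4   # assumed NTSC divider: PPU clock = master / 4

def ppu_dots_seen(cpu_cycle, phase=0, high_len=6):
    """Return (PPU clocks at rising edge, PPU clocks at falling edge).

    phase    -- hypothetical CPU/PPU alignment in master clocks (0..3)
    high_len -- assumed length of the high half in master clocks (50% duty)
    """
    rise = cpu_cycle * MASTER_PER_CPU + phase   # master-clock time of the rising edge
    fall = rise + high_len                      # master-clock time of the falling edge
    # Whole PPU clocks elapsed since time 0 at each edge
    return rise // MASTER_PER_PPU, fall // MASTER_PER_PPU

for phase in range(4):
    rise_dot, fall_dot = ppu_dots_seen(0, phase)
    print(phase, rise_dot, fall_dot, fall_dot - rise_dot)
```

Under these assumptions the PPU advances by one or two clocks between the two edges of the same CPU cycle, depending on the alignment.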

Post by Zepper » Mon Sep 18, 2006 5:48 pm

Awesome. ^_^;;
By the way, in terms of emulating a certain instruction (like LDA $00), how would a rising/falling edge be detected?

Post by tepples » Tue Sep 19, 2006 6:03 pm

LDA $00 has six edges, more or less like the following:
  • a rise and fall for reading opcode LDA,
  • a rise and fall for reading address $00,
  • a rise while address $0000 is put on the address bus, and
  • a fall while the value is read from the data bus.
Each rise or fall may affect other hardware connected to the CPU bus, such as the PPU registers.
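
In emulator terms, one possible way to expose those six edges is to split every bus read into two half-steps and notify the rest of the system at each one. The sketch below is only an illustration: `lda_zero_page`, the `on_edge` hook, and the flat `memory` list are hypothetical names, not taken from any real emulator.

```python
def lda_zero_page(memory, pc, on_edge):
    """Run LDA zp one half-cycle at a time; return (A, new PC).

    on_edge(kind, address) is a hypothetical hook that lets other
    hardware (for example a PPU model) observe every edge.
    """
    def read(address):
        on_edge("rise", address)   # address goes onto the address bus
        value = memory[address]
        on_edge("fall", address)   # value is taken from the data bus
        return value

    opcode = read(pc)              # cycle 1: fetch opcode
    assert opcode == 0xA5, "this sketch only handles LDA zero page"
    zp_address = read(pc + 1)      # cycle 2: fetch zero-page address
    a = read(zp_address)           # cycle 3: read the operand
    return a, pc + 2

memory = [0] * 0x10000
memory[0x0000] = 0x42
memory[0x8000:0x8002] = [0xA5, 0x00]   # LDA $00

edges = []
a, pc = lda_zero_page(memory, 0x8000, lambda kind, addr: edges.append((kind, addr)))
print(len(edges), hex(a), hex(pc))
```

The hook is called six times, once per edge, so a PPU model could be advanced by a fraction of a CPU cycle inside it instead of by a whole cycle at once.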

Post by Zepper » Tue Sep 19, 2006 7:05 pm

tepples wrote: LDA $00 has six edges, more or less like the following:
  • a rise and fall for reading opcode LDA,
  • a rise and fall for reading address $00,
  • a rise while address $0000 is put on the address bus, and
  • a fall while the value is read from the data bus.
Each rise or fall may affect other hardware connected to the CPU bus, such as the PPU registers.
Hmm... interesting. By the way, as far as I know, there are no public docs that cover this rising/falling thing, only "numeric" CPU cycles as units. Plus, it's picky to emulate something like 1 cycle being broken into 2 steps (rise/fall). Anyway, it makes sense, as I could detect unexpected errors in most of blargg's other tests.

LDA $xx takes 1 cycle to fetch the opcode, 1 cycle to fetch the zero-page address byte and 1 cycle to read from RAM[$xx]. Each of those cycles takes 2 'steps'. So I must create a SINGLE PPU access function to correct the problem. By default, mine takes 3 PPU cycles per CPU cycle, and it seems incorrect... -_-;; That's it. Thanks for the help.

Post by tepples » Tue Sep 19, 2006 7:34 pm

Fx3 wrote: By the way, as far as I know, there's no public docs that cover this rising/falling thing
The timing diagrams of the 6502 data sheet should work nicely.

blargg
Posts: 3715
Joined: Mon Sep 27, 2004 8:33 am
Location: Central Texas, USA

Post by blargg » Tue Sep 19, 2006 11:31 pm

I thought the 6502 also used a two-phase clock like this, where actions occur on the rising edge of each of the phases:

Code:

  ___     ___     ___   
_|   |___|   |___|   |___
    ___     ___     ___   
___|   |___|   |___|   |___
But perhaps this is just another way of describing a single clock at double the rate with actions occurring on the rising and falling edges.
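
A small Python sketch of that equivalence, under an idealised assumption: the two phases are taken to be the input clock and its inverse, with no overlap and no gap between them. That is not exactly the drawing above, and real hardware has a short non-overlap period. All names here are made up for illustration.

```python
def edges_of(signal):
    """Return [(index, 'rise' | 'fall'), ...] for a list of 0/1 samples."""
    result = []
    for i in range(1, len(signal)):
        if signal[i - 1] == 0 and signal[i] == 1:
            result.append((i, "rise"))
        elif signal[i - 1] == 1 and signal[i] == 0:
            result.append((i, "fall"))
    return result

# One input clock, sampled twice per cycle (low half, then high half)
clock = [0, 1, 0, 1, 0, 1, 0]
phase_2 = clock                     # assumed: high while the input clock is high
phase_1 = [1 - s for s in clock]    # assumed: high while the input clock is low

rises_1 = [i for i, kind in edges_of(phase_1) if kind == "rise"]
rises_2 = [i for i, kind in edges_of(phase_2) if kind == "rise"]
falls_0 = [i for i, kind in edges_of(clock) if kind == "fall"]
rises_0 = [i for i, kind in edges_of(clock) if kind == "rise"]

print(rises_1 == falls_0, rises_2 == rises_0)
```

In this idealised model, "rising edge of phase 1" and "falling edge of the input clock" name the same instants, and likewise for phase 2 and the rising edge.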

I still don't see how this would help explain Fx3's problem with instruction timing.

Post by Zepper » Wed Sep 20, 2006 2:53 pm

blargg wrote: I still don't see how this would help explain Fx3's problem with instruction timing.
Just forgive me. ^_^;; Ah yes, thanks for the test ROMs, they're awesome, no joke.

Post by tepples » Wed Sep 20, 2006 3:41 pm

blargg wrote: I still don't see how this would help explain Fx3's problem with instruction timing.
Any test ROM that measures the timing of an instruction by comparing it with the length of a PPU frame will have different behavior if the NMI comes a half-cycle early or late, especially if that triggers the bug where a PPUSTATUS read cancels NMI.

Post by blargg » Thu Sep 21, 2006 1:18 am

My CPU timing test has large margins for timing, since it uses NMI to time thousands of executions of the instruction, not just one. It further allows an error of up to +/- 6 iterations of the loop as compared to the reference values. For instructions which differ in execution time by one clock, the iteration count differs by at least 200.

Other timing tests, which measure down to 1 PPU clock accuracy, first synchronize the CPU clock with the PPU clock such that the error is at most 3/4 PPU clock. The PPU clock is master / 4 and the CPU clock is master / 12, so there are four different possible fixed synchronizations at power-up, depending on the random state of the dividers (P = PPU, C = CPU, one character = one ~21.5 MHz master clock):

Code:

P---P---P---P---P---P---
C-----------C-----------
-C-----------C----------
--C-----------C---------
---C-----------C--------
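
The same four cases can be worked out numerically. This is a minimal sketch using only the dividers stated above; the function name and the `alignment` parameter are made up for illustration.

```python
from fractions import Fraction

MASTER_PER_PPU = 4    # PPU clock = master / 4
MASTER_PER_CPU = 12   # CPU clock = master / 12

def cpu_to_ppu_offsets(alignment, cycles=4):
    """Offset of each CPU clock from the latest PPU clock, in PPU clocks.

    alignment -- power-up position of the CPU divider in master clocks (0..3)
    """
    offsets = []
    for n in range(cycles):
        t = alignment + n * MASTER_PER_CPU          # master-clock time of CPU clock n
        offsets.append(Fraction(t % MASTER_PER_PPU, MASTER_PER_PPU))
    return offsets

for alignment in range(4):
    print(alignment, cpu_to_ppu_offsets(alignment))
```

Because 12 is a multiple of 4, the offset never drifts from one CPU clock to the next: each alignment stays fixed at 0, 1/4, 1/2 or 3/4 of a PPU clock, which is where the 3/4 upper bound comes from.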
