OAM/DMC DMA tests

Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

OAM/DMC DMA tests

Post by Disch »

cpow ran some tests with visual2a03 to delve further into DMA behavior, though his notes were a bit scattered for my tastes and left me with a few unanswered questions.

So using his notes as a base, I ran my own tests on visual2a03 to verify them, and to fill in a few of the gaps.

Here are my findings:


==========================================
==========================================
==========================================
General
==========================================


DMA unit alternates between 'get' cycles and 'put' cycles.  Values are read on 'get' cycles and written on 'put' cycles.  'get' cycles can never write -- 'put' cycles can read, but discard any value read.  DMA unit seems to alternate between get/put even when DMA is not active -- effectively meaning that even cycles are 'get' cycles, and odd cycles are 'put' cycles.

"Dummy reads" ALWAYS seem to be performed from whatever address the CPU will want to read from next -- that is, whatever address will be read from once the DMA is complete.


When the DMA unit needs to cut into the CPU, it begins a 'halt' process.  The process appears to be as follows:

    1)  The 'halt attempt' cycle -- Let the CPU start its next cycle.  
        a)  If this cycle is a write, perform it normally.  Repeat step 1
        b)  If this cycle is a read, hijack the read, discard the value, and prevent all other actions that occur on this cycle (PC not incremented, etc).
                Presumably, side-effects from performing the read still occur.  Proceed to step 2
                
    2)  For DMC DMA ONLY -- do another dummy read, discarding the result.
    
    3)  If the DMA unit is currently on a 'put' cycle, do another dummy read  ('alignment' cycle)
    
    4)  Actually perform the DMA
        a)  For DMC, this performs a single read cycle, then returns control to main CPU logic
        b)  For OAM, this performs 256 alternating reads/writes as you'd expect
        
        
Note that the DMA is effectively delayed while it waits for all CPU write cycles to complete, but this is only a delay and does not alter the length of the DMA.

What DOES alter the length of the DMA is the optional alignment cycle.
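The halt process above can be sketched as a small trace generator. This is only an illustration of steps 1 to 4 as described in these notes, not a verified model of the hardware: the name `halt_sequence`, its parameters, and the labels are made up for this example, and it assumes the CPU is reading once the supplied list of cycles runs out.

```python
def halt_sequence(cpu_cycles, start_phase, is_dmc):
    """Label each cycle from the first halt attempt to the first DMA read.

    cpu_cycles:  what the CPU wants to do on each upcoming cycle, "R" or "W".
                 Assumption: once the list is exhausted the CPU is reading.
    start_phase: "get" or "put", the DMA unit's phase on the first attempt.
    is_dmc:      True adds the DMC-only dummy read (step 2).
    """
    trace = []
    phase = start_phase

    def emit(label):
        # Record the cycle, then flip get <-> put for the next one.
        nonlocal phase
        trace.append((phase, label))
        phase = "put" if phase == "get" else "get"

    pending = list(cpu_cycles)
    # Step 1: writes are performed normally; the first read is hijacked.
    while pending and pending.pop(0) == "W":
        emit("cpu write")
    emit("halt")
    # Step 2: extra dummy read, DMC only.
    if is_dmc:
        emit("dmc dummy")
    # Step 3: alignment cycle if the DMA would otherwise start on 'put'.
    if phase == "put":
        emit("alignment")
    # Step 4: the DMA proper begins on a 'get' cycle.
    emit("dma read")
    return trace


for phase, label in halt_sequence(["W"], "put", is_dmc=True):
    print(phase, label)
```

Passing `is_dmc=False` with a single leading `"W"` models a plain `$4014` store followed by the OAM halt.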



==========================================
==========================================
==========================================
DMC
==========================================

DMC DMAs appear to try to halt during the 'put' phase -- meaning they will take 4 cycles normally:
    1) 'put' - halt
    2) 'get' - extra DMC dummy read
    3) 'put' - dummy cycle for alignment
    4) 'get' - DMA
    
When DMC halt happens "on a write cycle", this makes it take 3 cycles because alignment can be skipped:
    *) 'put' - initial halt attempt -- but if a write cycle, it's delayed
    1) 'get' - attempt to halt again -- successful this time because it's a read cycle
    2) 'put' - extra DMC dummy read
            alignment not needed
    3) 'get' - DMA
    
    (note the '*' write cycle is performed normally, and therefore does not count as a stolen cycle, hence DMC only steals 3 cycles here)
    
    
However the DMC will steal 4 cycles if it attempts to halt during the first write of a RMW instruction (INC/DEC/etc)
    *) 'put' - halt attempt - fails because CPU is writing (first RMW write)
    *) 'get' - halt attempt - fails because CPU is writing (second RMW write)
    1) 'put' - halt attempt - successful
    2) 'get' - extra DMC dummy read
    3) 'put' - alignment
    4) 'get' - DMA

The above logic matches for 3 consecutive writes (interrupts/BRK).  If the halt is during the 1st or 3rd write, it'll steal 3 cycles... but if it's during the 2nd write, it'll steal 4.
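All of the DMC cases listed here reduce to a parity check. The halt and the DMC dummy read together take two cycles, so the alignment cycle is needed exactly when the successful halt lands on a 'put' cycle. Below is a minimal sketch under that reading; `dmc_stolen_cycles` and its parameter are hypothetical names, and it assumes the first halt attempt is always on a 'put' cycle, as stated in these notes.

```python
def dmc_stolen_cycles(writes_before_halt):
    """Cycles stolen by one DMC DMA.

    writes_before_halt: number of CPU write cycles that delay the halt,
    counting the write on the cycle of the first halt attempt.
    Assumption: the first attempt is on a 'put' cycle.
    """
    # Each delaying write flips the phase, so an even count leaves the
    # successful halt on 'put' and an odd count moves it to 'get'.
    halt_on_put = writes_before_halt % 2 == 0
    # halt + DMC dummy + DMA read = 3, plus alignment if halted on 'put'.
    return 4 if halt_on_put else 3


# 0 = no write, 1 = single write, 2 = first RMW write,
# 3 = first of three interrupt/BRK writes.
for writes in range(4):
    print(writes, dmc_stolen_cycles(writes))
```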

==========================================
==========================================
==========================================
OAM / 4014
==========================================
    
OAM DMA behaves similarly, but skips the DMC-only dummy read, so it takes 513 or 514 cycles depending on whether the alignment cycle is needed.

Assuming the write is performed with a STA/STX/STY:
    *)  $4014 write cycle triggering OAM DMA
    1)  halt attempt - successful (next cycle is a read for the next opcode, or is an interrupt)
    ?)  possible alignment
    2)  'get' - read 1st byte
    3)  'put' - write 1st byte
        ...
        
        
Writing to $4014 twice consecutively (INC/DEC/etc) follows the expected logic: both writes are performed, followed by the halt cycle, a possible alignment cycle, then 512 cycles of DMA.



==========================================
==========================================
==========================================
Both at the same time
==========================================

Things to note:
- DMC DMA trumps OAM DMA
- A DMC halt is considered successful if it happens on an OAM DMA cycle
- under no circumstances can a DMC DMA cycle immediately follow a successful DMC halt cycle.  There must be at least 1 dummy cycle, alignment cycle, or OAM DMA cycle between the halt and the DMA.


Examples:
    Cycles marked with '+' are "DMC stolen"
    p = must be a 'put' cycle (remember DMC always halts on a put cycle)
    g = must be a 'get' cycle 
    * = normal, unaffected CPU cycle

DMC halts on the $4014 write cycle:
    p *)    $4014 write - unsuccessful DMC halt
    g 1)    DMC & OAM halt -- successful
    p 2)    DMC dummy / alignment      not a DMC stolen cycle, since this would have to be alignment regardless
    g+3)    DMC DMA
    p+4)    re-alignment
    g 5)    OAM DMA read 1
    p 6)    OAM DMA write 1
            ...
        
DMC halts 1 cycle after $4014 write:
    g *)    $4014 write
    p 1)    DMC & OAM halt - successful
    g 2)    OAM read 1      (DMC dummy)
    p 3)    OAM write 1     (DMC alignment)
    g+4)    DMC
    p+5)    re-alignment
    g 6)    OAM read 2
    p 7)    OAM write 2
    ...
    
    
This logic follows for 2 consecutive writes to $4014.




==========================================
==========================================
==========================================
What I was not able to test
==========================================


Visual2a03 gave EXTREMELY weird behavior for OAM DMA.  I suspect it needs more warmup time.  OAM DMA was fetching from the wrong address, and the address being read from was being mangled by DMC DMAs, which was resulting in 700+ stolen cycles... and would also result in extremely corrupted sprites on a real system.  Because of this I was unable to test the following:

1)  What happens in the edge case where a DMC DMA occurs at the very end of an OAM DMA?
2)  If you INC $4014, does it DMA from the pre-incremented value or post-incremented?

I also uploaded the unpolished / scattered notes, which have the results of my tests as well as links to the test programs I ran:

https://www.dropbox.com/s/afvbxers66v9994/dma.txt?dl=0
lidnariq
Posts: 11429
Joined: Sun Apr 13, 2008 11:12 am

Re: OAM/DMC DMA tests

Post by lidnariq »

I just want to point out that what you describe as "halt attempt" is describing what the 6502 RDY pin does.
Zepper
Formerly Fx3
Posts: 3262
Joined: Fri Nov 12, 2004 4:59 pm
Location: Brazil

Re: OAM/DMC DMA tests

Post by Zepper »

Disch wrote:If you INC $4014, does it DMA from the pre-incremented value or post-incremented?
Both. It'll perform a read from $4014 (dummy, should return $40), then write it to $4014 (DMA-ing). The value $40 becomes $41 and another DMA is performed. Now waiting for someone to correct me. :mrgreen: :roll:

Well, that's the expected operation considering the INC timing diagram. What happens during the 1st or 2nd sprite DMA is another subject. :)
lidnariq
Posts: 11429
Joined: Sun Apr 13, 2008 11:12 am

Re: OAM/DMC DMA tests

Post by lidnariq »

DMA can't start until R/W goes high, because that's how the 6502 RDY signal works.
It definitely only does one of the two DMAs, but I don't remember which.
Zepper
Formerly Fx3
Posts: 3262
Joined: Fri Nov 12, 2004 4:59 pm
Location: Brazil

Re: OAM/DMC DMA tests

Post by Zepper »

lidnariq wrote:DMA can't start until R/W goes high, because that's how the 6502 RDY signal works.
It definitely only does one of the two DMAs, but I don't remember which.
So... an INC $4014 wouldn't be correctly performed in the current emulators. Why would it fail in the first write to $4014? A read from $4014 returns $40; then a write should occur. Where is my mistake?
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Re: OAM/DMC DMA tests

Post by Disch »

Zepper wrote:So... an INC $4014 wouldn't be correctly performed in the current emulators. Why would it fail in the first write to $4014? A read from $4014 returns $40; then a write should occur. Where is my mistake?
You can think of it this way:

A $4014 write does not immediately perform a DMA. Instead, it sets a flag to indicate that a DMA should start on the next CPU read cycle.

If two $4014 writes happen back-to-back (as with INC), they both set the flag, but no DMA has been performed yet because there hasn't been a read cycle.
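The flag idea can be sketched as a tiny latch. This is only an illustration of the explanation above: `OamDmaLatch` and its members are hypothetical names, the `$80xx` program addresses are invented for the example, and the sketch deliberately does not model which written value the DMA uses as its source page, since the thread leaves that open.

```python
class OamDmaLatch:
    """Pending-flag model: $4014 writes arm the DMA, the next read starts it."""

    def __init__(self):
        self.pending = False
        self.dmas_started = 0

    def cpu_write(self, addr, value):
        if addr == 0x4014:
            # Setting a flag that is already set changes nothing.
            self.pending = True

    def cpu_read(self, addr):
        """Return True if this read is hijacked as the halt cycle."""
        if self.pending:
            self.pending = False
            self.dmas_started += 1
            return True
        return False


# INC $4014 located at a hypothetical address $8000.
latch = OamDmaLatch()
for addr in (0x8000, 0x8001, 0x8002, 0x4014):
    latch.cpu_read(addr)           # opcode, operand bytes, effective address
latch.cpu_write(0x4014, 0x40)      # first write (original value)
latch.cpu_write(0x4014, 0x41)      # second write (incremented value)
hijacked = latch.cpu_read(0x8003)  # next opcode fetch starts the single DMA
print(hijacked, latch.dmas_started)
```

In this model the two back-to-back writes produce one DMA, because no read cycle occurs between them.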

lidnariq wrote:I just want to point out that what you describe as "halt attempt" is describing what the 6502 RDY pin does.
Yeah I'm pretty sure a lot of this info was known already. I know cpow posted several posts where he tested a lot of this and came up with similar results. I just wanted to test it myself because the information was scattered and often unclear.

I get feedback sometimes that people like when I make these kinds of summaries because I guess the way I explain it is helpful? *shrug* So I figured I'd post it. =P

If nothing else, doing these tests has certainly helped ME understand it.
Zepper
Formerly Fx3
Posts: 3262
Joined: Fri Nov 12, 2004 4:59 pm
Location: Brazil

Re: OAM/DMC DMA tests

Post by Zepper »

Disch wrote:A $4014 write does not immediately perform a DMA. Instead, it sets a flag to indicate that a DMA should start on the next CPU read cycle.

If two $4014 writes happen back-to-back (as with INC), they both set the flag, but no DMA has been performed yet because there hasn't been a read cycle.
Wait. Lemme think...
1. read instruction opcode ($EE)
2. read low byte of address ($14)
3. read high byte of address ($40)
4. read from effective address ($4014 should return $40)
5. write the value back to effective address (should set the SPRDMA flag?)
6. do the operation (INC) and write the new value ($41) to the effective address (SPRDMA flag already set)
*7. read the next instruction opcode (trigger SPRDMA?)
Disch wrote:I get feedback sometimes that people like when I make these kinds of summaries because I guess the way I explain it is helpful? *shrug* So I figured I'd post it. =P
Always welcome. ^_^;;

EDIT: *I mean... if the SPRDMA flag is waiting for a CPU read to trigger, it should trigger on the first cycle of the next instruction... no??
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Re: OAM/DMC DMA tests

Post by Disch »

Your summary looks correct.


- write: 1st $4014 write (original value)
- write: 2nd $4014 write (new value)
- read: next opcode -- but DMA halts here, so this read is thrown away
> optional read: next opcode again & thrown away. This is only done if this is an odd cycle (DMA unit has to align so reads are on even cycles)
- read: $xx00
- write: $2004
- read: $xx01
- write: $2004
...
- read: $xxFF
- write: $2004
- read: next opcode
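The sequence listed above, from the halt onward, can be generated address by address. This is a rough sketch rather than a tested implementation: `oam_dma_bus_trace` and its parameters are hypothetical, the source page is passed in because the thread does not settle which value an INC would use, and the dummy reads use the next opcode address as described in the first post.

```python
def oam_dma_bus_trace(page, next_opcode_addr, needs_alignment):
    """Bus accesses for one OAM DMA, starting at the hijacked read.

    page:             high byte of the source address ($xx in $xx00-$xxFF).
    next_opcode_addr: address the CPU wanted to read when it was halted.
    needs_alignment:  True if the first DMA read would otherwise fall on
                      a 'put' (odd) cycle.
    """
    trace = [("read", next_opcode_addr)]          # halt cycle, value discarded
    if needs_alignment:
        trace.append(("read", next_opcode_addr))  # alignment dummy read
    for offset in range(256):
        trace.append(("read", (page << 8) | offset))
        trace.append(("write", 0x2004))
    return trace


trace = oam_dma_bus_trace(0x02, 0x8003, needs_alignment=True)
print(len(trace))
for kind, addr in trace[:4]:
    print(kind, f"${addr:04X}")
```

The trace length is the number of cycles the CPU loses, which is where the 513 / 514 figure comes from.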
Rahsennor
Posts: 479
Joined: Thu Aug 20, 2015 3:09 am

Re: OAM/DMC DMA tests

Post by Rahsennor »

Disch wrote:I get feedback sometimes that people like when I make these kinds of summaries because I guess the way I explain it is helpful? *shrug* So I figured I'd post it. =P

If nothing else, doing these tests has certainly helped ME understand it.
I've been wanting to make the DMC reads in my NSF player accurate for some time now (not that it actually matters for an NSF player), and this is the first time anyone has described it in terms I can understand. Thank you muchly.

One quick question: is the DMA put/get cycle in any way related to the APU clock? Are they aligned in a specific way, or can the alignment change between resets?
Disch
Posts: 1848
Joined: Wed Nov 10, 2004 6:47 pm

Re: OAM/DMC DMA tests

Post by Disch »

Glad it's useful! :mrgreen:

Regarding your questions: I have no idea. This is something I wouldn't want to test on visual2a03 -- but would want to do on a real cart... and making an actual test ROM for it would be a lot of work =x
Rahsennor
Posts: 479
Joined: Thu Aug 20, 2015 3:09 am

Re: OAM/DMC DMA tests

Post by Rahsennor »

I was hoping someone might have traced the circuit to see if they were driven by the same divider. Never mind.

The reason I asked is that if the APU (and therefore DMC output unit) always clocks on a "get" cycle, that might explain why the DMC DMA always starts on a "put" cycle.
lidnariq
Posts: 11429
Joined: Sun Apr 13, 2008 11:12 am

Re: OAM/DMC DMA tests

Post by lidnariq »

They're the same divider. In Visual2A03 they're called apu_clk1 and apu_clk2e and control the OAM DMA cadence. Tracing down the exact even/odd clock used by the DPCM DMA is more of a pain, but I see apu_/clk2 poking around it.
Rahsennor
Posts: 479
Joined: Thu Aug 20, 2015 3:09 am

Re: OAM/DMC DMA tests

Post by Rahsennor »

Chip internals are way beyond me, so thanks for looking into it.

I implemented the above logic for DMC DMA in my NSF player, and got... no observable result, as expected. :P It still passes all the same tests as it did before, but I don't have anything picky enough to spot the difference.
fred
Posts: 67
Joined: Fri Dec 30, 2011 7:15 am
Location: Sweden

Re: OAM/DMC DMA tests

Post by fred »

Disch wrote:2) If you INC $4014, does it DMA from the pre-incremented value or post-incremented?
Visual2A03 seems to be able to DMA only from zero page, even though I can see enough memory for two pages. Hmm...


Btw, can someone clarify for me what the wiki means here regarding DMAs? "1 dummy read cycle while waiting for writes to complete, +1 if on an odd CPU cycle, then 256 alternating read/write cycles."
Does the "while waiting for writes to complete" refer to what happens if you do a read-modify-write (2 write cycles) to 0x4014? Isn't it easier to say that a DMA (and only one DMA) starts after an instruction finishes? Did I miss something here or is it referring to something else?
Last edited by fred on Mon Apr 25, 2016 1:21 am, edited 1 time in total.
Rahsennor
Posts: 479
Joined: Thu Aug 20, 2015 3:09 am

Re: OAM/DMC DMA tests

Post by Rahsennor »

fred wrote:Isn't it easier to say that a DMA (and only one DMA) starts after an instruction finishes? Did I miss something here or is it referring to something else?
The DMA starts after the writes finish. The DMA unit has no knowledge of where instructions start and end. It can only watch the bus to see if the CPU is writing or reading, and it can only pause the CPU after a read, so it always waits for a single read to complete before it starts. The result of that read is then discarded, turning it into a dummy read.