 All times are UTC - 7 hours

 Page 1 of 1 [ 6 posts ]
 Post subject: Yet another DMA optimization thread
 Posted: Thu Jun 09, 2016 7:50 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2716
So my current DMA routine is an unrolled loop. The repeating part looks like this:

Code:
sta $02                                               //4 9      0
lda.w {dma_bank}+{n},x                                //5 14     2
sta $04                                               //4 18     0
lda.w {dma_destination}+{n},x                         //5 23     2
sta $2116                                             //5 28     0
sty $420b                                             //4 32     0

//32 + 6/3 = 34 fast cycles

Where's the length word? Well, it's hidden inside the top byte of the "bank" word. Since I'm only updating small individual sprites, this works because I never have to deal with chunks of 256 bytes or more. Unfortunately, it's no longer a "general purpose" DMA routine. To make it more "general purpose" while keeping the speed, we would have to optimize it even more.
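
The packing trick described here can be modelled in a few lines. This is only a sketch, not the poster's code: `pack_bank_word` and `store16` are hypothetical names, and it assumes the direct page points at the channel's register block, so that a 16-bit store to offset $04 lands the low byte in the bank register ($43n4) and the high byte in the low byte of the size register ($43n5).

```python
def pack_bank_word(bank, length):
    """Pack a source bank and a byte count into one 16-bit word.

    The count must fit in one byte, and 0 is excluded because the DMA
    unit treats a size of 0 as 65536 bytes.
    """
    if not 0 <= bank <= 0xFF:
        raise ValueError("bank must fit in one byte")
    if not 1 <= length <= 0xFF:
        raise ValueError("length must be 1..255 to fit in one byte")
    return (length << 8) | bank


def store16(regs, offset, word):
    """Model a little-endian 16-bit store into a channel's register block."""
    regs[offset] = word & 0xFF
    regs[offset + 1] = word >> 8


regs = {}
store16(regs, 0x04, pack_bank_word(0x7E, 128))
print({hex(k): hex(v) for k, v in regs.items()})
```

The model also shows where the trick stops working: a 256-byte chunk would need the high byte of the size register, which a single packed word cannot reach.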

Now let's PEI all over those DMA registers!

Code:
txs                                             //2        0
pei ({dma_legnth}+{n}+1)                        //6 8      2
pei ({dma_bank}+{n})                            //6 14     2
ldy.b {dma_destination}+{n}                     //4 24     2
sty $2116                                       //5 29     0
sta $420b                                       //4 33     0

33 + 8/3 = 35.667 fast cycles

With the extra length word, it is slightly slower than the first method without it.
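
The two totals can be checked with a short calculation. This sketch assumes the third column of the listings counts slow-memory (WRAM) cycles, which take 8 master clocks instead of the 6 of a fast cycle; that reading is an interpretation of the figures, and `fast_equivalent` is a hypothetical helper name.

```python
from fractions import Fraction

FAST_CLOCKS = 6  # master clocks per fast CPU cycle
SLOW_CLOCKS = 8  # master clocks per slow (WRAM) CPU cycle


def fast_equivalent(cycles, slow_cycles):
    """Convert a cycle count with some slow cycles into fast-cycle units."""
    master = (cycles - slow_cycles) * FAST_CLOCKS + slow_cycles * SLOW_CLOCKS
    return Fraction(master, FAST_CLOCKS)


first = fast_equivalent(32, 6)   # packed-length version
second = fast_equivalent(33, 8)  # PEI version with a full length word
print(first)                     # 34
print(round(float(second), 3))   # 35.667
print(round(float(second - first), 3))
```

Each slow cycle costs an extra 2 master clocks, which is one third of a fast cycle, hence the `6/3` and `8/3` terms.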

 Post subject: Re: Yet another DMA optimization thread
 Posted: Fri Jun 10, 2016 8:12 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20292
Location: NE Indiana, USA (NTSC)
psycopathicteen wrote:
Now let's PEI all over those DMA registers!

This runs with interrupts disabled (after SEI), correct? PEA/PEI on top of registers that aren't readable can be dangerous if you get an IRQ at the wrong time, and the gaps before and between the DMA channels' register sets aren't readable.

How many distinct DMAs do you have per vblank? This helps determine how much you save if you prepare all 8 DMA channels (or the 6 you aren't using for HDMA) during draw time and then activate them at the start of vblank. It's probably a lot though, given that each is on the order of 64 bytes (2 tiles as top half of a 16x16) or 128 bytes (4 tiles as top half of two 16x16s). So I see how the overhead can be substantial, as DMA copies 3 bytes per 4 fast cycles, or 128 bytes in 171 cycles.
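
To put numbers on that overhead, here is a rough sketch. It assumes one DMA byte takes 8 master clocks and one fast CPU cycle takes 6, which gives the 3 bytes per 4 fast cycles quoted above, and it ignores any fixed start-up cost of the DMA itself. The 34-cycle setup figure is the one from the first post; the function names are hypothetical.

```python
from fractions import Fraction


def dma_fast_cycles(num_bytes):
    """Transfer time in fast CPU cycles: 8 master clocks per byte, 6 per cycle."""
    return Fraction(num_bytes * 8, 6)


def setup_share(setup_cycles, num_bytes):
    """Fraction of the total time (setup plus transfer) spent on setup."""
    return setup_cycles / (setup_cycles + dma_fast_cycles(num_bytes))


for size in (64, 128, 1024):
    cycles = dma_fast_cycles(size)
    share = setup_share(34, size)
    print(size, round(float(cycles)), f"{float(share):.1%}")
```

Under these assumptions the setup share shrinks quickly as chunks grow, which is the motivation for the larger or coalesced transfers suggested in this post.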

Is sprite cel VRAM so jam-packed that it would hurt to always allocate 8 16x16 sprites at a time, so you can do an entire strip as one 1024-byte chunk? If not, can sprites with adjacent destinations be coalesced to reduce $2116 writes?
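
One way to read the coalescing idea: when a transfer starts exactly where the previous one left off in VRAM, the $2116 write can be skipped. The sketch below counts how many address writes a list would need. It assumes the VRAM address is a word address that advances by one for every two bytes transferred, that lengths are even, and that the list is already in upload order; `count_address_writes` is a hypothetical name.

```python
def count_address_writes(transfers):
    """Count the $2116 writes needed for (vram_word_address, length_bytes) pairs."""
    writes = 0
    next_address = None  # where VRAM will point after the previous transfer
    for address, length in transfers:
        if address != next_address:
            writes += 1  # destination is not contiguous, so set the address
        next_address = address + length // 2
    return writes


uploads = [
    (0x6000, 128),
    (0x6040, 128),  # continues directly after the first transfer
    (0x6100, 128),
    (0x6140, 64),   # continues directly after the third transfer
]
print(count_address_writes(uploads))  # 2
```

Sorting the list by destination before upload would expose more adjacent pairs, at the cost of the sort itself.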

 Post subject: Re: Yet another DMA optimization thread
 Posted: Sun Jun 12, 2016 1:42 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2716
If I SEI before and CLI afterwards, would it work exactly the same?

 Post subject: Re: Yet another DMA optimization thread
 Posted: Sun Jun 12, 2016 1:49 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20292
Location: NE Indiana, USA (NTSC)
Yes, as long as you restore the stack pointer before you reenable interrupts.

 Post subject: Re: Yet another DMA optimization thread
 Posted: Sun Jun 12, 2016 2:38 pm

Joined: Wed May 19, 2010 6:12 pm
Posts: 2716
I hope this would be the final optimization for DMA code. Keeping everything under vblank has always been a pain in the butt.

 Post subject: Re: Yet another DMA optimization thread
 Posted: Fri Jun 17, 2016 10:04 am

Joined: Wed May 19, 2010 6:12 pm
Posts: 2716
You know what? I think I'll just use a separate list for longer DMAs, instead of going through the hassle.
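
A minimal sketch of that split, assuming each queued transfer carries its byte count and that anything under 256 bytes can take the fast path with the length packed into one byte. The field names and `split_transfers` are hypothetical.

```python
SHORT_LIMIT = 256  # lengths below this fit in the single packed length byte


def split_transfers(transfers):
    """Separate transfers into a fast-path list and a general-purpose list."""
    short, long_ = [], []
    for transfer in transfers:
        target = short if transfer["length"] < SHORT_LIMIT else long_
        target.append(transfer)
    return short, long_


queue = [
    {"dest": 0x6000, "length": 128},
    {"dest": 0x4000, "length": 2048},
    {"dest": 0x6040, "length": 64},
]
short, long_ = split_transfers(queue)
print(len(short), len(long_))  # 2 1
```

The short list can then go through the unrolled routine, while the presumably rarer long transfers use a slower routine that writes a full length word.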
