Brainstorming using Microcontroller as mapper


Ben Boldt
Posts: 1148
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Brainstorming using Microcontroller as mapper

Post by Ben Boldt »

Memory mappers generally take in a few of the high CPU address bits and map each combination to a larger number of PRG address bits. (And same for the PPU.) Basically a bunch of writable latches that propagate the top CPU address bits to the PRG address bits asynchronously. That asynchronous behavior is what makes logic gates, CPLDs and FPGAs a really good fit. We don't worry much about timing when we do it asynchronously like that.
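For readers newer to mappers, the latch-and-propagate idea above can be sketched in C roughly as follows. This is a minimal illustration and not any real mapper: the 8 KiB window size, the four-entry `bank_reg` table, and the name `map_prg_address` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapper state: one writable bank latch per 8 KiB CPU window
 * in $8000-$FFFF. Window size and count are assumptions for illustration. */
typedef struct {
    uint8_t bank_reg[4];
} mapper_t;

/* Translate a CPU address in $8000-$FFFF into a PRG ROM address.
 * The top address bits (A14..A13) select a latch, the latch contents
 * become the upper PRG address bits, and A12..A0 pass straight through. */
static uint32_t map_prg_address(const mapper_t *m, uint16_t cpu_addr)
{
    assert(cpu_addr >= 0x8000u);               /* PRG ROM space only */
    unsigned window = (cpu_addr >> 13) & 0x3u; /* A14..A13 */
    uint32_t offset = cpu_addr & 0x1FFFu;      /* A12..A0  */
    return ((uint32_t)m->bank_reg[window] << 13) | offset;
}
```

In hardware this whole function is a handful of latches and wires with no clock, which is why the asynchronous parts have it so easy; a microcontroller has to execute it within the bus timing window.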

In modern times, fast 32-bit microcontrollers with 5V-tolerant I/O have become very inexpensive and thus appealing. Potentially it could be a cheap 1-chip mapper if it can be made to work. The problem is that microcontrollers can't really do that asynchronous stuff. But maybe they are fast enough now to work around that. As long as it works, and the board is simple and cheap to make, why should we care how hard that micro has to work to make it happen?

So the biggest hurdle I see is taking those high order CPU and PPU address lines as inputs and very quickly reacting and pushing out the correct PRG and CHR address bits. This would be happening on the order of MHz which is asking a lot of a microcontroller, but maybe attainable these days with 100+ MHz micros becoming common. I would want to trigger interrupts somehow rather than infinite loop polling so that other things (such as emulating audio expansion) could be going on at the same time. Here is where I want to ask you guys that are a lot more familiar with the sequencing of the Famicom than I am.

Method 1:
Change notification interrupts on all the high order CPU and PPU address bits. Maybe I can update all of the PRG and CHR address bits at any of those change interrupts and clear all the rest of the change interrupts from multiple address lines changing? I am thinking that the CPU/PPU address bits don't necessarily settle all at the same time, but by the time I get into the interrupt, there may have been enough delay for it to be settled.

Question: Do CPU and PPU address bits change at approx. the same time, or do these need to be handled separately? I am not actually sure that I can clear all other change interrupts like that once triggered, or what the overhead of clearing them would be. If that doesn't work out, there is method 2:

Method 2:
Is there an edge of M2 that the mapper microcontroller could interrupt on and update PRG and CHR every time? I am wondering if there is 1 spot relative to M2 where CPU and PPU address lines are settled and can be updated all at once to the PRG and CHR address bits.


The method of memory mapping we are used to, with CPU and PPU address bits propagating asynchronously to PRG and CHR address bits is great; nothing has to line up or wait to settle. With a microcontroller it becomes really important to do the minimum number of updates so it is kind of a different problem we don't typically have to deal with. I would love to dig into this and do some experiments but would like some advice from you guys and any shortcuts or pitfalls you might see coming.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Brainstorming using Microcontroller as mapper

Post by tepples »

Ben Boldt wrote: Fri Oct 09, 2020 9:36 am Question: Do CPU and PPU address bits change at approx. the same time, or do these need to be handled separately?
The two buses change separately.
Ben Boldt wrote: Fri Oct 09, 2020 9:36 am Is there an edge of M2 that the mapper microcontroller could interrupt on and update PRG and CHR every time?
On the CPU side:
M2 rise: Address bus is valid except /ROMSEL
30 ns or so after M2 rise: Address bus is valid
M2 fall: Data bus is supposed to be valid
M2 stops oscillating at 1.79 MHz: Reset Button was pressed on the Control Deck

On the PPU side, PPU /RD and PPU /WR are the analogous signals.
PPU /RD fall: Address bus is valid
PPU /RD rise: Data bus is supposed to be valid
Quietust
Posts: 1918
Joined: Sun Sep 19, 2004 10:59 pm

Re: Brainstorming using Microcontroller as mapper

Post by Quietust »

Ben Boldt wrote: Fri Oct 09, 2020 9:36 am In modern times, fast 32-bit microcontrollers with 5V-tolerant I/O have become very inexpensive and thus appealing.
...
This would be happening on the order of MHz which is asking a lot of a microcontroller, but maybe attainable these days with 100+ MHz micros becoming common.
Are there any 5V-tolerant microcontrollers capable of running at 100+MHz speeds? It's been my understanding that speeds that high are only really feasible when you run at lower voltages.
Quietust, QMT Productions
P.S. If you don't get this note, let me know and I'll write you another.
lidnariq
Posts: 11430
Joined: Sun Apr 13, 2008 11:12 am

Re: Brainstorming using Microcontroller as mapper

Post by lidnariq »

Ben Boldt wrote: Fri Oct 09, 2020 9:36 am This would be happening on the order of MHz which is asking a lot of a microcontroller, but maybe attainable these days with 100+ MHz micros becoming common.
So: Rule of thumb. A 70MHz ARM ("Harmony Cart") emulates a 32KB ROM at 1.2MHz for the Atari 2600.
A 180MHz ARM ("SNES Drone") emulates a 512KB ROM at 2.7MHz (maybe? probably not 3.6MHz) for the SNES.
It should definitely be possible to use two microcontrollers, one for each bus, but that may not be worthwhile.
Ben Boldt wrote: Change notification interrupts on all the high order CPU and PPU address bits. Maybe I can update all of the PRG and CHR address bits at any of those change interrupts and clear all the rest of the change interrupts from multiple address lines changing?
My understanding is that these more complex microcontrollers take forever (microseconds?) to enter/exit supervisor state for interrupt handling. You may need to do some nonstandard things to get a fast enough turnaround.

Ironically, one of the very fast simple micros (Scenix PICs, 100MHz 8051s) might have an easier time.
Ben Boldt wrote: Question: Do CPU and PPU address bits change at approx. the same time, or do these need to be handled separately?
Not even a little bit. On the US NES, there are four different alignments of CPU and PPU behavior, and the exact relative phase of each will affect your deadlines by some multiple of 46ns. On PAL famiclones, there's five alignments (multiples of 38ns), and on licensed PAL NESes the two systems are run on relatively prime dividers so you have to deal with all possible alignments all the time.
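The 46 ns and 38 ns figures follow from the master clock periods, and the alignment counts from the clock dividers. A quick back-of-envelope check in C, assuming the commonly documented values (21.477272 MHz master clock with CPU ÷12 and PPU ÷4 for NTSC; 26.601712 MHz with CPU ÷16 and PPU ÷5 for licensed PAL; CPU ÷15 and PPU ÷5 for PAL famiclones). Those numbers come from general NES documentation rather than from this thread, and the helper names are made up for the example.

```c
#include <assert.h>

/* Master clock period in nanoseconds for a frequency given in Hz. */
static double master_period_ns(double master_hz)
{
    assert(master_hz > 0.0);
    return 1e9 / master_hz;
}

/* Greatest common divisor of the CPU and PPU clock dividers.
 * If the result g is greater than 1, the CPU/PPU phase stays fixed at
 * one of g possible values; if g is 1, the phase keeps rotating through
 * every possible alignment. */
static unsigned divider_gcd(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned t = a % b;
        a = b;
        b = t;
    }
    return a;
}
```

With those inputs the periods come out near 46.6 ns and 37.6 ns, and the divider GCDs are 4, 5 and 1, which lines up with the four, five and "all of them, all the time" cases described above.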
Ben Boldt wrote: Is there an edge of M2 that the mapper microcontroller could interrupt on and update PRG and CHR every time?
No, for the same reason. Plus, there are three PPU fetches for every two CPU fetches, so M2 isn't fast enough anyway.
Ben Boldt wrote: I would love to dig into this and do some experiments but would like some advice from you guys and any shortcuts or pitfalls you might see coming.
Some micros have built-in programmable logic. I've used Microchip's CLCs, and I remember other vendors also having something similar.

It should definitely be possible to use a PIC with CLCs and PPS to get 16+16 PRG banking and 4+4 CHR banking - basically MMC1 class shouldn't be a problem. But nothing compatible with existing mappers.
Ben Boldt
Posts: 1148
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Brainstorming using Microcontroller as mapper

Post by Ben Boldt »

Thanks for the info lidnariq, lots of very useful info I never thought about (as usual).
Quietust wrote: Fri Oct 09, 2020 12:02 pm Are there any 5V-tolerant microcontrollers capable of running at 100+MHz speeds? It's been my understanding that speeds that high are only really feasible when you run at lower voltages.
The micros I use at work are STM32F302 (64 MHz) and STM32G474 (170 MHz). These are 3.3V devices but most pins are 5V tolerant, and even have 3 definable I/O speeds. (This setting affects the rise/fall times. Pretty sure the tradeoff is faster = more power used.) I never measured how long it takes to get in and out of an interrupt on either of these! I think a good experiment would be to have an output pin follow an input pin via change notification interrupt and hook that to a function generator and scope. I have a '302 on a breakout board already and running 64MHz, I can try it on that one some time. The '474 is a superior micro but easier to get ahold of the '302's from older generation products in the recycle bin. ;)
plainsteve
Posts: 27
Joined: Tue Jan 23, 2018 11:19 pm

Re: Brainstorming using Microcontroller as mapper

Post by plainsteve »

About 2.5 years ago I tried using a dsPIC33EV (5-volt, 16-bit, 70MHz, pipelined, RISC) to emulate CHR ROM. I had high hopes.

As lidnariq mentioned, interrupt response time was bad: for this chip, on the order of 10 cycles! I found I needed to use the dsPIC's loop hardware to get the most performance. (The loop hardware essentially eliminates the loop overhead associated with branching.)

PRG ROM is probably doable, as its speed requirements are about 1/3 as demanding, IIRC. CHR would require considerably more MHz. Emulating RAM would take more time still: more bits to check, switching the data port between read and write, etc.

I also tried a Propeller. It could not keep up (in my experiment, YMMV).

Keep in mind if you use level translation you will lose some of your time window to propagation through the translator. I used 74LVC245s (edit: for my 3.3V projects).

I thought about using PIC32s but FPGAs become more attractive to me at that tier.
aquasnake
Posts: 515
Joined: Fri Sep 13, 2019 11:22 pm

Re: Brainstorming using Microcontroller as mapper

Post by aquasnake »

I think an MCU is only suitable for mapper 1 and mapper 3.

Mapper 1: in serial mode, it latches 1 bit per write and then sets the bank on the fifth write (thanks to Quietust's correction below). Since the fifth write does not actually latch the data, the timing requirements are not so strict.

Mapper 3: because only the CHR bank is switched, and the CHR ROM is accessed in the next cycle at the earliest, it should work.

Mapper 2 and its variants must change the PRG bank immediately when the mapper register is written. In that case, an MCU implementation of the mapper may not be reliable.
Last edited by aquasnake on Sun Oct 11, 2020 6:46 pm, edited 1 time in total.
Quietust
Posts: 1918
Joined: Sun Sep 19, 2004 10:59 pm

Re: Brainstorming using Microcontroller as mapper

Post by Quietust »

aquasnake wrote: Sun Oct 11, 2020 5:59 am Mapper1, in serial mode, latch 1 bit once, and then set the bank when writing mapper at the sixth time. Since the sixth write action does not actually latch the data, the timing requirements are not so strict.
But the MMC1 doesn't set the bank during a sixth write - it does it during the fifth write (i.e. at the same time as it receives the final bit)...
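To make the fifth-write behaviour concrete, here is a minimal C model of the MMC1 serial port as it is commonly described in NES documentation. It is a sketch under assumptions and not a verified hardware model: the struct and function names are invented for this example, and the side effects of the reset write on the control register are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the MMC1 serial load port. */
typedef struct {
    uint8_t shift;   /* 5-bit shift register; 0x10 means "empty"  */
    uint8_t reg[4];  /* internal registers, selected by A14..A13 */
} mmc1_t;

static void mmc1_init(mmc1_t *m)
{
    m->shift = 0x10; /* marker bit reaches bit 0 after four shifts */
    for (int i = 0; i < 4; i++) {
        m->reg[i] = 0;
    }
}

/* Model one CPU write to $8000-$FFFF. */
static void mmc1_write(mmc1_t *m, uint16_t addr, uint8_t data)
{
    assert(addr >= 0x8000u);
    if (data & 0x80u) {            /* bit 7 set: abort and start over */
        m->shift = 0x10;
        return;
    }
    int fifth = m->shift & 1;      /* marker in bit 0: this is write #5 */
    m->shift = (uint8_t)((m->shift >> 1) | ((data & 1u) << 4));
    if (fifth) {
        /* The final bit arrives and the target register is committed
         * in the same write; there is no separate sixth write. */
        m->reg[(addr >> 13) & 3u] = m->shift;
        m->shift = 0x10;
    }
}
```

Bits are shifted in least significant first, so a microcontroller emulating this has to capture data bit 0 on every one of the five writes, including the one that switches the bank.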
Ben Boldt
Posts: 1148
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Brainstorming using Microcontroller as mapper

Post by Ben Boldt »

I measured how long it would reasonably take to have a change notification interrupt and change an output port and it is unfortunately a very long 470nsec. Give it another 10 years I guess.

tek00060.png
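For scale, the 470 ns figure can be set against the clock rates mentioned earlier in the thread. A small C sketch follows; the 64 MHz core clock assumes the measurement was taken on the '302 breakout board, the 1.79 MHz M2 rate is the one tepples quoted, and the function name is hypothetical.

```c
#include <assert.h>

/* Express a latency (in nanoseconds) as a number of periods of a clock
 * running at clock_hz. Works for core cycles and for bus cycles alike. */
static double latency_in_periods(double latency_ns, double clock_hz)
{
    assert(clock_hz > 0.0);
    return latency_ns * 1e-9 * clock_hz;
}
```

Under those assumptions, `latency_in_periods(470.0, 64e6)` is about 30 core cycles, and `latency_in_periods(470.0, 1.79e6)` is about 0.84 of one CPU bus cycle. With three PPU fetches for every two CPU cycles, the same latency is longer than a whole PPU fetch slot.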
Trirosmos
Posts: 50
Joined: Mon Aug 01, 2016 4:01 am
Location: Brinstar, Zebes

Re: Brainstorming using Microcontroller as mapper

Post by Trirosmos »

Ben Boldt wrote: Wed Oct 28, 2020 2:58 pm I measured how long it would reasonably take to have a change notification interrupt and change an output port and it is unfortunately a very long 470nsec. Give it another 10 years I guess.
Is this just bit-banging the GPIO port? Is it executing from flash or from TCM?

What if instead of using interrupts the CPU ran a tight unrolled loop that updates a buffer that is then DMA'ed to the GPIO?

Edit: also, there are STM32 devices much faster than the F3/F4 ones. Those might be interesting to play around with...
Ben Boldt
Posts: 1148
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Brainstorming using Microcontroller as mapper

Post by Ben Boldt »

I did try a bit-bang loop as an experiment and it was considerably faster. I didn't measure it but I am thinking it would be fast enough. The problem is that if you bit-bang, the micro can't go off and do anything else, such as generate expansion audio.

I really spent some time thinking of how to use DMA for this, because these micros do have really good DMA abilities. The problem is that there are multiple PRG-ROM banks. You have to use the high-order CPU address bits to decide which bank you are in; i.e. which register to push out to the extended PRG bits. With DMA, you don't have the ability to put a decision like that in there.

My plan was to use CPU A12,13,14 hooked to a change notification interrupt. So any of the 3 bits changing would trigger the interrupt. Then in the interrupt, use those 3 bits as an index into a table of PRG banks, and spit that value directly out to a port as PRG A12...A19. (Writing to the mapper's registers elsewhere would be able to change the values in the table.) Then at the end of the interrupt, clear all 3 interrupts. The same approach would then be extended to the PPU side, mirroring, etc.
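A hardware-independent sketch of that plan in C, with the GPIO ports replaced by plain variables so it compiles anywhere. All names (`prg_bank_table`, `cpu_addr_port`, `prg_addr_port`, `pending_flags`, `on_cpu_addr_change`) and the pin assignment are hypothetical; on a real part the handler body would read and write the vendor's port registers and clear the real pending flags.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for memory-mapped GPIO registers. Assumption: CPU A12..A14
 * arrive on bits 0..2 of the input port. */
static volatile uint8_t cpu_addr_port;  /* input:  CPU A12..A14          */
static volatile uint8_t prg_addr_port;  /* output: PRG A12..A19          */
static volatile uint8_t pending_flags;  /* one pending bit per input pin */

/* One entry per 4 KiB CPU window; mapper register writes update it. */
static uint8_t prg_bank_table[8];

/* Body of the change-notification handler. */
static void on_cpu_addr_change(void)
{
    uint8_t index = cpu_addr_port & 0x07u;  /* A14..A12 as table index  */
    prg_addr_port = prg_bank_table[index];  /* drive PRG A12..A19       */
    pending_flags = 0;                      /* clear all three together */
}

/* Called from the register-write path to retarget a window. */
static void set_prg_bank(unsigned window, uint8_t bank)
{
    assert(window < 8u);
    prg_bank_table[window] = bank;
}
```

The handler itself is only a few instructions; the cost that matters is the entry and exit latency around it, which this model does not capture.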

I have some G4 micros lying around too, and from what I gather they can be roughly twice as fast at doing this, which is still nowhere near fast enough, considering that these delays stack on top of the access time of the ROM chip being controlled.
Trirosmos
Posts: 50
Joined: Mon Aug 01, 2016 4:01 am
Location: Brinstar, Zebes

Re: Brainstorming using Microcontroller as mapper

Post by Trirosmos »

Ben Boldt wrote: Tue Nov 17, 2020 9:51 am The problem is that if you bit-bang, the micro can't go off and do anything else, such as generate expansion audio.
Just gotta cycle count everything :P
Ben Boldt wrote: Tue Nov 17, 2020 9:51 am With DMA, you don't have the ability to put a decision like that in there.
Hmm... the MDMA on the STM32H7 micros is theoretically turing-complete, I suppose?
Ben Boldt
Posts: 1148
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Brainstorming using Microcontroller as mapper

Post by Ben Boldt »

Thecoolestnerdguy wrote: Tue Nov 17, 2020 2:47 pm Hmm... the MDMA on the STM32H7 micros is theoretically turing-complete, I suppose?
I have never used any DMA features of these chips before, maybe there is more to it than I realize?? I wasn't sure on how the speed of DMA based on pin change notification would compare with an interrupt. I guess theoretically there wouldn't be any stack operations? I think basically all of the latency comes from stack stuff when using interrupts.

I kind of misspoke earlier; since there are only 4 ROM banks, that should just be 2 CPU address bits, probably A14 and A13. If one was to use a 2-to-4 decoder and feed that to 4 change notification pins, you could set the change notification to rising edge only and handle all 4 of those with separate completely brainless DMAs, each dedicated to 1 ROM bank. I think we run out of resources pretty quick when trying to do something similar for the PPU though, as that has 8 banks. More discrete logic chips yet might be necessary to get configurable mirroring working right. The whole idea was to do everything with 1 cheap chip and that quickly fades as we pack in more digital logic chips...
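The decoder-plus-DMA idea can be modelled in C as follows. This only simulates the data flow and does not use any real DMA API. The `dma_channel_t` type, `decode_2to4`, and `bus_update` are invented for the sketch, and each "channel" is just a fixed source-to-destination copy with no decision logic, which is the point of the scheme.

```c
#include <assert.h>
#include <stdint.h>

/* A "brainless" DMA channel: on trigger, copy *src to *dst. */
typedef struct {
    const volatile uint8_t *src;  /* bank register for this window  */
    volatile uint8_t *dst;        /* output port driving PRG address */
} dma_channel_t;

/* External 2-to-4 decoder: CPU A14..A13 in, one-hot select out. */
static uint8_t decode_2to4(uint16_t cpu_addr)
{
    return (uint8_t)(1u << ((cpu_addr >> 13) & 0x3u));
}

/* Fire every channel whose decoder output just rose (rising edge only). */
static void bus_update(const dma_channel_t ch[4],
                       uint8_t prev_sel, uint8_t new_sel)
{
    assert((new_sel & (new_sel - 1u)) == 0);  /* at most one line high */
    uint8_t rising = (uint8_t)(new_sel & ~prev_sel);
    for (int i = 0; i < 4; i++) {
        if (rising & (1u << i)) {
            *ch[i].dst = *ch[i].src;
        }
    }
}
```

Because a channel only fires on a rising edge, a register write that lands while the CPU is already inside that window would not reach the port until the window is re-entered; a real design would need to handle that case separately.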
Trirosmos
Posts: 50
Joined: Mon Aug 01, 2016 4:01 am
Location: Brinstar, Zebes

Re: Brainstorming using Microcontroller as mapper

Post by Trirosmos »

Ben Boldt wrote: Tue Nov 17, 2020 7:30 pm I have never used any DMA features of these chips before, maybe there is more to it than I realize??
As I understand it, the "normal" DMAs are just your typical get a trigger, transfer some bytes type of thing. I think they can do some kinds of unpacking and do different increments at the source and destination, but that's about it.

The MDMA on the STM32H7 micros, however, is a lot more powerful.
Ben Boldt wrote: Tue Nov 17, 2020 7:30 pm I guess theoretically there wouldn't be any stack operations? I think basically all of the latency comes from stack stuff when using interrupts.
Yeah, that and waiting for the flash memory to respond as well as having to go through different bus bridges to access GPIO.
Ben Boldt
Posts: 1148
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: Brainstorming using Microcontroller as mapper

Post by Ben Boldt »

I may have to procure a demo board of said H7 series ST micro... ;) Heck, maybe I have one, I should dig around first.