It is currently Thu Oct 19, 2017 12:18 am

All times are UTC - 7 hours





Post new topic Reply to topic  [ 17 posts ]  Go to page 1, 2  Next
Author Message
 Post subject: Gameboy FPGA Cartridge
PostPosted: Sat Oct 15, 2016 7:15 am 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
Hi,

this is a project I have been working on in my spare time for a while now.
The plan is to build a cart that can run all MBC1-5 games including RTC support and learn something about PCB layout, SMD soldering and FPGA programming. This is not really my first experience in that direction, but the first 'serious' project that puts everything together to do something useful.

The current design includes a MachXO2 FPGA, 16MiB SDRAM, 5V↔3.3V level shifters, RTC+battery, and a MicroSD socket. This is a compromise between fitting everything in the limited space, keeping it solderable by hand, and still including all planned features.

The layout and assembly of the first prototype is done and I am quite happy with the design as I have not (yet) encountered any major bugs.

Partially assembled prototype:
Image

Fully populated and cleaned PCB with connectors for JTAG, I2C, serial and SPI:
Image

And (to my surprise) it actually works: (running a small test from the FPGA embedded memory)
Image

What works:
-GB bus interface with read and write cycles
-Running code from the internal ~8KiB BRAM
-Accessing the RTC
-Low level (block level) SD card interface (some cards still fail initialisation)
-Simple SDRAM controller


After resolving many small and some bigger issues, for example figuring out the quirks of the GB bus cycle, SD card initialisation, or the weird I2C controller in the FPGA,
everything needed to run 32KiB ROM-only/MBC1 games is working. The gameboy boots from the FPGA BRAM, copies 64 sectors (64*512byte) to the SDRAM, switches the lower 32KiB from BRAM to SDRAM and then jumps to the rom at 0x0100:
Image

While this is the bare minimum needed to run small games, there is still quite some missing stuff to make the thing useful:
-MBC emulation
-High level SD card interface (FAT32)
-Savegames
-DMA for fast memory transfers (using the GB cpu, shuffling the Tetris rom byte-by-byte from the card to ram takes ~10s, large games would take minutes...)
-Proper bootloader
-All the important things I completely forgot about

MBC support (without RTC) should be straightforward and would run most games, so that will be next. Stay tuned.


Top
 Profile  
 
PostPosted: Sat Oct 15, 2016 12:37 pm 
Offline

Joined: Fri Oct 16, 2015 6:18 am
Posts: 39
Pretty neat! I know from experience that it's not easy to fit much functionality on a small Game Boy cart-sized board. I'm a bit jealous of the MachXO2 because my experiments are still using ispMACH 4000 and the crappy ispLEVER software :/
Some thoughts (sorry if I'm pointing out very obvious things):

It's difficult to see from the pictures, but do you have unused inputs on your level shifters? You probably know this, but CMOS inputs should in general not be left floating.

Quote:
"DMA for fast memory transfers (using the GB cpu)"

On pre-CGB devices the only DMA available is OAM DMA which will not help you. Can't you move bytes around with the FPGA faster than the GB CPU? If your level shifters are 3-state, you can isolate the GB from the FPGA/SDRAM and just move things around as fast as possible.

Speaking of 3-stating the level shifters, you can also reset the GB by pulling RST low. The Game Boy includes a pull-up resistor, so personally I've simply used a 5V-tolerant open-collector buffer to handle this. I can't vouch 100% that it won't destroy your Game Boy, but I haven't had any problems. My method is simply this: RST low, 3-state all GB signals, write memories, re-enable GB signals, RST high.

Are your level shifters dual supply or just 5V-tolerant? I'm pretty sure a 5V-tolerant buffer in the data bus wouldn't be 100% reliable because it can't reliably drive highs in the full 5V CMOS range used by the Game Boy. In practice this might not be a problem, but random issues are very annoying and hard to debug.

I see that CLK/PHI pin from the Game Boy is connected. Are you aware that it is stopped sometimes (e.g. HALT mode, STOP mode)?


Top
 Profile  
 
PostPosted: Sat Oct 15, 2016 3:00 pm 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
I implemented MBC5 emulation and optimised the code enough to load Pokemon Blue in ~20s instead of >5 minutes. It plays the intro and then (of course, why would I have expected something else) crashes.
Whelp, I probably messed up the bank switching somewhere. I should check what the game does in the emulator...


Quote:
I'm a bit jealous of the MachXO2 because my experiments are still using ispMACH 4000 and the crappy ispLEVER software :/

Hehe, I probably checked every FPGA family I could find and the MachXO2 was the only one that really fits the bill (small package with enough I/O-pins, enough logic, single supply, embedded config memory and enough embedded memory to boot from, PLL... you get the idea). Of course not everything on that list is really mandatory, but the smaller (feature-wise) chips I found did not really seem to be worth the trouble.
My design currently uses about 25% of the FPGA and the MACH 4000 looks relly tiny in comparison. How much did you manage to fit in the CPLD?

Quote:
It's difficult to see from the pictures, but do you have unused inputs on your level shifters? You probably know this, but CMOS inputs should in general not be left floating.

That is true. I do have 4 unused and unconnected inputs on one level shifter (74LVT16245B) which is fine according to the datasheet because "Bus hold data inputs eliminate need for external pull-up resistors to hold unused inputs", but they are directly adjacent to a GND trace so connecting them would be trivial.

Quote:
On pre-CGB devices the only DMA available is OAM DMA which will not help you. Can't you move bytes around with the FPGA faster than the GB CPU? If your level shifters are 3-state, you can isolate the GB from the FPGA/SDRAM and just move things around as fast as possible.

That was probably a bit unclear. By DMA I mean moving around the data without involving the GB CPU. The SDRAM is sitting on its own isolated bus anyway and is idle most of the time, so I just wanted to squeeze some cycles in there. The GB should have absolute priority, but the CLK pin makes it easy to anticipate the next access.

I will need something like this anyway to move savegames to permanent storage, as ROM + Save RAM are stored in the SDRAM and battery backing that is not feasible. I'm not too sure how reliable this is if the GB can be turned off at any moment, but I'm not there yet anyway.

Quote:
Speaking of 3-stating the level shifters, you can also reset the GB by pulling RST low.

There is an open-collector (or rather open-drain) transistor on the bottom right connected to RST. Resetting the GB works well, but I don't really need it at the moment.

Quote:
Are your level shifters dual supply or just 5V-tolerant? I'm pretty sure a 5V-tolerant buffer in the data bus wouldn't be 100% reliable because it can't reliably drive highs in the full 5V CMOS range used by the Game Boy. In practice this might not be a problem, but random issues are very annoying and hard to debug.

The shifters are single supply 74LVT16245B. The 3.3V are indeed slightly too low for proper CMOS high levels, but I haven't encountered any issues so far. At a glance, the 74ALVC164245 could be an option. Do you have suggestions?

Quote:
I see that CLK/PHI pin from the Game Boy is connected. Are you aware that it is stopped sometimes (e.g. HALT mode, STOP mode)?

Yes. Everything runs off the 40MHz oscillator next to the MicroSD socket. Some signals are unused but connected in case I need them later and because there were still unused FPGA pins. Now I kind of wished I had added more than one LED instead of connecting all 4 SDIO data pins :roll: .


Top
 Profile  
 
PostPosted: Sat Oct 15, 2016 7:43 pm 
Offline

Joined: Tue Nov 23, 2004 9:35 pm
Posts: 615
This looks like a potential EverDrive GB competitor. The RTC is a feature the EDGB does not have. The EDGB writes to flash, not to SDRAM, when loading a game. Writing to SDRAM should be much faster than flash, especially when you take the GBC's double-speed and DMA modes into effect. It should also consume less power because the chip does not need to flash itself.

To write a 4MB game (the largest game that will work with a DMG) on a DMG takes 75 seconds, including the time it takes to erase the flash (about 15 seconds). Can you improve on that even after removing the time it takes to erase the flash with SDRAM?

There are lots of 4MB Game Boy Color games, but only one 8MB game, Densha de Go! 2.

Pokemon Blue uses MBC3.

_________________
Nerdly Pleasures - My Vintage Video Game & Computing Blog


Last edited by Great Hierophant on Sun Oct 16, 2016 7:15 am, edited 2 times in total.

Top
 Profile  
 
PostPosted: Sun Oct 16, 2016 12:59 am 
Offline

Joined: Fri Oct 16, 2015 6:18 am
Posts: 39
Quote:
My design currently uses about 25% of the FPGA and the MACH 4000 looks relly tiny in comparison. How much did you manage to fit in the CPLD?

I'm actually using the CPLD just as a research tool and usually as a massive shift register, so the features or the size haven't been a problem. In my GB cart design I use a PIC microcontroller because the cart is for a special purpose that doesn't need strict timing requirements.

Quote:
That is true. I do have 4 unused and unconnected inputs on one level shifter (74LVT16245B) which is fine according to the datasheet because "Bus hold data inputs eliminate need for external pull-up resistors to hold unused inputs", but they are directly adjacent to a GND trace so connecting them would be trivial.


Yes, bus hold removes this problem, but note that it's only in the parts with a H in the name after the logic family (e.g. 74LVTH16245B). Overall it's not a big problem, but will affect noise and power consumption. I recently did a Game Boy LCD bivertion test involving different inverter chips, and a simple 6-bit inverter required milliamps when only 2 bits were connected and 4 bits were floating, while the same chip with the unused 4 bits connected to GND only required microamps. The supply current was around x100 compared to when 4 out of 6 inputs were floating. This opened my eyes to the issue of floating CMOS inputs, and I've tried to be more careful in my own designs since then.

Quote:
The shifters are single supply 74LVT16245B. The 3.3V are indeed slightly too low for proper CMOS high levels, but I haven't encountered any issues so far. At a glance, the 74ALVC164245 could be an option. Do you have suggestions?

My design uses 74ALVC164245 and 74LVC4245A, although I recently changed the 16-bit buffer into a single-supply one, because it's used in the address bus and in my case it's not bidirectional. The data bus has the only output pins in my design, so it still has the dual supply chip. In your case 74ALVC164245 could be fine, although AFAIK a bus hold version of that chip doesn't exist in NXP's product lines. At a quick glance TI might have 16-bit dual supply chips with bus holds, so you could take a look at their products as well. Price is a bit higher (I originally chose NXP because of the price :P), but the difference might not be that big.

BTW, do you have any plans to open source the design? And what's the smallest pitch? If you decide to open source at some point, I'd love to build one myself from the parts.


Top
 Profile  
 
PostPosted: Sun Oct 16, 2016 2:33 pm 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
Pokemon still crashes after the intro, but the emulator doesn't show anything interesting at that point. It switches rom banks all the time without issues and I verified the mapping by dumping the game after it was loaded. Why does it always have to be weird bugs like that... :?


Quote:
To write a 4MB game (the largest game that will work with a DMG) on a DMG takes 75 seconds, including the time it takes to erase the flash (about 15 seconds). Can you improve on that even after removing the time it takes to erase the flash with SDRAM?

Copying 4MB with the GB CPU in single speed mode from microSD to SDRAM should take about 80 seconds, but the FPGA should be able to do it in a few seconds, depending on the SD card clock and how much of the transfer is managed in hw.

Quote:
Pokemon Blue uses MBC3.

All(?) european versions of Red/Blue use MBC5.

Quote:
In my GB cart design I use a PIC microcontroller because the cart is for a special purpose that doesn't need strict timing requirements.

Looks interesting. What are the specs? Do you use 5V chips or are there two level shifters at the bottom and no RAM?
The first idea for my cart included a STM32 uC and USB, but I dropped that idea due to space constraints.

Quote:
I recently changed the 16-bit buffer into a single-supply one, because it's used in the address bus and in my case it's not bidirectional. The data bus has the only output pins in my design, so it still has the dual supply chip.

Good point. The data bus is also the only bidirectional port in my design, so replacing only one of them should suffice.

Quote:
BTW, do you have any plans to open source the design? And what's the smallest pitch? If you decide to open source at some point, I'd love to build one myself from the parts.

I probably will at some point, but no definite plans so far. First I want to get it working and then maybe a second hw revision. It also depends on my free time and motivation...
The smallest pitch is 0.5mm for the FPGA and level shifters and the smallest parts are 0603.


Top
 Profile  
 
PostPosted: Sun Oct 16, 2016 7:02 pm 
Offline

Joined: Tue Nov 23, 2004 9:35 pm
Posts: 615
You are right, all non-English European versions of Red & Blue use MBC5, all US (and other English speaking countries) versions of Red and Blue and all Japanese versions of Red, Green, Blue and Yellow use MBC3.

Good luck with getting the FPGA to transfer the ROM to the SDRAM and your bankswitching. Maybe you could implement Wisdom Tree's mapper!

_________________
Nerdly Pleasures - My Vintage Video Game & Computing Blog


Top
 Profile  
 
PostPosted: Tue Oct 18, 2016 4:05 pm 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
I narrowed the crashes down to writes to the save RAM, but still have no idea what goes wrong. ROM and RAM access is virtually identical and everything works during simulation, so something weird is happening or I have a very stupid bug...
Disabling SRAM writes 'fixes' the problem and has some interesting effects on some games :) .

I also managed to get GBC games running without issues (except for the SRAM stuff...). I kind of expected that my memory interface would be too slow for double speed mode, as one read access takes 170ns best case and up to 60ns more if it overlaps with a refresh cycle.
Now the question is what the bus timing in single and double speed mode looks like and when does the GB sample the bus. For single speed mode I used these timings, but the sampling on the falling CLK edge is only a guess. Maybe I should hook up my logic analyzer and increase the delay until I hit that point, but apparently the bus is sampled later in the cycle (at least in double speed mode) or my memory controller would not work. Maybe it is the last rising edge of the 4MHz/8MHz clock in a cycle?


Top
 Profile  
 
PostPosted: Tue Oct 18, 2016 4:17 pm 
Offline

Joined: Mon Jul 02, 2012 7:46 am
Posts: 759
Rüdiger wrote:
Disabling SRAM writes 'fixes' the problem and has some interesting effects on some games :) .


IIRC, SRAM is supposed to be write-disabled on the MBC5 at boot, and only enabled by writing 0x0A to 0000-1FFF. That might do it.


Top
 Profile  
 
PostPosted: Wed Oct 19, 2016 1:13 am 
Offline

Joined: Fri Oct 16, 2015 6:18 am
Posts: 39
Quote:
Looks interesting. What are the specs? Do you use 5V chips or are there two level shifters at the bottom and no RAM?


It's basically just a 3.3V 32Kx8 SRAM + level shifters. USB provides the power (with regulation 5V -> 3.3V), so the Game Boy 5V is only used for a LED. The PIC controller provides USB connectivity, isolates the SRAM for writing, and resets the Game Boy. In my use case USB is always connected so it can be used for power, volatility is ok so SRAM instead of flash is cheaper and simpler, and MBC support is not needed so no banking or big memory is needed.

Quote:
For single speed mode I used these timings, but the sampling on the falling CLK edge is only a guess


You might be interested in my timing charts as well:

Read: http://gekkio.fi/files/mooneye-gb/night ... timing.png
Write: http://gekkio.fi/files/mooneye-gb/night ... timing.png

Note that these timings are from an SGB2, and later models apparently have slightly different timings. The main structure is the same, but I think I remember someone saying that AGB for example doesn't pulse the chip select signals (A15, CS/MREQ) on each cycle.

I also have 95% sure confirmation that Game Boy samples the data bus on rising CLK (in these charts right at the end). This was tested on an SGB unit.


Top
 Profile  
 
PostPosted: Sat Oct 22, 2016 5:29 pm 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
As it turns out, the problem was caused by a stupid mistake in the address handling. Instead of mapping the ROM to 0x000000 and the SRAM to 0x800000, the highest bit got lost, causing all writes to SRAM to overwrite parts of the ROM instead... :oops:
I spend some time cleaning up and optimising the HDL code. The design runs at 80MHz, and depending on the phase of the moon the synthesis result is reported to be too slow and indeed does not run properly. It should be possible to get the max clock to maybe 100MHz, but using higher speed grade chips from the beginning would have been nice :roll: .

Quote:
Note that these timings are from an SGB2, and later models apparently have slightly different timings. The main structure is the same, but I think I remember someone saying that AGB for example doesn't pulse the chip select signals (A15, CS/MREQ) on each cycle.

I use a falling edge on CS/A15 to detect the start of a memory access, so they seem to be high at the start of each cycle.

Quote:
I also have 95% sure confirmation that Game Boy samples the data bus on rising CLK (in these charts right at the end). This was tested on an SGB unit.

Nice, that should leave plenty of room even in double speed mode.


Top
 Profile  
 
PostPosted: Wed Nov 30, 2016 10:57 am 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
Quick update:
I added persistent saves a while back and played halfway through Link's Awakening without issues. Pushing a button on the cartridge resets the gameboy and the loader then writes the save back to the SD card.

The cartridge now runs the memory at 100MHz and everything else at 50MHz to alleviate the timing issues. Some changes still cause weird errors so I assembled a second cartridge with a faster FPGA. This actually seems to help, but I don't trust it until all missing features are implemented.
I tested some more games and while all of them run in principle, some end up with garbled graphics because HDMA transfers from the cartridge don't work. I suspect that the HDMA bus cycles are slightly different, but I haven't confirmed it yet. And the SGB doesn't even boot from the cartridge :cry:.


Top
 Profile  
 
PostPosted: Thu Feb 23, 2017 3:05 pm 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
The project isn't dead, I just have been busy with work, but I still made some progress.

I have completely reorganized the firmware and rewritten several parts of it. Instead of one giant state machine with 20+ states which handles everything, the blocks are now connected to a wishbone bus and the GB interface just maps part of the wishbone address space to the gameboy address space. This has made the whole thing much more modular and got rid of some architectural issues. The faster FPGA apparently also helped and I haven't encountered any more strange stability issues after changing unrelated parts of the code. The WB bus has an arbiter to connect more than one master to the bus. At the moment, this is the GB interface and the DMA controller, but it is simple to expand in the future (external µC, debug port, ...).

A new DMA controller now shovels bytes around without involving the GB CPU. Reading from the SD card now reaches ~320KiB/s including file system overhead, which is still done on the CPU. I was hoping for something in the 0.5-1 MiB/s range, but it's still several times faster than using the GB CPU.

I have rewritten the bootloader and now use C instead of Asm+Forth. This increased the binary size, but I managed to squeeze everything into the 8KiB ROM. The bootloader also gained FAT32 support and supports the bare minimum for loading a file from the SD card and executing it. This is used to load a 'proper' loader which then selects the game.

Unfortunately, there are still some issues: some games crash in completely reproducible ways, so I suspect that there is still a bug in the MBC emulation (not present in the old fw). Switching to double speed mode crashes. The problem seems to be in the GB interface, but that part is virtually identical to the old firmware, which has no such issue. And some consoles still don't work. For the last two issues, I will probably have to write a new GB interface and look at the waveforms on the bus to figure out what is going wrong. I would already have done it, but my logic analyzer doesn't have nearly enough channels... :roll:


Top
 Profile  
 
PostPosted: Mon Aug 28, 2017 11:51 am 
Offline

Joined: Fri Oct 16, 2015 1:14 pm
Posts: 10
Finally finished the improved r2 layout and assembled the first board.
The new layout adds a microcontroller, USB, switching regulator, RGB status led, and many small improvements.

The biggest change is the microcontroller, which does all the heavy lifting instead of the GB CPU.
The microcontroller and FPGA can be programmed via USB, so no more fiddling with the JTAG adapter and
programming the FPGA now only takes 3s instead of >30.

I'm now adapting the FPGA firmware to the new hardware. Many of the firmware blocks are now obsolete, making
the firmware more compact. But I still have to write a new interface between uC and FPGA to load files into memory.

Image


Top
 Profile  
 
PostPosted: Mon Aug 28, 2017 1:53 pm 
Offline
User avatar

Joined: Mon Apr 04, 2011 11:49 am
Posts: 1905
Location: WhereverIparkIt, USA
Looking good!

Out of curiosity, what mcu did you go with?

_________________
If you're gonna play the Game Boy, you gotta learn to play it right. -Kenny Rogers


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 17 posts ]  Go to page 1, 2  Next

All times are UTC - 7 hours


Who is online

Users browsing this forum: No registered users and 1 guest


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group