Verilog MBC5

NovaSquirrel
Post by NovaSquirrel »

Code:

module MBC5
  (
    input not_reset,
    input not_cs,
    input not_wr,
    input [7:0] data,          // data bus value
    input [3:0] address,       // top 4 bits of address
    output [8:0] out_rom_bank, // 9-bit ROM bank to use
    output reg [3:0] ram_bank, // 4-bit RAM bank to use
    output out_ram_enable      // active low RAM enable
  );
  reg [8:0] rom_bank;        // currently selected ROM bank
  reg ram_enable;            // is RAM enabled?

  always@(posedge not_wr or negedge not_reset) begin // may need negedge instead?
    if (!not_reset) begin
      ram_enable <= 0;
      rom_bank <= 1;
      ram_bank <= 0;
    end
    else if(!not_wr) begin // not_wr is already 1 on its rising edge, so this test is false whenever the block is clocked
      case(address) // select based on top 4 bits of address
        0, 1:
          ram_enable <= data == 8'h0a;
        2:
          rom_bank[7:0] <= data;
        3:
          rom_bank[8] <= data[0];
        4, 5:
          ram_bank <= data[3:0];
        // anything 6 or above ignored
      endcase
    end
  end

  // select fixed bank or switchable bank
  assign out_rom_bank = address[2] ? rom_bank : 0;
  // select external RAM at 0xA000 - 0xBFFF
  assign out_ram_enable = !(ram_enable && address[3:1] == 3'b101);
endmodule
I don't know whether the lack of >32KB Game Boy boards (is there one? Tepples keeps mentioning it) comes from a lack of open MBC implementations or from something else, but I really doubt it's a cost problem, because the parts needed to implement a mapper are so cheap. In any case, here's a public domain MBC5 I wrote that works in simulation, at least. I'm not entirely confident that the clocking is correct or what the reset values should be, but it should hopefully need only minor changes. It easily fits in a Xilinx XC9536XL, using 24 of the 36 macrocells.
Last edited by NovaSquirrel on Tue Jun 19, 2018 1:27 pm, edited 1 time in total.
lidnariq

Re: Verilog MBC5

Post by lidnariq »

Of all the MBCs, even the MBC5 is really quite simple; it's almost something you can build in discrete 74xx logic (QFN parts) and fit inside a DMG shell.

My hunch is that what's holding back DMG carts is:
1- People like DIP; DIP is mostly a nonstarter to fit inside existing DMG shells.
2- People mostly don't have nostalgia for the original DMG games (which were mostly 5V-friendly, 512 KiB and smaller) but instead for the larger GBC carts, and then you have the problem of fitting a 3V ROM and logic translation inside a shell.
tepples

Re: Verilog MBC5

Post by tepples »

Catskull's GB flash cart is 32K. That's not quite big enough for even a game of similar scope to the first Super Mario Land. There's plenty of room for homebrew to grow from 64K to 512K as the scene grows. And some GBC-era games are still 512K or smaller: Elmo in Grouchland, for instance, is 256K of game plus 768K of irrelevant padding.

The equivalent of UNROM/UOROM would be about two chips: a latch and a quad OR. This would allow up to 256K. And the presence of separate read and write signals on the GB cart edge means bus conflicts might be easier to avoid, allowing the cart to present a port at $2000 that allows use of a ROM on both an MBC (for emulators) and the discrete hardware. The MBC would decode the port at only $2000-$3FFF or $2000-$2FFF, with the rest devoted mostly to SRAM selection, while the simplified hardware decodes the port across all of $0000-$7FFF.
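
A rough behavioral sketch of that idea in Verilog, for discussion only. The module and signal names are made up here, the 4-bit width is picked to match the 256K figure (16 banks of 16K), and the layout mirrors the MBC convention of bank 0 fixed at $0000-$3FFF. It does not try to match any particular gate-level choice of latch and OR gates, and latching on the rising edge of /WR is an assumption carried over from the module at the top of the thread.

```verilog
// Hypothetical sketch: one bank latch visible across all of $0000-$7FFF.
module simple_bank_latch (
    input        not_wr,   // /WR from the cart edge, active low
    input        a15,      // low for any ROM-area access
    input        a14,      // low = fixed bank window, high = switchable window
    input  [3:0] data,     // D3..D0
    output [3:0] rom_bank  // to ROM A17..A14: 16 banks x 16K = 256K
);
  reg [3:0] bank;

  // A plain latch has no reset; zeroed here only so simulation starts defined.
  initial bank = 4'd0;

  // Capture the bank on the rising (trailing) edge of /WR for any write with A15 low.
  always @(posedge not_wr)
    if (!a15)
      bank <= data;

  // Bank 0 at $0000-$3FFF, latched bank at $4000-$7FFF.
  assign rom_bank = a14 ? bank : 4'd0;
endmodule
```

Because the latch responds anywhere in $0000-$7FFF, software that only ever writes its bank number to $2000 would behave the same on this sketch as on an MBC that decodes the port more narrowly.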
lidnariq

Re: Verilog MBC5

Post by lidnariq »

tepples wrote: The MBC would decode the port at only $2000-$3FFF or $2000-$2FFF, with the rest devoted mostly to SRAM selection, while the simplified hardware decodes the port across all of $0000-$7FFF.
That really doesn't actually save you much: just decoding $2000-$2FFF vs all of $0000-$7FFF is 4 ICs vs 3 ICs.
Limit the size to 256 KiB and that only takes 2 ICs.
Drop standard MBC 16F+16 banking and you can get down to 1 IC and back up to 512 KiB. (And as memblers pointed out with GTROM, one can emulate 16F+16 at programming time given a double-size ROM anyway).

I actually already have an EAGLE schematic/board for the four-74xx version. It fits in the standard cheap (5cm)² bulk-orderable PCBs. I never bothered to have it made because the routing was ugly... and now you can't buy M29F160s anymore.

(edit) Gekkio says that in the original DMG, /CS arrives ½ master cycle after A14, so I should see if that's true in GBC/GBA also. If so, I could get away with just a 74'161 for a super-cheap self-flashable option...
qwertymodo

Re: Verilog MBC5

Post by qwertymodo »

A couple of issues I see from my own testing: the ROM bank should be set to 1 on reset, not 0, and the RAM enable decoding isn't quite right. You forgot to include the not_cs input, and the decoding actually only cares that A14 is low (rather than A15:13 = 101).

Code:

assign out_ram_enable = not_cs || !ram_enable || address[2];
lidnariq

Re: Verilog MBC5

Post by lidnariq »

qwertymodo wrote: the decoding actually only cares that A14 is low (rather than A15:13 = 101).
Per what Gekkio said, /CS is asserted for the memory range from $A000-$FDFF. On the DMG, it's specifically asserted after the first half master cycle, for the remaining 3.5 master cycles.

Other than that timing difference, at least on DMG, there's no functional difference between (/CS=0 AND A14=0) and (A[15..13]='b101).

Obviously Gekkio's data only concerns the DMG; the GBC and GBA's timing and ranges need to be compared.

... also, those linked timing graphs are subtly different from their graphs in their Complete Technical Reference.
gekkio

Re: Verilog MBC5

Post by gekkio »

lidnariq wrote: ... also, those linked timing graphs are subtly different from their graphs in their Complete Technical Reference.
Yeah, trust GBCTR in this case ;) The difference is naming (CS vs MREQ) and in GBCTR the graphs continue for one more clock edge to illustrate the fact that some signals are deasserted (or "reset" to default state) after the machine cycle.
Even if the intent is to read a 16-bit value, the CPU deasserts the relevant control signals between those reads, so it really is two 8-bit reads in sequence rather than a "16-bit read".
For example, if it reads a 16-bit value from $1234, A15 goes high for one clock cycle in the middle.

Note that OAM DMA is different and I don't have graphs for that right now. But IIRC basically it sets all signals on the first rising edge of the first actual OAM DMA machine cycle and further cycles just increment the address without pulsing any control signals.

CS *must be* included in chip select handling for external RAM, because the address bus can have a value with A[15..13] = 101 even when the data bus shouldn't be driven. (+ the default state of RD is low, so you can't really rely on that)
lidnariq

Re: Verilog MBC5

Post by lidnariq »

I was going to separately call out this link where calima says that bennvenn said that /CS doesn't pulse on every memory cycle on the GBA. Be nice if I could find an original post to cite instead.

And, of course, Gekkio mentions the same thing in the post I linked to just above.


gekkio wrote: CS *must be* included in chip select handling for external RAM, because the address bus can have a value with A[15..13] = 101 even when the data bus shouldn't be driven. (+ the default state of RD is low, so you can't really rely on that)
... Wait, what? How does that happen?
gekkio wrote: For example, if it reads a 16-bit value from $1234, A15 goes high for one clock cycle in the middle.
... I need to get better at reading.
qwertymodo

Re: Verilog MBC5

Post by qwertymodo »

lidnariq wrote: Other than that timing difference, at least on DMG, there's no functional difference between (/CS=0 AND A14=0) and (A[15..13]='b101).
I'm just telling you what I have found from my MBC5 test bench. A15 and A13 are ignored by the /RAMCS decoder.

Code:

Enabling RAM
Loading address $0000, /CS LOW
SRAM /CS Status: LOW
Loading address $1000, /CS LOW
SRAM /CS Status: LOW
Loading address $2000, /CS LOW
SRAM /CS Status: LOW
Loading address $3000, /CS LOW
SRAM /CS Status: LOW
Loading address $4000, /CS LOW
SRAM /CS Status: HIGH
Loading address $5000, /CS LOW
SRAM /CS Status: HIGH
Loading address $6000, /CS LOW
SRAM /CS Status: HIGH
Loading address $7000, /CS LOW
SRAM /CS Status: HIGH
Loading address $8000, /CS LOW
SRAM /CS Status: LOW
Loading address $9000, /CS LOW
SRAM /CS Status: LOW
Loading address $A000, /CS LOW
SRAM /CS Status: LOW
Loading address $B000, /CS LOW
SRAM /CS Status: LOW
Loading address $C000, /CS LOW
SRAM /CS Status: HIGH
Loading address $D000, /CS LOW
SRAM /CS Status: HIGH
Loading address $E000, /CS LOW
SRAM /CS Status: HIGH
Loading address $F000, /CS LOW
SRAM /CS Status: HIGH
lidnariq wrote: I was going to separately call out this link where calima says that bennvenn said that /CS doesn't pulse on every memory cycle on the GBA. Be nice if I could find an original post to cite instead.
Here's a logic trace of OAM DMA that gekkio sent me that supports this. Normal memory accesses do seem to pulse /CS on each access though.

[Image: logic trace of OAM DMA]
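
For anyone who wants to compare that sweep against a candidate decode in simulation, here is a small sketch. It is not the test bench that produced the log above; it only evaluates the expression suggested earlier in the thread (/CS, the RAM enable register, and A14) over the same sixteen addresses. The module name and signal names are made up for illustration.

```verilog
// Sweep A15..A12 with /CS held low and RAM enabled, printing the decoded SRAM /CS.
module ramcs_sweep;
  reg        not_cs;      // /CS from the cart edge, active low
  reg        ram_enable;  // stand-in for the MBC's RAM enable register
  reg  [3:0] address;     // A15..A12
  integer    i;

  // Decode under test: only /CS, the enable register, and A14 matter.
  wire sram_not_cs = not_cs || !ram_enable || address[2];

  initial begin
    ram_enable = 1'b1;
    not_cs     = 1'b0;
    for (i = 0; i < 16; i = i + 1) begin
      address = i[3:0];
      #1; // let the combinational decode settle before sampling it
      if (sram_not_cs)
        $display("Loading address $%h000, /CS LOW -> SRAM /CS Status: HIGH", address);
      else
        $display("Loading address $%h000, /CS LOW -> SRAM /CS Status: LOW", address);
    end
    $finish;
  end
endmodule
```

Since address[2] is A14, this expression reports LOW for $0000-$3FFF and $8000-$BFFF and HIGH elsewhere, which is the same pattern as the log.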
gekkio

Re: Verilog MBC5

Post by gekkio »

lidnariq wrote: ... Wait, what? How does that happen?
There are a couple of possible scenarios, but to be fair, using A[15..13] probably wouldn't lead to anything actually bad. It's more about principles: there's a chip select signal (CS or A15 depending on the memory area), and when it's high, you're not supposed to drive the bus. On GB there won't be any other device with A[15..13] = 101 that would drive the bus at the same time, but one practical thing about not using CS is that you might end up doing some unnecessary extra switching (-> extra power consumption).

In normal conditions (= normal GB unit), the CPU chip always drives the address bus, so there is a value there even outside memory accesses.
In special conditions (e.g. CPU soldered to my test bench), the address bus is 3-state and things other than addresses may appear there, such as intermediate 16-bit values during some CPU instructions.

So, using A[15..13] is probably fine, but using /CS=0 AND A14=0 is the technically correct way and arguably as easy or even easier than doing the wrong thing.
tepples

Re: Verilog MBC5

Post by tepples »

Is failure to take /CS into account the cause of the inc de bug while DE points at OAM ($FE00-$FE9F)? If so, that might inform your design.
lidnariq

Re: Verilog MBC5

Post by lidnariq »

qwertymodo wrote: Here's a logic trace of OAM DMA that gekkio sent me that supports this. Normal memory accesses do seem to pulse /CS on each access though.
Ah. Thank you.
gekkio wrote: It's more about principles: there's a chip select signal (CS or A15 depending on the memory area), and when it's high, you're not supposed to drive the bus.
Thinking about your CTR diagrams some more, it's pretty clear that "A15" is really "/LowMemorySelect" and "/CS" is really "/HighExternalMemorySelect". That /LowMemorySelect happens to be almost identical to A15 doesn't really change what it is, similar to how the Master System has both A15 and /M0-7 on its cartridge connector.

(edit: Even more so, when you consider that the Power Base Converter used the same signal from the Genesis for both when running SMS games. Given the shared 8080-ish ancestry, it makes me inclined to personally rename the two signals on the DMG's card edge to be /M0-7 and /MA-E)
gekkio wrote: On GB there won't be any other device with A[15..13] = 101 that would drive the bus at the same time, but one practical thing about not using CS is that you might end up doing some unnecessary extra switching (-> extra power consumption).
For example, reading from or writing to $2000 would look like a read from $A000 during the first 1 or ½ CLK respectively...
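
A toy simulation of that case, as a sketch only. The 5-unit delays are arbitrary placeholders rather than measured bus timing, the RAM enable register is assumed to be set, and the module and wire names are invented for illustration. It just shows an address-only decode selecting the RAM while A15 is still high at the start of a $2000 access, where the /CS-based decode stays quiet.

```verilog
// Compare an address-only RAM decode with a /CS + A14 decode during a $2000 access.
module decode_glitch_demo;
  reg [3:0] address;  // A15..A12
  reg       not_cs;   // /CS from the cart edge, active low

  wire sel_by_addr = (address[3:1] == 3'b101);  // active high: A[15..13] = 101
  wire sel_by_cs   = !not_cs && !address[2];    // active high: /CS low and A14 low

  initial begin
    $monitor("t=%0t A15..A12=%b /CS=%b addr-only=%b cs+a14=%b",
             $time, address, not_cs, sel_by_addr, sel_by_cs);

    // Start of an access to $2000: A13 is valid but A15 has not gone low yet,
    // so the top bits read 101x. The address-only decode selects the RAM here.
    address = 4'b1010; not_cs = 1'b1;

    // Later in the cycle A15 drops and the bus shows $2000. Neither decode selects.
    #5 address = 4'b0010;

    // For contrast, a genuine access to $A000 with /CS asserted. Both decodes select.
    #5 address = 4'b1010; not_cs = 1'b0;

    #5 $finish;
  end
endmodule
```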
Last edited by lidnariq on Thu Jun 21, 2018 6:56 pm, edited 1 time in total.
qwertymodo

Re: Verilog MBC5

Post by qwertymodo »

lidnariq wrote:
qwertymodo wrote: Here's a logic trace of OAM DMA that gekkio sent me that supports this. Normal memory accesses do seem to pulse /CS on each access though.
Ah. Thank you.
Just to clarify, I don't believe that trace was from a GBA, but from a DMG executing OAM DMA. So maybe it isn't directly relevant to your statement, but it is an example of accessing memory without pulsing /CS, which is relevant to FRAM mods, since you can't always rely on that pulse to occur.
NovaSquirrel

Re: Verilog MBC5

Post by NovaSquirrel »

Code:

module MBC5
  (
    input not_reset,
    input not_cs,
    input not_wr,
    input [7:0] data,          // data bus value
    input [3:0] address,       // top 4 bits of address
    output [8:0] out_rom_bank, // 9-bit ROM bank to use
    output reg [3:0] ram_bank, // 4-bit RAM bank to use
    output out_ram_enable      // active low RAM enable
  );
  reg [8:0] rom_bank;        // currently selected ROM bank
  reg ram_enable;            // is RAM enabled?

  always@(posedge not_wr or negedge not_reset) begin
    if (!not_reset) begin
      ram_enable <= 0;
      rom_bank <= 1;
      ram_bank <= 0;
    end
    else if(not_cs) begin
      case(address) // select based on top 4 bits of address
        0, 1:
          ram_enable <= data == 8'h0a;
        2:
          rom_bank[7:0] <= data;
        3:
          rom_bank[8] <= data[0];
        4, 5:
          ram_bank <= data[3:0];
        // anything 6 or above ignored
      endcase
    end
  end

  // select fixed bank or switchable bank
  assign out_rom_bank = address[2] ? rom_bank : 0;
  // select external RAM at 0xA000 - 0xBFFF
  assign out_ram_enable = !(ram_enable && !not_cs && !address[2]);
endmodule
Modified version that sets the starting bank to 1, and uses not_cs, though I'm still not 100% sure the configuration registers are clocked right.
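
A minimal self-checking test bench for the module in this post, as a sketch. It assumes the MBC5 module above is compiled in the same simulation run, and the task names, delays, and test values are invented here. It only exercises the behavior as written (registers latch on the rising edge of not_wr while not_cs is high), so passing it says nothing about whether that matches real hardware timing.

```verilog
`timescale 1ns/1ps
module MBC5_tb;
  reg        not_reset, not_cs, not_wr;
  reg  [7:0] data;
  reg  [3:0] address;
  wire [8:0] out_rom_bank;
  wire [3:0] ram_bank;
  wire       out_ram_enable;
  integer    errors;

  MBC5 dut (
    .not_reset(not_reset), .not_cs(not_cs), .not_wr(not_wr),
    .data(data), .address(address),
    .out_rom_bank(out_rom_bank), .ram_bank(ram_bank),
    .out_ram_enable(out_ram_enable)
  );

  // One register write: hold address/data stable and pulse /WR low.
  task write_reg(input [3:0] a, input [7:0] d);
    begin
      address = a; data = d; not_cs = 1'b1;
      #10 not_wr = 1'b0;
      #10 not_wr = 1'b1;  // the module latches on this rising edge
      #10;
    end
  endtask

  // Count and report a mismatch between an observed and an expected value.
  task expect_eq(input [8:0] got, input [8:0] want);
    begin
      if (got !== want) begin
        errors = errors + 1;
        $display("FAIL at t=%0t: got %h, expected %h", $time, got, want);
      end
    end
  endtask

  initial begin
    errors = 0;
    not_reset = 1'b1; not_cs = 1'b1; not_wr = 1'b1;
    address = 4'h0; data = 8'h00;
    #10 not_reset = 1'b0;  // assert reset
    #10 not_reset = 1'b1;

    address = 4'h4; #1 expect_eq(out_rom_bank, 9'd1);    // bank 1 after reset
    address = 4'h0; #1 expect_eq(out_rom_bank, 9'd0);    // fixed bank window

    write_reg(4'h2, 8'h34);                              // low 8 bits of ROM bank
    write_reg(4'h3, 8'h01);                              // bit 8 of ROM bank
    address = 4'h4; #1 expect_eq(out_rom_bank, 9'h134);

    write_reg(4'h4, 8'h03);                              // RAM bank
    expect_eq({5'b0, ram_bank}, 9'h003);

    write_reg(4'h0, 8'h0A);                              // enable RAM
    address = 4'hA; not_cs = 1'b0;
    #1 expect_eq({8'b0, out_ram_enable}, 9'd0);          // selected (active low)
    address = 4'hE;
    #1 expect_eq({8'b0, out_ram_enable}, 9'd1);          // A14 high: not selected
    address = 4'hA; not_cs = 1'b1;
    #1 expect_eq({8'b0, out_ram_enable}, 9'd1);          // /CS high: not selected

    if (errors == 0) $display("PASS");
    else             $display("%0d check(s) failed", errors);
    $finish;
  end
endmodule
```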
lidnariq

Re: Verilog MBC5

Post by lidnariq »

in this post, Great Hierophant wrote: .ws - Wisdom Tree Mapper (Uses a 74'377)
Curious, I looked more closely.
lidnariq wrote: Drop standard MBC 16F+16 banking and you can get down to 1 IC and back up to 512 KiB. (And as memblers pointed out with GTROM, one can emulate 16F+16 at programming time given a double-size ROM anyway).
For reference, this appears to be Wisdom Tree's Game Boy mapper.

Writes: A~[q0.. .... bbbb bbbb] -- select 32K bank at $0000.

(q - per the description, the 74'377 appears to be connected to /WR and A14, ignoring "A15" and /CS. According to Gekkio's timing diagrams, /WR doesn't toggle during access to the internal memory regions, so the register will be at $0000-$3FFF and $A000-$BFFF)


At least, if I'm reading MAME's source correctly:

Code:

READ8_MEMBER(gb_rom_wisdom_device::read_rom)
{
        return m_rom[rom_bank_map[m_latch_bank] * 0x4000 + offset];
}

WRITE8_MEMBER(gb_rom_wisdom_device::write_bank)
{
        if (offset < 0x4000)
                m_latch_bank = (offset << 1) & 0x1ff;
}
(Yes, * 0x4000, but also << 1)


edit: MAME's source appears to be erroneous, or Wisdom Tree had multiple variations of the PCB. The PCB used by Spiritual Warfare requires that A15 stay low, not A14.
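
A behavioral Verilog sketch of the mapper as described above, under stated assumptions: the 74'377's clock is /WR, its active-low enable is A15 (per the edit; MAME's model would correspond to A14 instead), and its data inputs are the low eight address lines. The module and port names are made up for illustration, and none of this has been checked against a board.

```verilog
// Hypothetical model: a 74'377-style register that latches address bits as the bank.
module wisdom_tree_sketch (
    input            not_wr,      // /WR from the cart edge, used as the register clock
    input            not_enable,  // register enable, active low: A15 here (A14 in MAME's model)
    input      [7:0] addr_low,    // A7..A0: the bank number comes from the address, not the data
    output reg [7:0] bank         // would drive the ROM's upper address lines (32K banks)
);
  // The 74'377 has no clear input; zeroed here only so simulation starts defined.
  initial bank = 8'd0;

  // Positive-edge register with clock enable: load on rising /WR while enabled.
  always @(posedge not_wr)
    if (!not_enable)
      bank <= addr_low;
endmodule
```

With 32K banks the whole $0000-$7FFF window moves at once, which is why MAME's code multiplies by 0x4000 but also shifts the latched value left by one.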