Mapper Features for High Level Languages



qbradq
Posts: 972
Joined: Wed Oct 15, 2008 11:50 am

Mapper Features for High Level Languages

Post by qbradq »

Here's a thought or two about possible mapper implementations that would help when making higher-level languages for the NES. I hope if anyone is working on new mappers (which I am not BTW), you'll take these into consideration.

Use address lines for register state transfer, not data lines

For instance, let's say you have eight bits of state per register. A conventional mapper might use the eight data bus lines for state transfer, and A14-A12 for function selection. This allows function state transfer such as this:

lda value
sta function_address
This is convenient for assembly programmers and useful when value is dynamic. However, there's a better approach: use A0-A7 for state transfer, and A14-A12 for function selection. This allows dynamic function state transfer like this:

ldx value
sta function_address,x
And static state transfer, where value is known at assembly time, can be done in a single instruction without disturbing the contents of any register:

sta function_address+value
This allows far method calls to be done in a thread-safe and efficient manner, and will ease the pain for high-level language designers dealing with limited contiguous address space.
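
To make the decoding concrete, here is a rough C model of how such a mapper might latch state from the address bus. It is only a sketch: the `AddrMapper` type, the `addr_mapper_write` name, and the assumption that registers respond only when A15 is high ($8000-$FFFF) are illustrative choices, not part of any existing mapper.

```c
#include <stdint.h>

/* Hypothetical mapper model: A14-A12 select one of 8 functions,
   A7-A0 carry the 8 bits of state. The data bus is ignored. */
typedef struct {
    uint8_t reg[8];
} AddrMapper;

/* Model of one CPU write cycle as seen by the mapper.
   Assumption: the mapper only responds when A15 is high. */
void addr_mapper_write(AddrMapper *m, uint16_t addr, uint8_t data)
{
    uint8_t function;
    uint8_t state;

    (void)data;                      /* value on D0-D7 does not matter */
    if ((addr & 0x8000u) == 0) {
        return;                      /* not a mapper register access */
    }
    function = (uint8_t)((addr >> 12) & 0x07u);  /* A14-A12 */
    state    = (uint8_t)(addr & 0xFFu);          /* A7-A0 */
    m->reg[function] = state;
}
```

In this model, `sta $A000+5` becomes `addr_mapper_write(&m, 0xA005, anything)`, which sets function 2 to state 5 no matter what is in the accumulator.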

Random access mass data storage

The ability to access a mass storage device like a serial flash chip or SD card via a byte-wide mapper register would be ideal for loading data. Any dynamically loaded data, like level layouts, can easily be read and shoved into WRAM. This means less parallel storage is required (128 KB should be plenty for program code without the data) and more content can be placed into the game without the worry or hassle of compression and storage constraints.
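
As a rough illustration of the "read it and shove it into WRAM" idea, the following C sketch models a byte-wide storage port on a PC. Everything here is hypothetical: `StoragePort`, `storage_seek`, `storage_read_byte` and `storage_load` are made-up names, and a real mapper would expose the port as a memory-mapped register rather than a struct.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-wide storage port: every read returns the next
   byte of the stream, like a shift register fed by serial flash. */
typedef struct {
    const uint8_t *image;  /* stand-in for the flash contents */
    size_t size;           /* number of bytes in the image */
    size_t pos;            /* current read position */
} StoragePort;

/* Move the read position, clamped to the end of the image. */
void storage_seek(StoragePort *p, size_t offset)
{
    p->pos = (offset < p->size) ? offset : p->size;
}

/* Read one byte and advance; returns 0xFF past the end of the image. */
uint8_t storage_read_byte(StoragePort *p)
{
    return (p->pos < p->size) ? p->image[p->pos++] : 0xFFu;
}

/* Copy up to len bytes starting at offset into a WRAM buffer.
   Returns the number of bytes actually copied. */
size_t storage_load(StoragePort *p, size_t offset, uint8_t *wram, size_t len)
{
    size_t n = 0;

    storage_seek(p, offset);
    while (n < len && p->pos < p->size) {
        wram[n++] = storage_read_byte(p);
    }
    return n;
}
```

Loading a level layout would then be a single `storage_load` call with the layout's offset and a destination in WRAM.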

With these two points in mind, I guess you'd use high address lines for function select and A0-A2 for bank select on a given function.
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Mapper Features for High Level Languages

Post by Bregalad »

qbradq wrote: Use address lines for register state transfer, not data lines
I believe many pirate mappers do things this way.
infiniteneslives
Posts: 2104
Joined: Mon Apr 04, 2011 11:49 am
Location: WhereverIparkIt, USA

Re: Mapper Features for High Level Languages

Post by infiniteneslives »

Those things are possible within a single modern CPLD, but you have to consider that the benefit/convenience you gain comes at the cost of i/o pins. Byte-wide access to mass serial storage requires 8 i/o. It's nice to reuse those i/o for register writes and save the few remaining i/o for other, more advanced features.

The mapper I'm working on, which attempts to cater to some of these desires, has the following limitations/pinout, assuming the desired device will be the XC95[72/36]XL with its 34 i/o:
*128KB PRG-RAM with 8KB banking ($6000-$FFFF): (RAM A13-A16, CPU A13, A14, PRG /CE, PRG R/W, M2, RAM /CE) = 10 i/o
*32KB CHR-RAM with 4KB banking: (RAM A12-A14, PPU A12) = 4 i/o
*H/V selectable mirroring: (PPU A10, A11, CIRAM A10) = 3 i/o
*Byte-wide reads (bit-banged writes) SPI flash interface ($5000): (full data bus, CPU A12, clk, cs, miso) = 12 i/o

TOTAL i/o: 29
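
The tally above is easy to re-run when a feature is added or dropped. Here is a small C sketch of that bookkeeping. The pin counts are copied from the list; the `IoCost` type and the function names are just illustrative.

```c
#include <stddef.h>

/* One line of the i/o budget: a feature and the pins it consumes. */
typedef struct {
    const char *feature;
    int pins;
} IoCost;

/* Pin counts copied from the list in the post. */
const IoCost kMapperBudget[] = {
    { "128KB PRG-RAM, 8KB banking", 10 },
    { "32KB CHR-RAM, 4KB banking",   4 },
    { "H/V selectable mirroring",    3 },
    { "SPI flash interface",        12 },
};
const size_t kMapperBudgetCount =
    sizeof kMapperBudget / sizeof kMapperBudget[0];

/* Sum the pins used by every feature in the budget. */
int io_total(const IoCost *items, size_t count)
{
    int total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += items[i].pins;
    }
    return total;
}

/* Pins left over on a device with the given number of i/o. */
int io_remaining(int available, const IoCost *items, size_t count)
{
    return available - io_total(items, count);
}
```

With 34 i/o available, `io_remaining(34, kMapperBudget, kMapperBudgetCount)` gives the spare pins left for everything else.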

That assumes that the above are all 'requirements'; however, they don't necessarily need to be. Cutting PRG-RAM or CHR-RAM down in size might be an easy sacrifice, depending on developer desires. I will say those are the sizes of the memories that would be on the PCB I'm working on, so you'd be 'tossing out' extra RAM space just to gain the i/o. Given the actual costs of these SRAM chips, smaller RAM sizes wouldn't save any money.

That leaves 5 i/o to work with, and some things that still need to be handled which may or may not cost i/o:

*Bootrom/mcu: (/CE) = possibly 1 i/o. This could be saved if using a mcu as a non-random accessible 'boot-rom' which dual purposed the spi /cs pin. The mcu and CPLD could be smart enough to know when the 'boot-rom' or spi-flash was being selected. Otherwise this may cost another i/o

*SPI MOSI = possibly 1 i/o: This function could be met by simply connecting a single CPU data line to MOSI for bit-by-bit writes to SPI. However, most SPI flash is 3V, so level shifting is needed. The CPLD can perform this for the cost of 1 i/o; otherwise a voltage divider should be able to do the level shifting and save that i/o.

*mapper registers: Need to decode some registers. I had planned on connecting CPU A0 to add some flexibility to register mapping ($5000/1, $8000/1, etc.). Due to the PRG /CE delay relative to M2, you need to resolve the overlap of $5000-7FFF with $D000-FFFF. I planned on not placing any mapper registers in $D000-FFFF and regaining some registers by decoding A0. Without A0 you've got 5 registers, which isn't quite enough for 3 PRG banks, 2 CHR banks, and 1 mirroring reg. You could add complexity/hardware to resolve the PRG /CE vs. M2 delay, but avoiding conflicts seemed the easiest, cheapest, most hassle-free solution to me. Either way, I think you'll end up losing another i/o, even if you used CPU A11 and kept the lower bits for state as you're proposing.

*potential cpu/scanline counter = at least 1 i/o (/IRQ): If there's enough logic and mapper regs available, a CPU cycle counter only costs the 1 i/o for /IRQ. A slightly less logic-consuming (and possibly more user-friendly) scanline counter should be possible with the addition of the CHR A13 and CHR /RD signals in order to sense the 4 consecutive NT fetches at the end of each scanline. This would cost the 2 extra i/o and may not fit inside a 36 Mcell CPLD due to logic consumption. When I get it all working, I'm going to try to dual-purpose the SPI shift register as a scanline count register. If lucky, it might fit and consume less logic than a 15-16 bit CPU cycle counter.



TL;DR:
So there you go, you've basically got 4-5 i/o to work with. There are 16 PRG-RAM banks, and being able to map any of them to $6000-7FFF for loading means you need a 4-bit-wide register. That would take A0-A3, consuming all remaining available i/o. So it comes down to this: if you were presented with the option of lower-address-bit state latching of bank registers, would you trade it for an IRQ counter? You might be able to regain it if you cut down to 16KB of CHR-RAM or 64KB of PRG-RAM. The PRG-RAM cut gains 2 i/o, and the CHR cut gains 1 i/o. So that could be the other trade-off. You could have your cake and eat it too with 2 CPLDs, but let's stick with the assumption of 1 for now. Which would you rather have?
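
The register width follows directly from the bank count: 128KB in 8KB banks is 16 banks, which needs 4 select bits. A small C helper (the name `bank_select_bits` is made up for illustration) makes it easy to check the alternatives:

```c
/* Number of select bits needed to address every bank when total_kb of
   memory is split into banks of bank_kb each. Both sizes are in KB and
   bank_kb must be non-zero. */
unsigned bank_select_bits(unsigned total_kb, unsigned bank_kb)
{
    unsigned banks = total_kb / bank_kb;
    unsigned bits = 0;

    /* Smallest bit count whose range covers all banks. */
    while ((1u << bits) < banks) {
        bits++;
    }
    return bits;
}
```

For example, `bank_select_bits(128, 8)` is 4, while cutting PRG-RAM to 64KB drops it to 3.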
If you're gonna play the Game Boy, you gotta learn to play it right. -Kenny Rogers
qbradq
Posts: 972
Joined: Wed Oct 15, 2008 11:50 am

Re: Mapper Features for High Level Languages

Post by qbradq »

I'm not sure which I'd prefer to be honest. I don't have enough experience with large high-level language projects for small systems to know at this point which would be the better of the two options. I'm very glad to see such a thorough cost / benefit breakdown as it applies to current low-cost CPLDs.

One thing that comes to mind is that I'd rather not have three bankable 8kb PRG segments. I'd rather have one bankable 16kb PRG segment and one fixed. The way I'm approaching executable segmentation right now that'd be a better fit, especially if 8KB of WRAM is available and the mass data storage to load from.

I do not have a good understanding of how all this stuff works from a hardware perspective, so I don't know if any of this is terribly feasible, but I am glad to see a bit of discussion about it.

Either way, single-write bank switching is good enough. There's not a huge performance gain in omitting the load, because if you are allowing more than one thread to bank switch you're going to have to keep track of the current bank somewhere anyway.

Tl;Dr - Thanks for the discussion. Don't let this influence working designs.
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Mapper Features for High Level Languages

Post by Bregalad »

I think you're confusing two things: how registers are mapped/accessed on a mapper, and the high/low-level language debate. Those are two completely different and independent things.

Accessing mapper registers is possible in C no matter how they are laid out, but in the end it will always come out as something weird/ugly, including some casts and the volatile keyword. I'd recommend writing the I/O portions of the game in assembly and the logic portion in an HLL (once a better 6502 compiler than CC65 is available, if one ever is).
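
For reference, the cast-and-volatile pattern looks roughly like the C sketch below. So that it can be compiled and tried on a PC, the macro indexes a fake 64KB array instead of casting a raw address; on a real target the macro would typically cast the numeric address itself. The register address `BANK_SELECT` and the function name are assumptions made up for the example.

```c
#include <stdint.h>

/* Host-side stand-in for the 6502 address space, so the sketch can run
   on a PC. This array is an assumption of the example; real code would
   access the hardware address directly. */
static uint8_t fake_bus[0x10000];

/* volatile keeps the compiler from removing or reordering the access. */
#define REG(addr) (*(volatile uint8_t *)&fake_bus[(addr)])

/* Hypothetical bank register location, for illustration only. */
#define BANK_SELECT 0x8000u

/* Conventional data-bus bank switch: the value travels on D0-D7. */
void select_bank(uint8_t bank)
{
    REG(BANK_SELECT) = bank;
}
```

With the address-line scheme, the equivalent write would be something like `REG(BANK_SELECT + bank) = 0;`, where the stored value is a don't-care.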
Drag
Posts: 1615
Joined: Mon Sep 27, 2004 2:57 pm

Re: Mapper Features for High Level Languages

Post by Drag »

I'm not familiar with what you're trying to do, so how is this beneficial to a high-level language? I'm not opposed to the idea, I'm just having trouble understanding it; especially how it's "thread safe". :P

As far as I'm aware, the only way to make "thread safe" bankswitching is to make the operation atomic. Your single-write bankswitch is perfect for that task, but the two-op variable-based bankswitching isn't any different from "LD? currBank; ST? bankSelect;" (where ? is A, X, or Y), which is hypothetically thread-safe, just as long as the intervening code properly saves and restores the CPU registers.

This is opposed to MMC3, which is harder to make "thread safe" because you have to write to two mapper registers, and it'd be up to the intervening code not only to save A, X, and Y, but also to somehow know how to save and restore the mapper's function-select register (if the intervening code touches it at all, that is).
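
The two-write hazard can be shown with a toy C model. This is a heavily simplified stand-in for MMC3: only the low three bits of the bank select register are modelled, and the `Mmc3` type and all function names are invented for the example.

```c
#include <stdint.h>

/* Simplified MMC3-style model: a select register chooses which of
   8 bank registers the next data write will update. */
typedef struct {
    uint8_t select;
    uint8_t bank[8];
} Mmc3;

void mmc3_write_select(Mmc3 *m, uint8_t value)
{
    m->select = (uint8_t)(value & 0x07u);
}

void mmc3_write_data(Mmc3 *m, uint8_t value)
{
    m->bank[m->select] = value;
}

/* An interrupt handler that switches bank register 0 and does not
   restore the select register afterwards. */
void irq_switch_chr(Mmc3 *m, uint8_t chr_bank)
{
    mmc3_write_select(m, 0);
    mmc3_write_data(m, chr_bank);
}

/* Main thread switches bank register 7. If interrupted is non-zero,
   the handler runs between the two writes. */
void main_switch_prg(Mmc3 *m, uint8_t prg_bank, int interrupted)
{
    mmc3_write_select(m, 7);
    if (interrupted) {
        irq_switch_chr(m, 0x20);
    }
    mmc3_write_data(m, prg_bank);
}
```

When the handler fires in the gap, the main thread's data write lands in register 0 instead of register 7. A single-write bankswitch has no such gap.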

Please help me understand what you're trying to do so I can see the benefits you're talking about. :P Using the lower A-lines to set mapper registers (instead of the D-lines) is a very new idea to me, and it sounds really interesting. It's also worth noting that I've only ever written NES software in assembly, so if the benefits are blindingly obvious to HLL users, they're lost on me. :P

Edit: Ok, one other advantage would be the ability to easily use more than 8-bits for the bank number. Then couple that with the one-op static bankswitch, and it's pretty nice. I don't understand the HLL or the "threading" though.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Mapper Features for High Level Languages

Post by tepples »

MMC3 is easy to make thread-safe: during the main thread, leave the bank select set to 7 (PRG ROM bank at CPU $A000-$BFFF). Switch bank 6 (PRG ROM bank at $C000-$DFFF) only in the DPCM playback routine, and switch banks 0-5 (CHR ROM) only in the NMI and scanline IRQ handlers.
qbradq
Posts: 972
Joined: Wed Oct 15, 2008 11:50 am

Re: Mapper Features for High Level Languages

Post by qbradq »

Really the main advantage I was thinking of is reducing the cycle count for calling a subroutine in a bank that may not already be mapped in. The more I look at the code generation, and the more I read this thread, the more I understand that this is not an operation that's going to be efficient.

I really appreciate the input everyone! This has given me some good food for thought on a project I'm working on.
infiniteneslives
Posts: 2104
Joined: Mon Apr 04, 2011 11:49 am
Location: WhereverIparkIt, USA

Re: Mapper Features for High Level Languages

Post by infiniteneslives »

qbradq wrote: One thing that comes to mind is that I'd rather not have three bankable 8kb PRG segments. I'd rather have one bankable 16kb PRG segment and one fixed. The way I'm approaching executable segmentation right now that'd be a better fit, especially if 8KB of WRAM is available and the mass data storage to load from.
That's good to hear. I've wondered if that would be acceptable or even preferable, considering you've really got RAM underneath and not ROM. That, and you've got around a dozen pages of RAM you can place at $6000-7FFF, which should make it all okay. One less PRG bank to maintain means more logic available to make a counter fit within 36 mcells. ;)
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Mapper Features for High Level Languages

Post by Bregalad »

qbradq wrote: The more I look at the code generation, and the more I read this thread, the more I understand that this is not an operation that's going to be efficient.
An efficient compiler for the 6502 is only a theoretical thing today. It's possible that an efficient one was made commercially in the late 80s or early 90s, but who knows?

In my experience, CC65 generates assembly code that is between 8 and 20* times longer/slower than hand-written assembly code.

* 20 was for an extreme case where I applied multiple strength reduction optimisations and a loop reversal optimisation, something that CC65 is completely unable to do.
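
For readers unfamiliar with the two optimisations, here is a generic C illustration. It is not the code from the experiment above, just a sketch of the techniques. The first function is the straightforward form. The second replaces the per-iteration multiply with a running value (strength reduction) and counts down to zero (loop reversal), which suits the 6502 because it has no multiply instruction and a decrement sets the zero flag for the loop branch.

```c
#include <stdint.h>

/* Straightforward version: counts up and multiplies on every pass. */
void fill_naive(uint8_t *dst, uint8_t count, uint8_t step)
{
    uint8_t i;

    for (i = 0; i < count; i++) {
        dst[i] = (uint8_t)(i * step);
    }
}

/* Strength-reduced and reversed: a running value replaces the
   multiply, and the loop counts down so it ends on a zero test.
   Results wrap modulo 256, the same as the naive version. */
void fill_optimised(uint8_t *dst, uint8_t count, uint8_t step)
{
    uint8_t i = count;
    uint8_t value = (uint8_t)(count * step);  /* one multiply, outside the loop */

    while (i != 0) {
        --i;
        value = (uint8_t)(value - step);
        dst[i] = value;
    }
}
```

Both functions fill `dst` with the same bytes; only the shape of the loop differs.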
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Re: Mapper Features for High Level Languages

Post by thefox »

I don't think C is expressive enough to really generate efficient code for 6502. By expressiveness I mean the ability to hint the compiler about how to generate optimal code. I would much rather see a compiler for some kind of a "small C" variant.
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Mapper Features for High Level Languages

Post by Bregalad »

True, but it is still possible to do quite a lot of optimisations and "guesses" automatically if some thought and testing has been put in.

And what is the difference between C and "small C" exactly?
qbradq
Posts: 972
Joined: Wed Oct 15, 2008 11:50 am

Re: Mapper Features for High Level Languages

Post by qbradq »

Starting new thread in NES Dev forum about Small C.
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Re: Mapper Features for High Level Languages

Post by thefox »

I want to clarify that when I said "small C" I didn't mean Small-C specifically, but any variation/subset of C in general.
qbradq
Posts: 972
Joined: Wed Oct 15, 2008 11:50 am

Re: Mapper Features for High Level Languages

Post by qbradq »

Thanks for clearing that up. From an implementation perspective, a "subset of C" could mean a lot of things, like "a superset of assembly". But I'll get more into that this evening in the Small C post.