Was there an NES expansion chip like SA-1

lidnariq
Posts: 10273
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Post by lidnariq » Tue Feb 09, 2021 5:23 pm

stan423321 wrote:
Tue Feb 09, 2021 4:57 pm
The memory speed facts are making me doubt whether it is a thing that would actually happen, but... suppose we have 16-bit ROMs and RAMs instead of 8-bit ones. What can we do with those other than trivial odd/even page selection?
I've had a bunch of ideas for ways to dynamically generate new pattern data, but they're only really useful when you're limited on the amount of ROM you can access. (In practice, that can only be because larger memories are too expensive).

For example:
You connect the output of the ROM to a barrel shifter, allowing tiles to shift left-right within the tile.
... or any other function mapping some data from the ROM to the result, such as left-right reflection.
You can change how the row-of-tile lines from the PPU connect to the ROM, allowing up-down reflection or shifting things up and down within the tile.
You can change how the bitplane selector is mapped, allowing palette swaps within a tile (simplest is swapping colors #1 and #2)
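As a rough illustration of the transforms listed above, here is a minimal Python model of a mapper sitting between the PPU and CHR-ROM. It assumes the standard pattern table layout (address bits 0-2 select the row within the tile, bit 3 selects the bitplane); the function and parameter names (`mapped_fetch`, `hflip`, `vflip`, `swap_planes`, `shift`) are made up for this sketch and don't describe any existing mapper.

```python
def reverse_bits(value):
    """Mirror an 8-pixel row: bit 7 (leftmost pixel) swaps with bit 0."""
    return int(f"{value & 0xFF:08b}"[::-1], 2)


def rotate_left(value, count):
    """8-bit barrel shifter: rotate the row left by `count` pixels."""
    count &= 7
    return ((value << count) | (value >> (8 - count))) & 0xFF


def mapped_fetch(chr_rom, ppu_addr, hflip=False, vflip=False,
                 swap_planes=False, shift=0):
    """Model one PPU pattern fetch passing through the hypothetical mapper."""
    addr = ppu_addr & 0x1FFF
    if vflip:
        addr ^= 0x07   # invert the row-of-tile lines: row r becomes row 7-r
    if swap_planes:
        addr ^= 0x08   # swap bitplanes: colors #1 and #2 trade places
    data = chr_rom[addr]
    if hflip:
        data = reverse_bits(data)   # left-right reflection of the row
    return rotate_left(data, shift)
```

For instance, `mapped_fetch(chr_rom, addr, hflip=True)` returns the mirrored row without a second copy of the tile being stored in ROM.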
With 16 bits instead of 8, we can encode all sorts of crazy stuff to happen when a byte is read.
You might find the KimKlone an interesting read.
Column offsets. This would look really cool, but would also involve recreating most of PPU's addressing logic on the cartridge. Note that simple column offsets would fit fine in attribute space, if you're not using it, because of EXRAM or something. Also note that with addressing logic recreated, a split assist function would be easy to implement, fixing the two prefetched tiles and reducing h-blank write needs to horizontal scroll.
Vertical offset-per-tile (SNES) / "each 2cell vscroll" (Genesis) would fit fine in something like MMC5. It's basically identical to MMC5's left-and-right split screen, but 33 times instead of just twice.

Memblers
Site Admin
Posts: 3899
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis

Post by Memblers » Wed Feb 10, 2021 1:19 am

I'm loving this thread, good stuff.

Regarding OAM DMA, if your mapper is watching the bus and is fully in control of CHR-RAM (big IFs, I know), there's no need to hijack the CPU bus. Other than for palette setup, you'll not need to touch $2006/$2007 when you have a VRAM port in the mapper. Do the DMA during vblank, simply snag the data off the bus, before doing the actual OAM DMA. But when one has this advanced of a mapper, there might be better ways to move stuff into that VRAM, anyways.

I've always seen people fawning over the idea of mapping CHR-RAM into the CPU space. It sounds nice, but the loop with indexed writes into that can be really slow. One would be better off with an auto-incrementing port into VRAM, similar to $2006/$2007. And you'd likely be using it only during vblank to avoid screen tearing.. this sounds like a very familiar setup, doesn't it? haha.

The extended tile addressing for nametables is cool when combined with 8x8 attribute tables, but if one is using 16x16 attributes I also really like the idea of automatic metatiles. You can get 256 16x16 metatiles, but the nametable entries are still only 8 bits wide. If that matters.
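One way the automatic metatile idea could work, sketched in Python under my own assumptions: all four 8x8 cells of a 16x16 block hold the same 8-bit metatile number, and the mapper uses the low bits of coarse X and coarse Y from the nametable fetch address to pick one of four consecutive tiles. The name `metatile_chr_address` and the "metatile * 4 + quadrant" layout are hypothetical.

```python
def metatile_chr_address(nametable_addr, metatile, plane, fine_y):
    """Return the CHR address for one pattern fetch of a 16x16 metatile.

    nametable_addr: PPU address of the nametable byte that was just fetched
                    (bits 0-4 = coarse X, bits 5-9 = coarse Y).
    metatile:       the 8-bit value read from the nametable.
    plane, fine_y:  bitplane (0-1) and row within the tile (0-7).
    """
    coarse_x = nametable_addr & 0x1F
    coarse_y = (nametable_addr >> 5) & 0x1F
    # Quadrant within the 2x2 block: 0 = top-left ... 3 = bottom-right.
    quadrant = ((coarse_y & 1) << 1) | (coarse_x & 1)
    # Assumed layout: the four tiles of a metatile are stored back to back,
    # giving a 10-bit extended tile number (256 metatiles * 4 tiles).
    extended_tile = (metatile << 2) | quadrant
    return (extended_tile << 4) | ((plane & 1) << 3) | (fine_y & 7)
```

With this layout 256 metatiles address 1024 tiles (16 KB of CHR) while each nametable entry stays one byte wide.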

6 years ago I had some fun writing up specs for what's basically the cartridge equivalent of a battleship. The Squeedo Jr. I posted about previously is a massively cost-reduced version of this:
https://docs.google.com/document/d/1U33 ... sp=sharing
That design didn't happen for multiple reasons, massive R&D cost being an obvious one. I was going to license some IP for features that (surprisingly) made it almost a viable product for end users, but the licensor backed out. In short, I learned why you sometimes need things like NDAs and more than verbal agreements, though it's probably all for the better in this case. Thankfully that happened before I put much more time and money into the thing. If anyone is interested in audio examples, at least that part exists: recordings made during development of that stuff are here.

stan423321
Posts: 35
Joined: Wed Sep 09, 2020 3:08 am

Post by stan423321 » Wed Feb 10, 2021 5:28 am

lidnariq wrote:
Tue Feb 09, 2021 5:23 pm
I've had a bunch of ideas for ways to dynamically generate new pattern data, but they're only really useful when you're limited on the amount of ROM you can access. (In practice, that can only be because larger memories are too expensive).

For example:
You connect the output of the ROM to a barrel shifter, allowing tiles to shift left-right within the tile.
... or any other function mapping some data from the ROM to the result, such as left-right reflection.
You can change how the row-of-tile lines from the PPU connect to the ROM, allowing up-down reflection or shifting things up and down within the tile.
You can change how the bitplane selector is mapped, allowing palette swaps within a tile (simplest is swapping colors #1 and #2)
Ah yes, flipping background tiles would work well. Not sure what would be the use of in-tile shifts for BG tiles, though, color manipulation would at least help giving four players separate colors.
lidnariq wrote:
Tue Feb 09, 2021 5:23 pm
With 16 bits instead of 8, we can encode all sorts of crazy stuff to happen when a byte is read.
You might find the KimKlone an interesting read.
I do, thank you! That design seems to have an opcode fetch line, single source of bus traffic, and empty address space to work with, but it's very much the kind of support circuitry I was imagining. I'll have to take a deeper read later.
lidnariq wrote:
Tue Feb 09, 2021 5:23 pm
Vertical offset-per-tile (SNES) / "each 2cell vscroll" (Genesis) would fit fine in something like MMC5. It's basically identical to MMC5's left-and-right split screen, but 33 times instead of just twice.
This may very well be why I stumbled upon this, now that I think about it.
Memblers wrote:
Wed Feb 10, 2021 1:19 am
Regarding OAM DMA, if your mapper is watching the bus and is fully in control of CHR-RAM (big IFs, I know), there's no need to hijack the CPU bus. Other than for palette setup, you'll not need to touch $2006/$2007 when you have a VRAM port in the mapper. Do the DMA during vblank, simply snag the data off the bus, before doing the actual OAM DMA. But when one has this advanced of a mapper, there might be better ways to move stuff into that VRAM, anyways.
Yeah. My idea with the bus hijack involved actual game logic running on cart (hence SA-1 in thread title calling out to me), while 2A03 would focus on regular writes to $4011 (DMCLOAD?) to compensate for lack of audio pins. That's in 2A03, so I didn't even consider hijacking this one. The complication is that OAM DMA takes 513+ cycles, so a manual loop would be required to pepper DMCLOAD writes in. With hypothetical mapper support for generating the 6 cycle byte write stream, there's enough time to overwrite OAM and palette during the vblank while writing to DMCLOAD every 36 cycles, maybe every 30 cycles if you do something clever. The point of bus hijack would be to do that faster.
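A quick back-of-the-envelope check of that budget, as a Python sketch. The assumptions are mine: NTSC vblank is 20 scanlines of 341 PPU dots at 3 dots per CPU cycle, the payload is 256 OAM bytes plus 32 palette bytes, every payload byte costs 6 cycles, and each $4011 write costs 6 cycles (LDA #imm plus STA abs). The helper name `cycles_needed` is hypothetical.

```python
# About 2273 CPU cycles of vblank on NTSC (20 lines * 341 dots / 3).
NTSC_VBLANK_CYCLES = 20 * 341 // 3


def cycles_needed(payload_bytes, cycles_per_byte=6,
                  dmc_period=36, dmc_write_cost=6):
    """CPU cycles to push the payload while writing $4011 once per period."""
    # Cycles left for data in each period after the $4011 write is paid for.
    bytes_per_period = (dmc_period - dmc_write_cost) // cycles_per_byte
    periods = -(-payload_bytes // bytes_per_period)  # ceiling division
    return periods * dmc_period


PAYLOAD = 256 + 32  # OAM plus palette
print(cycles_needed(PAYLOAD), "of", NTSC_VBLANK_CYCLES)                 # 36-cycle period
print(cycles_needed(PAYLOAD, dmc_period=30), "of", NTSC_VBLANK_CYCLES)  # 30-cycle period
```

Under these assumptions both the 36-cycle and the 30-cycle schedule fit inside one vblank, though the 30-cycle one leaves little slack.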
Memblers wrote:
Wed Feb 10, 2021 1:19 am
I've always seen people fawning over the idea of mapping CHR-RAM into the CPU space. It sounds nice, but the loop with indexed writes into that can be really slow. One would be better off with an auto-incrementing port into VRAM, similar to $2006/$2007. And you'd likely be using it only during vblank to avoid screen tearing.. this sounds like a very familiar setup, doesn't it? haha.
The part about addressing, that's fair. But in a scrolling game updating the outside edges during rendering should work just fine, I think?
Memblers wrote:
Wed Feb 10, 2021 1:19 am
6 years ago I had some fun writing up specs for what's basically the cartridge equivalent of a battleship. The Squeedo Jr. I posted about previously is a massively cost-reduced version of this:
https://docs.google.com/document/d/1U33 ... sp=sharing
That design didn't happen for multiple reasons, massive R&D cost being an obvious one. I was going to license some IP for features that (surprisingly) made it almost a viable product for end users, but the licensor backed out. In short, I learned why you sometimes need things like NDAs and more than verbal agreements, though it's probably all for the better in this case. Thankfully that happened before I put much more time and money into the thing. If anyone is interested in audio examples, at least that part exists: recordings made during development of that stuff are here.
Huh. Well, gotta check out those as well. I did not expect USB in mapper context.

lidnariq
Posts: 10273
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Post by lidnariq » Wed Feb 10, 2021 10:55 am

stan423321 wrote:
Wed Feb 10, 2021 5:28 am
Ah yes, flipping background tiles would work well. Not sure what would be the use of in-tile shifts for BG tiles,
My original idea was to group sets of tiles (e.g. one 1KB bank, 64 tiles) into a rectangle and be able to apply scrolling within that rectangle, for parallax effects.

The downside is that you have to bake your parallax plane into the CHR as an uncompressed 2bpp picture, as opposed to keeping a real second nametable and having the cart hardware dynamically compose the parallax plane into the cut-outs on the real nametable.

stan423321
Posts: 35
Joined: Wed Sep 09, 2020 3:08 am

Post by stan423321 » Thu Feb 11, 2021 1:42 am

Ah, okay, shifting with wider input definitely makes sense.

I have stumbled upon a few minor observations in the meantime.
  • While previously I considered storing column offset data in the attribute table space when in any mode that would generate palette indices otherwise, leaving it for more tiles may be a more advantageous setup for the regular case. The mapper could assist in misleading the PPU to read the data as tile numbers without a need for an IRQ, effectively constructing e.g. a 64x32 nametable. Though if that's all you want, an IRQ may be cheaper.
  • Many advanced mappers have mapping modes with various bank sizes, but an overpowered mapper could hypothetically do just with the smallest bank size and have commands to switch many banks at once instead. Whether this is an improvement, I honestly can't tell. Sounds like a flexible combination, but could cause bugs.
  • I did not realize mid-screen forced blank OAM DMA was a doable thing, which renders the DMC pump setup not just expensive, but also possibly unwanted. It would probably need an off switch, at the very least. I can't even start imagining what computations would be needed to get acceptable sound approximation during DMA, so... one or the other?
  • A really overpowered mapper, but one still conceivable in the past, could overlap multiple tiles, which when combined with shifting would allow a huge set of limited sprite-like objects: single palette, no major overlapping between themselves, and at least 8x8 in size so that each gets a corner between tiles to itself.

Oziphantom
Posts: 1080
Joined: Tue Feb 07, 2017 2:03 am

Post by Oziphantom » Thu Feb 11, 2021 9:07 am

Memblers wrote:
Wed Feb 10, 2021 1:19 am
I've always seen people fawning over the idea of mapping CHR-RAM into the CPU space. It sounds nice, but the loop with indexed writes into that can be really slow. One would be better off with an auto-incrementing port into VRAM, similar to $2006/$2007. And you'd likely be using it only during vblank to avoid screen tearing.. this sounds like a very familiar setup, doesn't it? haha.
Nah, direct access wins over port access every day. Sure, having a port for linear speed is nice. But when you want to make a vertical line, direct is better. When you want to animate parts of the map on the screen (something NES games particularly fail to do), direct is best. When you want to do a parallax scroll effect with tile shifts, direct is best. But as you point out, it needs to be paired with an IRQ raster counter so you can correctly race or chase the beam, which a sensible, well-put-together device would have.
The port works because you spend time and RAM setting up as many rows as possible and then blast them through during vblank. So while the copy to VRAM is faster in some cases, you also have to factor in the time to set up the pop slide and copy the data you want into it, rather than "just write to the destination".

lidnariq
Posts: 10273
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Post by lidnariq » Thu Feb 11, 2021 1:34 pm

Another idea I'd floated in the past was having half of CHR address space contain 64 32-byte banks, so that instead of specifying sprite tiles at the "upload to OAM" stage, you'd store unique numbers in the OAM tile number fields and let the bankswitching hardware choose among all the sprite tiles.
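A minimal Python sketch of that translation, assuming a 2 KB window made of 64 slots of 32 bytes each (one 8x16 sprite, i.e. two 16-byte tiles, per slot) and one bank register per slot. The class name `SpriteBankMapper` and the register layout are invented for illustration.

```python
class SpriteBankMapper:
    """Hypothetical mapper with one 32-byte CHR bank register per sprite slot."""

    SLOT_SIZE = 32    # bytes per slot: two 16-byte tiles
    SLOT_COUNT = 64   # one slot per OAM entry

    def __init__(self):
        # banks[slot] selects which 32-byte chunk of CHR-ROM the slot shows.
        self.banks = [0] * self.SLOT_COUNT

    def translate(self, ppu_addr):
        """Map a PPU pattern address inside the window to a CHR-ROM address."""
        offset = ppu_addr & (self.SLOT_SIZE * self.SLOT_COUNT - 1)
        slot = offset // self.SLOT_SIZE
        return self.banks[slot] * self.SLOT_SIZE + (offset % self.SLOT_SIZE)
```

OAM would then always hold fixed tile numbers (slot 0 for sprite 0, slot 1 for sprite 1, and so on), and animating a sprite means rewriting one bank register rather than its OAM tile byte.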

(edit) Similarly, you could address sprite pop-on on the left by shifting the graphic data when it's fetched by the PPU (tell PPU X=0, tell mapper to shift things n pixels to the left, filling with transparency). You could also partially address sprite pop-on on the top in a similar manner, except that you might run into overdraw problems ... and it'd be irrelevant for NTSC systems anyway.
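The left-edge case boils down to a plain shift on each bitplane byte, since the most significant bit is the leftmost pixel and color 0 is transparent. A small Python sketch, with `clip_left` as a made-up name:

```python
def clip_left(plane_byte, hidden_pixels):
    """Slide a sprite row left by `hidden_pixels`, filling with transparency.

    The PPU is told X=0; the mapper drops the leftmost pixels of the row and
    shifts the rest into their place. Zero bits in both bitplanes decode as
    color 0, which is transparent for sprites.
    """
    hidden_pixels = max(0, min(8, hidden_pixels))
    return (plane_byte << hidden_pixels) & 0xFF
```

The same shift has to be applied to both bitplanes of the row, otherwise the colors of the remaining pixels change.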

But this is another improvement that requires being careful around not causing tearing.

stan423321
Posts: 35
Joined: Wed Sep 09, 2020 3:08 am

Post by stan423321 » Thu Feb 11, 2021 2:19 pm

Right. I kind of considered this as a part of the "sprite insanity" package deal. A neat part of mapper doing some sort of live conversion at OAM DMA is that it would provide a perfect synchronization point for updating those bank offsets. I guess cutting off the left part just seemed like a funnier possibility to highlight, both because the players are unlikely to notice on authentic enough displays and because it is something that would be stupidly complex to implement without hardware support.

And while 64 is enough, it could be interesting to see if games with tall sprites tend to duplicate subpatterns.

Returning to cut-offs for a second, I wonder if there'd be a gameplay use for something along the lines of a simplified SNES window, that is, forcing the background color on the background outside of a given horizontal range. It sounds like it would be mostly useful for level openings and such, but maybe I'm missing something.

aquasnake
Posts: 207
Joined: Fri Sep 13, 2019 11:22 pm

Post by aquasnake » Tue Feb 16, 2021 8:27 am

domgetter wrote:
Sun Feb 07, 2021 6:04 pm
lidnariq wrote:
Sun Feb 07, 2021 12:37 pm
aquishix wrote:
Sun Feb 07, 2021 12:15 pm
MicroBankswitcher - 2 byte bank in PRG.
What's the intended use of this?
The only current use in our project is for using programmer-defined lookup tables. Say, for example, you've generated a table of log for all fixed point values from 00000000.00000000 to 11111111.11111111. The 2-byte microbankswitcher would allow you to supply the input 2 byte fixed point value, then immediately read the 2 byte result. It will save having to figure out which 8k bank to switch to and what offset to use to grab the bytes.

I'm sure people will figure out other uses for it, though.
The extended attribute mode of the MMC5 provides independent 4KB bank switching for each tile; that is, in single-screen mode, each nametable tile can be fetched from a different 4KB pattern table. The tile itself is 16 bytes.

I guess 2-byte bankswitching goes a step further: independent bank switching for each line (8 pixels) of each tile, where the 8 pixels of one line occupy 2 bytes in the pattern table. If the 6 bits of the 4KB bank register form the upper bits, the 8 bits of the tile index (0-255) in the pattern table form the middle bits, and the 3 bits of the line within the tile (0-7) form the lower bits, then 2-byte fine bankswitching is realized. Each line of the 32x30 background tiles on the screen can be switched every 8 PPU cycles!

For example, a line-by-line distortion effect similar to flickering flames requires generating repeated tiles by bank switching; 2-byte switching makes it easier to keep the total CHR-ROM usage under control.
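The address composition described above can be written out as a short Python sketch. One thing added here that the post doesn't spell out: the bitplane select bit is placed between the tile index and the line number, as in the standard pattern table layout, so the two bytes of a line sit 8 bytes apart. The function name `fine_bank_address` is hypothetical.

```python
def fine_bank_address(bank, tile, plane, row):
    """Compose a CHR-ROM address from per-line bankswitch fields.

    bank:  6-bit 4KB bank number (upper bits)
    tile:  8-bit tile index within the pattern table (middle bits)
    plane: bitplane select, 0 or 1 (assumed position, see note above)
    row:   3-bit line within the tile (lower bits)
    """
    return (((bank & 0x3F) << 12)
            | ((tile & 0xFF) << 4)
            | ((plane & 1) << 3)
            | (row & 0x07))
```

Six bank bits on top of a 4KB pattern table give an 18-bit address, i.e. 256 KB of CHR reachable per line fetch.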

aquasnake
Posts: 207
Joined: Fri Sep 13, 2019 11:22 pm

Post by aquasnake » Wed Feb 17, 2021 6:40 am

When monitoring ppu_data and generating a new ppu_data_out, the data obtained can be placed in reverse bit order, so tiles can also be mirrored. The savings in tile ROM are considerable. 30 years ago that was definitely a cost-effective technique, but now NOR flash is not a limiting factor. We can implement it directly with flat addressing: just pre-generate the static pattern tables for the material to be rendered.

For BG tiles, special effects such as mirroring, flipping and twisting can be realized, while zooming and rotation can't. This might be an idea that was put forward on the FC before the development of the SFC, and it must have been partially realized at that time.

stan423321
Posts: 35
Joined: Wed Sep 09, 2020 3:08 am

Post by stan423321 » Wed Feb 17, 2021 7:09 am

I mean, someone did the Pi cart after all, so if you don't consider the limitations of the time, things eventually get inconclusive.
