While preparing firmware v0.1.7c for release I stumbled upon the Cx4 again - Rockman died in attract mode once again. After some nights of fiddling it turns out my timing is ass, and the Cx4 can pull off some more stuff than what the two games do. Bit of a shame actually it wasn't used for more stuff.
Anyway this should cover most of the unknown stuff about MMIO registers, pins, internal registers, instruction timing, cartridge RAM access, DMA, and some oddball stuff.
I did not touch unknown instructions and flags so far.
I tried to organize my notes a bit but it's probably still a mess, so please ask if you are confused or need to know anything.
So without further ado, here are the notes. Also available at https://sd2snes.de/files/cx4_notes.txt
Cx4 notes by ikari_01 <firstname.lastname@example.org>
-> Version: 0.2
- add clarification on memory mapping
- point out that the CPU is halted until caching is complete
- cart bus only claimed on actual bus operations
- add register $7f48
- correct pin mapping (74 and 75 were swapped)
Version: 0.1 (initial)
These notes add some information about previously unknown/undocumented aspects
of the Capcom Cx4 custom chip. It is NOT a complete documentation of the Cx4
but adds bits of information missing in existing documentation.
They were compiled while working with the Cx4 and are a bit chaotic, please
ask if anything is unclear.
Pin 74: global memory output enable
Cx4 still exposes its MMIO and internal RAM etc. to the bus if this is
high, but not the ROM or RAM connected to it.
Probably for use with cart ROM/RAM connected "alongside" the Cx4 but
independent of it.
Pin 75: Map select (0=LoROM; 1=HiROM)
LoROM mapping is widely known. Cart RAM is mapped at 70-7f:0000-7fff.
HiROM mapping is a bit botched at least on the MMX2 PCB. SNES A15 becomes A20
to the ROM, all other address lines are shifted down by one to close the gap.
So the mapping S-CPU -> Cart ROM goes as follows:
C0:0000 => 0x000000
C0:8000 => 0x100000
C1:0000 => 0x008000
C1:8000 => 0x108000
C2:0000 => 0x010000
DF:7FFF => 0x0FFFFF
DF:FFFF => 0x1FFFFF
ROM content must be rearranged to match, or rewired.
As a plus, in HiROM mode it is possible to use 32Mbit of cart ROM in two
16Mbit chips (by leaving $7f52 at $01), from E0:0000 onward the second ROM
will be selected.
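As a rough illustration of the address shuffle described above, here is a small Python sketch. The function names (hirom_to_rom, rearrange_16mbit) are made up for this example. It only covers banks C0-FF with $7f52 = $01, and it assumes a "normal" linear HiROM image where C0:0000 corresponds to file offset 0.

```python
def hirom_to_rom(bank, addr):
    """Return (rom_chip, rom_offset) for an S-CPU address in C0-FF:0000-FFFF.

    Sketch only: assumes $7f52 = $01 (2x 16Mbit) and the MMX2-style wiring.
    """
    if not (0xC0 <= bank <= 0xFF and 0 <= addr <= 0xFFFF):
        raise ValueError("this sketch only covers C0-FF:0000-FFFF")
    a15 = (addr >> 15) & 1
    # SNES A15 becomes ROM A20; bank bits 0-4 (A16-A20) shift down to A15-A19.
    offset = (a15 << 20) | ((bank & 0x1F) << 15) | (addr & 0x7FFF)
    # From E0:0000 onward (bank bit 5 set) the second ROM is selected.
    chip = 2 if bank & 0x20 else 1
    return chip, offset


def rearrange_16mbit(image):
    """Reorder a linear 2 MiB HiROM image so it reads correctly through the Cx4."""
    if len(image) != 0x200000:
        raise ValueError("expected exactly 2 MiB (16Mbit)")
    out = bytearray(len(image))
    for linear in range(0, len(image), 0x8000):
        bank = 0xC0 + (linear >> 16)   # assumed linear layout: C0:0000 = offset 0
        addr = linear & 0xFFFF
        _, offset = hirom_to_rom(bank, addr)
        out[offset:offset + 0x8000] = image[linear:linear + 0x8000]
    return bytes(out)
```

For example, hirom_to_rom(0xC1, 0x8000) gives (1, 0x108000), matching the list above.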
In HiROM mode ROM is mapped as follows (assuming $7f52 = $01)
00-3F:8000-FFFF ROM1 0x100000-0x1FFFFF, ROM2 0x100000-0x1FFFFF
40-7D:0000-FFFF NOTHING (open bus with a bit of noise)
80-BF:8000-FFFF ROM1 0x100000-0x1FFFFF, ROM2 0x100000-0x1FFFFF
C0-FF:0000-7FFF ROM1 0x000000-0x0FFFFF, ROM2 0x000000-0x0FFFFF
C0-FF:8000-FFFF ROM1 0x100000-0x1FFFFF, ROM2 0x100000-0x1FFFFF
Cart RAM mapping:
HiROM: 30-3F:6000-7FFF, B0-BF:6000-7FFF
MMIO mapping:
LoROM: 00-3F:6000-7FFF, 80-BF:6000-7FFF
HiROM: 00-2F:6000-7FFF, 80-AF:6000-7FFF (to make room for Cart RAM)
DMA source and destination CAN reference the same bus but not the same chip,
e.g. cart ROM <-> cart RAM is allowed but cart RAM <-> cart RAM isn't!
Same-bus DMA takes WS1+WS2 extra waitstates per cycle.
DMA from/to internal RAM only takes WS1 or WS2 extra waitstates depending
on the referenced mapping area.
Neither DMA source nor destination may point to unmapped areas (-> lockup).
ROM is disallowed as a DMA destination (-> lockup).
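The DMA rules above can be condensed into a small checker. This is only a sketch of my reading of the notes: the region names and the function dma_extra_waitstates are invented for the example, and internal <-> internal transfers are rejected simply because the notes say nothing about them.

```python
ROM, CART_RAM, INTERNAL, UNMAPPED = "rom", "cart_ram", "internal", "unmapped"


def dma_extra_waitstates(src, dst, ws1, ws2):
    """Return extra waitstates per DMA cycle, or raise for forbidden combinations.

    src/dst are region kinds (hypothetical labels); ws1 = cart ROM waitstates,
    ws2 = cart RAM waitstates, as configured in $7f50.
    """
    if UNMAPPED in (src, dst):
        raise ValueError("lockup: DMA source or destination is unmapped")
    if dst == ROM:
        raise ValueError("lockup: ROM is not allowed as DMA destination")
    if src == CART_RAM and dst == CART_RAM:
        raise ValueError("not allowed: source and destination on the same chip")
    if src == INTERNAL and dst == INTERNAL:
        raise ValueError("internal <-> internal is not covered by the notes")
    if INTERNAL not in (src, dst):
        # Same-bus DMA (e.g. cart ROM -> cart RAM) pays for both accesses.
        return ws1 + ws2
    external = dst if src == INTERNAL else src
    return ws1 if external == ROM else ws2
```

With ws1 = 3 and ws2 = 2, a cart ROM -> cart RAM transfer costs 5 extra waitstates per cycle, while cart ROM -> internal RAM costs 3.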
CPU misc. (caching)
Cx4 has two program cache pages. They are the only way it can execute code,
the CPU CANNOT run directly from cart ROM/RAM.
The pages have tags to indicate what program page (from ROM) they currently
contain. The CPU will use these to determine whether a jump across page
boundaries requires re-buffering of one of the cache pages.
If execution reaches the end of cache page 0 and there is no STOP instruction,
page 1 will be buffered (according to contents of P register?) and execution
continues. On end of page 1, execution halts (implied STOP instruction) and
CPU goes idle.
During caching of a program page the CPU is halted until all bytes are copied.
For more details on caching see $7f4c.
For cartridge ROM/RAM access, there are two different configurable waitstate
counts (called WS1+WS2 here; see $7f50 below).
WS1 applies to cart ROM, WS2 applies to cart RAM.
(for sake of completeness in conjunction with $7f47, $7f40-$7f46 are listed
here)
$7f40: DMA source low byte
$7f41: DMA source high byte
$7f42: DMA source bank
$7f43: DMA length low byte
$7f44: DMA length high byte
$7f45: DMA destination low byte
$7f46: DMA destination high byte
$7f47: DMA destination bank (!)
ALSO: Trigger GPDMA (BUS<->internal map)
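To make the register layout above a bit more concrete, here is a sketch that turns a transfer description into the list of register writes. The helper name dma_register_writes is hypothetical, and the write order (with $7f47 last, since it triggers the transfer) is an assumption based on the listing above rather than something verified separately.

```python
def dma_register_writes(src, length, dst):
    """Return (register, value) pairs for a Cx4 DMA; the $7f47 write comes last.

    src and dst are 24-bit addresses, length is 16-bit. Sketch only.
    """
    if not (0 <= src <= 0xFFFFFF and 0 <= dst <= 0xFFFFFF):
        raise ValueError("addresses must fit in 24 bits")
    if not 0 <= length <= 0xFFFF:
        raise ValueError("length must fit in 16 bits")
    return [
        (0x7F40, src & 0xFF),            # DMA source low byte
        (0x7F41, (src >> 8) & 0xFF),     # DMA source high byte
        (0x7F42, (src >> 16) & 0xFF),    # DMA source bank
        (0x7F43, length & 0xFF),         # DMA length low byte
        (0x7F44, (length >> 8) & 0xFF),  # DMA length high byte
        (0x7F45, dst & 0xFF),            # DMA destination low byte
        (0x7F46, (dst >> 8) & 0xFF),     # DMA destination high byte
        (0x7F47, (dst >> 16) & 0xFF),    # DMA destination bank, also triggers
    ]
```

The caller would feed these pairs to whatever performs the actual S-CPU writes; the addresses used in any real transfer depend on the mapping in effect.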
$7f48: Trigger program page caching
0: Page select (0/1)
This preloads a cache page with bus data (cart ROM/RAM) pre-set by the
offset select ($7f49-$7f4b) and program page select ($7f4d-$7f4e)
registers. The appropriate number of waitstates for the designated
memory type applies.
$7f4c: 1: cache page 1 lock (1=locked)
0: cache page 0 lock (1=locked)
The cache page lock flags are used to prevent the CPU from buffering
to the corresponding cache page at runtime (e.g. when the pgm_page
register is prepared and a JMP/CALL P instruction is executed).
The cache pages can still be filled by writing to $7f48.
This is more or less a tuning mechanism for the developer who can decide
to keep certain code cached at all times.
Several constellations must be considered when code is executed in one
of the cache pages:
no pages locked:
If the other page already contains the program page required, the CPU
will just jump there. Otherwise the other page will be loaded with the
program page contents from ROM prior to jumping and its tag will be
updated.
either of the pages locked:
The other page cannot be used for buffering so unless it already
contains the desired program page the same page is used, overwriting
the code that is currently executed. If a RET occurs, the previous
program page is swapped back in. This requires reading 512 bytes from
cartridge ROM every time so it can get very slow.
both pages locked:
ONLY the program pages that have been pre-cached by writing $7f4d/e and
$7f48 are available to the CPU. If a different program page is requested
prior to execution either by $7f4d/e -> $7f4f, or by a JMP/CALL P
instruction at runtime, execution will stop immediately.
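The three lock constellations can be written down as a small decision function. This is just a model of the behaviour described above, not anything taken from the chip. The name resolve_page_jump is invented, and the case "current page locked, other page unlocked" is an assumption (the notes only spell out the opposite case).

```python
def resolve_page_jump(tags, locks, current, target):
    """Decide what happens on a jump to program page `target`.

    tags:    [tag0, tag1], program page currently held by each cache page
    locks:   [lock0, lock1], lock bits for each cache page
    current: cache page (0/1) the CPU is executing from
    Returns ("jump", page), ("load", page) or ("stop", None).
    """
    other = 1 - current
    # If one of the cache pages already holds the target, no buffering is needed.
    for page in (other, current):
        if tags[page] == target:
            return ("jump", page)
    if not locks[other]:
        # Normal case: buffer into the other page, then jump there.
        return ("load", other)
    if not locks[current]:
        # Other page locked: overwrite the page that is currently executing.
        return ("load", current)
    # Both pages locked and the target is not cached: execution stops.
    return ("stop", None)
```

For instance, with tags [5, 7], page 1 locked and execution in page 0, a jump to program page 9 returns ("load", 0), which is the slow case that reloads 512 bytes on every call and return.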
$7f50: 7: -
6-4: WS1 (ROM read waitstates) (0-7)
2-0: WS2 (cart RAM read/write waitstates) (0-7)
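A quick sketch of packing and unpacking the $7f50 bit fields, in case the layout is easier to read as code. The function names are made up, and leaving bits 7 and 3 at zero is an assumption since the notes do not assign them a function.

```python
def encode_7f50(ws1, ws2):
    """Pack WS1 (bits 6-4, ROM) and WS2 (bits 2-0, cart RAM) into one byte."""
    if not (0 <= ws1 <= 7 and 0 <= ws2 <= 7):
        raise ValueError("waitstate counts must be 0-7")
    return (ws1 << 4) | ws2   # bits 7 and 3 left at 0 (assumption)


def decode_7f50(value):
    """Return (ws1, ws2) from a $7f50 register value."""
    return (value >> 4) & 0x07, value & 0x07
```

encode_7f50(3, 3) yields $33, and decode_7f50 simply ignores bits 7 and 3.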
$7f51: 0: IRQ ack / inhibit
(write 1 to ACK and disable further IRQ
write 0 to enable IRQ)
$7f52: ROM configuration select
LoROM: 0: 2x 8Mbit (A21 switches between ROM /CE1 and /CE2)
1: 1x 16Mbit (maybe A22 switches but 40-7f/c0-ff are inactive)
HiROM: 0: 2x 8Mbit (A20 switches)
1: 2x 16Mbit (A21 switches)
$7f53: READ (mirrors: $7f54-$7f57, $7f59, $7f5b-$7f5f)
7: CPU is accessing ROM bus (SNES cut off)
6: CPU is running
1: IRQ pending
0: Cx4 suspended (see $7f55-$7f5d)
WRITE (no mirrors)
Any write access returns the Cx4 to idle state immediately - useful to
recover from infinite loops ;)
$7f55: Any write access indefinitely suspends the Cx4 (registers can be
read and written but the CPU shows no reaction: no buffering occurs,
no code is run and $7f53 is not updated).
Cx4 status bit 0 is set.
$7f56: Any write: Suspend Cx4 for 32 cycles ( 1.6µs @20MHz)
$7f57: Any write: Suspend Cx4 for 64 cycles ( 3.2µs @20MHz)
$7f58: Any write: Suspend Cx4 for 96 cycles ( 4.8µs @20MHz)
$7f59: Any write: Suspend Cx4 for 128 cycles ( 6.4µs @20MHz)
$7f5a: Any write: Suspend Cx4 for 160 cycles ( 8.0µs @20MHz)
$7f5b: Any write: Suspend Cx4 for 192 cycles ( 9.6µs @20MHz)
$7f5c: Any write: Suspend Cx4 for 224 cycles (11.2µs @20MHz)
These registers can be used to obtain guaranteed access to ROM/RAM from the
SNES side while the Cx4 is running. CPU and/or ongoing DMA transfers are
suspended for that duration.
$7f5d: Any write clears the Cx4 suspend flag and the chip becomes
responsive again (presumably resumes execution).
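The timed suspend registers $7f56-$7f5c follow a simple 32-cycle step, which the following sketch spells out. Function names are hypothetical; the 50ns cycle time is the 20MHz figure used throughout these notes.

```python
CYCLE_NS = 50  # one Cx4 cycle at 20MHz


def suspend_cycles(reg):
    """Cycles the Cx4 stays suspended after a write to $7f56-$7f5c."""
    if not 0x7F56 <= reg <= 0x7F5C:
        raise ValueError("only $7f56-$7f5c are timed suspend registers")
    return 32 * (reg - 0x7F55)


def suspend_register_for(min_cycles):
    """Smallest timed suspend register that covers at least min_cycles."""
    if not 1 <= min_cycles <= 224:
        raise ValueError("timed suspend covers 1-224 cycles")
    steps = -(-min_cycles // 32)   # ceiling division
    return 0x7F55 + steps
```

suspend_cycles(0x7F59) * CYCLE_NS gives 6400 (ns), i.e. the 6.4µs listed above. For anything longer than 224 cycles, $7f55 and $7f5d would have to be used as a manual suspend/resume pair.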
$7f5e: Any write clears the IRQ pending flag WITHOUT touching the actual
cart IRQ signal (remains low)
If IRQ is enabled, /IRQ goes high->low when the Cx4 CPU stops, and stays low.
Software must ACK by writing 1 to $7f51 bit 0 -> /IRQ will go high again.
Software must then write 0 again to re-enable IRQ triggering for the next
CPU stop.
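The IRQ handshake can be modelled in a few lines. This is a toy model of the behaviour described above and nothing more: the class and function names are invented, the power-on state (IRQ inhibited) is an assumption, and the pending flag in $7f53 is left out.

```python
class Cx4IrqModel:
    """Toy model of the /IRQ line as controlled through $7f51 bit 0."""

    def __init__(self):
        self.inhibit = True    # assumption: IRQ starts out inhibited
        self.irq_low = False   # True while /IRQ is asserted (low)

    def write_7f51(self, value):
        if value & 1:
            # Writing 1 acknowledges and disables further IRQs.
            self.inhibit = True
            self.irq_low = False
        else:
            # Writing 0 re-enables IRQ triggering.
            self.inhibit = False

    def cpu_stopped(self):
        # /IRQ goes high->low when the Cx4 CPU stops, if enabled.
        if not self.inhibit:
            self.irq_low = True


def ack_and_rearm(write):
    """IRQ handler sequence: ACK, then re-enable. `write(reg, value)` is supplied by the caller."""
    write(0x7F51, 0x01)
    write(0x7F51, 0x00)
```

In an actual handler only the two writes in ack_and_rearm matter; the model class is just there to show why both are needed.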
Internal registers:
$20: PC (PC of current instruction + 1)
$28: ??? (always seems to return $2e)
$2e: cart ROM bus port (triggers cart ROM reads),
to be used with $61 opcode!
Waitstates = $7f50 bits 6-4.
$2f: cart RAM bus port (triggers cart RAM reads/writes),
to be used with $61 / $e1 opcode!
Waitstates = $7f50 bits 2-0.
$70-$7f are mirrors of $60-$6f ("R0-R15").
Internal register address appears to be 7-bit, e.g. $e0-$ff are the same
as $60-$7f.
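If the two mirroring observations above hold, an emulator could normalize register numbers roughly like this. The function name is made up, and the 7-bit decoding is only what "appears" to happen on hardware, so treat this as a sketch.

```python
def canonical_register(addr):
    """Fold an 8-bit internal register number onto its base register."""
    reg = addr & 0x7F              # address appears to be 7-bit
    if 0x70 <= reg <= 0x7F:
        reg -= 0x10                # $70-$7f mirror $60-$6f (R0-R15)
    return reg
```

Under these assumptions $e0 folds to $60 (R0) and $ff folds to $6f (R15).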
1 cycle = 50ns (@20MHz) duh
The vast majority of instructions execute in a single cycle.
Exceptions/noteworthy details are:
- jmp/call takes 1 cycle if branch not taken, 3 cycles if taken
(regardless of p flag, crossing page boundaries comes at no extra cost)
- ret takes 3 cycles
- skip takes 1 cycle for itself, but it makes the skipped instruction count
for 1 cycle (injected NOP or equivalent)
- internal data rom/ram access takes 1 cycle only.
- cart ROM access from Cx4 code:
* Cartridge ROM is accessed by reading from register $2e to a special
internal register (fullsnes: ext_dta). (Opcode: $612e)
* The read itself is 0-waitstate and executes in a single cycle. However
the result will not be valid before the appropriate number of waitstates
is reached and the data is actually pulled in from the ROM.
* The CPU may execute other code in the meantime.
* To stall the CPU until the ROM read operation is complete, a wait
instruction can be issued ($1c00).
* The external bus address does not auto-increment; to do so, a special
instruction can be issued ($4000). There may be a decrement instruction
as well (as of yet unknown). It is useful to do this before the wait
instruction to save a cycle.
* The number of waitstates can be configured by setting $7f50 bits 6-4.
- cart RAM access from Cx4 code:
* Cartridge RAM is accessed by reading from register $2f to a special
internal register, or by writing to register $2f from the same.
(Opcode: $612f (read) / $e12f (write))
* Access handling appears to be the same as for cart ROM, and the
same for reading and writing:
issue read/write; alter bus address; wait for complete (or do
something else in the meantime)
* The number of waitstates can be configured by setting $7f50 bits 2-0.
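The bus access timing can be summed up in one formula, sketched here in Python. The function name is hypothetical. The rule that the wait instruction always costs at least one cycle is inferred from the WS = 0 and WS = 1 measurements in the scribble section further down, so it is an assumption rather than a documented fact.

```python
def bus_access_cycles(ws, filler=0):
    """Total cycles for: bus read/write ($612e etc.), `filler` single-cycle
    instructions (e.g. the $4000 address increment), then wait ($1c00).
    """
    if ws < 0 or filler < 0:
        raise ValueError("ws and filler must be non-negative")
    # Waitstates already covered by filler instructions do not have to be
    # spent in the wait instruction, but the wait itself takes >= 1 cycle.
    wait = max(1, ws - filler)
    return 1 + filler + wait
```

With WS = 4, both bus_access_cycles(4) and bus_access_cycles(4, filler=1) come out at 5 cycles, which is why doing the increment before the wait is effectively free.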
Cartridge bus is only claimed by the Cx4 when DMA or caching occurs, or a bus access
is carried out by Cx4 code. At all other times the SNES address+data buses are forwarded
to the ROM/RAM even if the Cx4 CPU is running.
Unknown instructions and flags: as of yet untouched. higan seems to do a decent job at them already.
scribble / internal notes from testing
This is probably useless but eh.
Cx4 code base: 01:8000
PC always 00
1 cycle = 50ns
255 NOP + 1 STOP in 12.8us -> 50ns/inst -> 1 cycle/inst
255 JMP + 1 STOP in 38.4us -> 150ns/inst -> 3 cycles/inst
255 MOV -> A/R0 + 1 STOP in 12.8 us -> 50ns/inst -> 1 cycle/inst
255 RDWR ROM/RAM + 1 STOP in 12.8 us -> 50ns/inst -> 1 cycle/inst
255 RDBUS (612E) -> CRASH
127 RDBUS+WAIT + 1 STOP in 32us -> 250ns/pair -> 5 cycles total (WS = 4)
127 RDBUS+WAIT + 1 STOP in 19.2us -> 150ns/pair -> 3 cycles total (WS = 2)
127 RDBUS+WAIT + 1 STOP in 12.8us -> 100ns/pair -> 2 cycles total (WS = 1)
127 RDBUS+WAIT + 1 STOP in 12.8us -> 100ns/pair -> 2 cycles total (WS = 0!)
127 RDBUS+INC + 1 STOP in 12.8us + CRASH -> 100ns/pair -> 2 cycles total (WS = 4!!!!!)
85 RDBUS+INC+WAIT + 1 STOP in 21.4us -> 250ns/triple -> 5 cycles total (WS = 4)
85 WRBUS?!+INC+WAIT + 1 STOP in 21.4us -> 250ns/triple -> 5 cycles total (WS = 4)
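The cycle counts above are simply the measured run time divided by the number of instruction slots and the 50ns cycle time. A sketch of that arithmetic (function name invented, rounding to the nearest whole cycle is my choice to absorb measurement error):

```python
def cycles_per_slot(total_us, slots, cycle_ns=50):
    """Derive cycles per instruction (or per pair/triple) from a measured run time."""
    total_cycles = total_us * 1000 / cycle_ns
    return round(total_cycles / slots)
```

For the NOP test this is cycles_per_slot(12.8, 256), for the RDBUS+WAIT pairs cycles_per_slot(32, 128), and so on.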
ID  test file                  cycles / notes
00  cx4_00_nop.bin             1
01  cx4_08_jmp.bin             3
02  cx4_64_mova.bin            1
03  cx4_e0_movr0.bin           1
04  cx4_70_rdrom.bin           1
05  cx4_68_rdram_r0.bin        1
06  cx4_6c_rdram_imm.bin       1
07  cx4_e8_wrram_r0.bin        1
08  cx4_ec_wrram_imm.bin       1
09  cx4_40_rdbus_wait.bin      ?!
0a  cx4_40_rdbus_nowait.bin    1
0b  cx4_24_skip_unknown.bin    1
0c  cx4_25_skip_nc.bin         1
0d  cx4_25_skip_c.bin          1
0e  cx4_28_call.bin            3
0f  cx4_81_add_shl1.bin        1
10  cx4_61_bustest.bin         - (1)
11  cx4_61_bustest_1c.bin      1+WS (1 cycle 612e; ->WS cycles 1c00)
12  cx4_61_bustest_40.bin      2+crash (1 cycle 612e; 1 cycle 4000)
13  cx4_61_bustest_401c.bin    2+(WS-1) (1 cycle 612e; 1 cycle 4000; ->WS cycles 1c00)
14  cx4_e0_bustest_401c.bin    1
15  cx4_e1_bustest_401c.bin    1
16  cx4_e12f_bustest_401c.bin  2+WS2-1 !!!!
17  cx4_612f_bustest_401c.bin  2+WS2-1 !!!!
18  cx4_e02f_bustest_401c.bin  3 (doesn't work!)
19  cx4_0a_jmp_p1.bin          3 (P test not applicable - see 1d+1e)
1a  cx4_0a_jmp_p2.bin          3 (P test not applicable - see 1d+1e)
1b  cx4_0c_jz.bin              1 (not taken), 3 (taken)
1c  cx4_10_jc.bin              1 (not taken), 3 (taken)
1d  cx4_3c_ret.bin             3
1e  cx4_2a_call_p.bin          1 (not taken), 3 (taken)
1f  cx4_25_skiptest_nc.bin     1 (not taken), 2 (taken)
20  cx4_25_skiptest_c.bin      1 (not taken), 2 (taken)
23  cx4_xx_infloop.bin         -
24  cx4_xx_busloop.bin         (for 0x80 flag testing)
25  cx4_xx_dumpflags.bin       (hopefully)
26  cx4_f0_xchg.bin            1
27  cx4_xx_opdump.bin          -
28  cx4_xx_flagloop.bin        -
29  cx4_xx_opdump2.bin         -
2a  cx4_xx_opdump3.bin         -