
All times are UTC - 7 hours



PostPosted: Fri Feb 24, 2012 1:02 pm 

Joined: Fri Feb 24, 2012 12:09 pm
Posts: 1007
Here is a new document with SNES hardware specs:
http://nocash.emubase.de/fullsnes.htm
http://nocash.emubase.de/fullsnes.txt
it should be the most complete SNES specs ever released (unless I've missed something important), covering both the console (based on Anomie's docs), and all existing add-ons, controllers, coprocessors (based on my own research & info found on various webpages; including the nesdev forum)... I hope the doc will be of some use.

Additions (or corrections) would be welcome! There are still hundreds of details marked "unknown"; many of them should be quite simple to solve. For example, in some cases it would just need somebody taking (high-resolution) PCB photos/scans, or somebody scribbling down signal pinouts, or somebody running some test programs on real hardware add-ons.
NB: if somebody wants to donate some SNES hardware for research purposes, I'd gladly waste my time on doing that part.
cu, Martin


PostPosted: Fri Feb 24, 2012 1:56 pm

Joined: Thu Oct 05, 2006 6:29 am
Posts: 917
Nice. Lots of good information gathered into a single easy-to-read document.


PostPosted: Sat Feb 25, 2012 3:42 am

Joined: Thu Oct 05, 2006 6:29 am
Posts: 917
Quote:
Color Math can be disabled setting 2130h.Bit6-7


doesn't quite add up with

Quote:
2130h - CGWSEL - Color Math Control Register A (W)
7-6 Force Main Screen Black (3=Always, 2=MathWindow, 1=NotMathWin, 0=Never)
5-4 Color Math Enable (0=Always, 1=MathWindow, 2=NotMathWin, 3=Never)


Shouldn't it be "disabled setting 2130h.Bit4-5"? That's how I would interpret that register description at least.
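
To make the bit positions concrete, here is a minimal sketch that decodes the two upper fields of a CGWSEL value exactly as the quoted register description lists them. The function names are made up for illustration and are not taken from the document.

```python
# Field meanings copied from the quoted description of 2130h (CGWSEL).
FORCE_BLACK = {0: "Never", 1: "NotMathWin", 2: "MathWindow", 3: "Always"}
MATH_ENABLE = {0: "Always", 1: "MathWindow", 2: "NotMathWin", 3: "Never"}

def decode_cgwsel(value):
    """Split a CGWSEL byte into its two upper 2-bit fields."""
    force_black = (value >> 6) & 3   # bits 7-6
    math_enable = (value >> 4) & 3   # bits 5-4
    return FORCE_BLACK[force_black], MATH_ENABLE[math_enable]

def color_math_disabled(value):
    """Color math is off everywhere only when bits 5-4 are both set."""
    return ((value >> 4) & 3) == 3
```

Going by that description, writing 30h disables color math, while writing C0h forces the main screen black instead.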


PostPosted: Sat Feb 25, 2012 8:22 am

Joined: Mon Jul 14, 2008 4:02 pm
Posts: 88
Great work, and thanks for furthering research on some of the more obscure devices!
Especially the Satellaview and Super Famicom Box sections are greatly appreciated!


PostPosted: Sat Feb 25, 2012 9:02 am

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1524
I'm most appreciative for the Japanese controller research, thank you!

I'd figured out that they were serial bit-bang interfaces and logged their data, but never bothered to really tear the games apart to figure out what the actual commands were. I guess it's a good thing I have experience with both synchronous and asynchronous UART simulation :)
But now the question is, how do you simulate a baseball bat, a golf club, a MIDI keyboard, a barcode scanner, an exercise bike, and a horse-race-betting modem with an emulator? ;)

I can help you fill in a lot of holes if you like. d4s, ikari and myself have figured out all of the SGB's ICD2 interface, for one example.


PostPosted: Sat Feb 25, 2012 4:50 pm

Joined: Fri Feb 24, 2012 12:09 pm
Posts: 1007
Quote:
Shouldn't it be "disabled setting 2130h.Bit4-5"?

Yes, thanks!
Oh, and the other unclear/dirty part in the color math section: My initial "guess" has been that the hardware would do the blending only if the main-screen is in front of sub-screen (or vice versa)... but now I think that it doesn't matter at all which pixel is in front... is that correct?

Quote:
Especially the Satellaview and Super Famicom Box sections are greatly appreciated!

Yipieh. For the SFC-Box, seeing a KROM1 dump would be great to solve most missing things, it's a (socketed) EPROM, so one could just unplug & dump it - if one does have the hardware at home.
For the Satellaview the soundlink/voicelink feature is still unclear to me: I guess it does inject the audio directly to the expansion port's audio input(?) (rather than transferring data to APU RAM by software).
And, as I understand (or guess), it isn't interactive. Ie. always playing the same sound/speech, no matter what you are doing in the game. Ie. no "If player wins then play SoundA else play SoundB".
The only "interactive" option seems to enable/disable the sound via I/O ports (so one could mute SoundA, and, after SoundA has ended, unmute it to play SoundB) (or vice versa to hear the other Sound, but doing that would be a fairly annoying solution).
One other feature I've missed is that there seems to be no status info saying what soundlink data (for which game) is transmitted... so if one plays the wrong game at wrong time, then one would hear sounds for a completely different game(?)
Well, I guess only people in Japan know such details. So it may be difficult to find info in english.

Quote:
But now the question is, how do you simulate a baseball bat, a golf club, a MIDI keyboard, a barcode scanner, an exercise bike, and a horse-race-betting modem with an emulator? ;)

Yeah, guess there are some ways, but difficult to implement, or uncomfortable to use (or both at once). I am not really planning to emulate them; for most of them I would be just happy if I knew how-the-hell they are working (hope somebody will research the missing info!). The piano might be most suitable to emulate, if one had the sound ROMs dumped.

Quote:
I can help you fill in a lot of holes if you like. d4s, ikari and myself have figured out all of the SGB's ICD2 interface, for one example.

Yes, gladly. Everything about everything you have!
The SGB section should be (I think) already more or less complete(?) I haven't looked into the "controller button sequences" - I guess some of them cannot be manually entered (at correct timing), and they'd work only with the "SGB Commander" joypad... with one "button-stroke" per frame (or per data-transmission; which'd be usually same as per frame)?


PostPosted: Sat Feb 25, 2012 4:54 pm

Joined: Thu Jul 29, 2010 2:24 pm
Posts: 58
I managed to see the documentation via Google Cache...

It's almost like my Satellaview RE work is reduced to nothing...

But i have to say, bravo, and seriously, thanks.
Really nice stuff...


PostPosted: Sat Feb 25, 2012 6:19 pm

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1524
Quote:
The SGB section should be (I think) already more or less complete(?)


Ah, you're right, that was my mistake, sorry. I noticed you didn't have the algorithm ikari_01 came up with for $6001 writes selecting the output for $7800. But it looks like it was my mistake for misunderstanding what $6000 returned. It looked like the pure LY counter from the Game Boy core, not two separate fields.

If d4s could link his SGB doc, there are some additional mirroring differences between the SGB1 and SGB2 ICD2 chips.

On a related note, you were right and that is the ST-0018 program ROM from command $f3. It is definitely 32-bit LE ARM architecture, it looks to be ARMv3. ROM is at $0000:0000+, registers are at $4000:0000+, and we think RAM is at $6000,$a000,$e000:0000+. Still looking into it. I don't have an ARM core handy, so I have to make one first.

The 32K data file from $f4 feels like random line noise so far. It certainly doesn't appear to be a data ROM, and it's too random for uninitialized memory on a cold boot.


PostPosted: Sat Feb 25, 2012 7:28 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21595
Location: NE Indiana, USA (NTSC)
byuu wrote:
But now the question is, how do you simulate a baseball bat, a golf club, a MIDI keyboard, a barcode scanner, an exercise bike, and a horse-race-betting modem with an emulator? ;)

How does an Android phone simulate a barcode scanner? How does Modplug Tracker simulate a MIDI keyboard?

byuu wrote:
On a related note, you were right and that is the ST-0018 program ROM from command $f3. It is definitely 32-bit LE ARM architecture

Hints of GBA before the GBA came out? Both no$gba and VisualBoyAdvance have an ARMv4, and DS emulators have an ARMv5 core.


PostPosted: Sun Feb 26, 2012 12:59 am

Joined: Fri Feb 24, 2012 12:09 pm
Posts: 1007
Great, after seeing the 32bit addressing, I hoped it might be ARM, or some other "standard" instruction set. Tepples: Rather nothing to do with GBA or Nintendo - ARM is used by many companies, by Seta, in this case.

The code for the A3 test command is at 00:EF02 (in the SNES ROM, not in the St018 ROM). call E873/E87E are send/recv byte. Ie. should work like so:
send(A3h), recv(E0h)
send(addr0), recv(anything)
send(addr1), recv(anything)
send(addr2), recv(anything)
send(addr3), recv(anything)
repeat 128 times: send(80h..01h), recv(data)
that is, addr0=offs.lsb (not bank.lsb) (I think I got that wrong in the doc).
The A3 command would be useful for finding memory mirrors and such things (but if the hardware happens to generate page faults on some addresses, then it might be difficult to use). Did you get it working?
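
The sequence above can be sketched as follows. This is only an illustration under assumptions: `exchange` is a hypothetical callback standing in for the send/recv byte routines, and the order of the three address bytes after offs.lsb (offs.msb, bank.lsb, bank.msb) is a guess, since only the first byte is stated.

```python
def read_block_a3(exchange, bank, offs):
    """Read 128 bytes with the A3 test command.

    exchange(tx) is a hypothetical callback: it sends one byte and
    returns the byte received in reply.
    """
    if exchange(0xA3) != 0xE0:
        raise RuntimeError("no E0h reply after A3h")
    # Assumed order: offs.lsb first (as stated), then offs.msb,
    # bank.lsb, bank.msb. Only the first byte is confirmed.
    addr_bytes = (offs & 0xFF, (offs >> 8) & 0xFF,
                  bank & 0xFF, (bank >> 8) & 0xFF)
    for b in addr_bytes:
        exchange(b)                  # reply is ignored ("anything")
    # The sent counter runs from 80h down to 01h; each step returns
    # one data byte.
    return bytes(exchange(count) for count in range(0x80, 0x00, -1))
```

Stepping the address by 128 between calls would then walk a whole region, which is what makes the command attractive for hunting mirrors.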

The 32Kbyte stuff dumped from A0000000 (command F4) should be RAM, like you said. Either it's containing a random pattern on power up (I think some chips do so), or the software has written those values in there.

Quote:
If d4s could link his SGB doc, there is some additional mirroring
differences between the SGB1 and SGB2 ICD2 chips.

What/where... the pdf doc, that mentions two ICD2-R versions (with/without company logo)? Oh, yes, that mentions some address lines ignored (I've spotted that doc some weeks ago, but missed that part).
The SGB2 should use a 3rd chip version, ICD2-N. And the two ICD2-R versions... I thought they were both used in SGB1 (?)


PostPosted: Sun Feb 26, 2012 4:15 am

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1524
In that case, allow me to share the hardest fought and strangest behaviors I've encountered. They're documented on my forums, so I don't know if you've seen them before or not. Some of these took me weeks to figure out, so hopefully it'll save you some trouble.

CPU interrupts: obviously, there's a two-stage pipeline, and the effect is that it looks as though interrupts are tested one cycle before the instruction ends. But there's a strange anomaly: if this test succeeds and an IRQ is pending, then for any opcode that is exactly two cycles, and the second cycle is an I/O cycle (eg nop, clc, inc, asl, xce, etc.), that second I/O cycle transforms into a bus read of the PC address (without incrementing PC, of course.) If you're in a slow memory area, this will make the instruction take two cycles longer to execute.
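
As a rough illustration of the timing consequence, here is a sketch that counts master clocks for such a two-cycle opcode. It assumes an I/O cycle costs 6 master clocks and reads "two cycles longer" as two master clocks (8 instead of 6 for the second cycle in slow memory); the function name is invented.

```python
IO_CYCLE = 6  # assumed length of an internal (I/O) cycle in master clocks

def two_cycle_opcode_clocks(mem_speed, irq_pending):
    """Master clocks for a 2-cycle opcode (nop, clc, ...) fetched from
    memory with the given access time (6 = fast, 8 = slow)."""
    fetch = mem_speed
    # With an IRQ pending, the I/O cycle becomes a read of [PC], so it
    # costs the memory access time instead of the I/O cycle time.
    second = mem_speed if irq_pending else IO_CYCLE
    return fetch + second
```

In fast memory the two cases come out the same, so the effect would only be visible from slow regions.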

CPU DMA synchronization: this is just psychopathic. The DMA runs on a 1/8 clock rate, so when DMA starts, the CPU has to wait N (where N > 0) cycles until it's aligned to the DMA clock. Then you have the DMA setup, then transfers, then it has to sync back to the CPU. That takes the last cycle's length (6, 8 or 12) and sleeps N cycles (where N > 0) until the total DMA time was an even multiple of that cycle's length (eg for 6; the total DMA time must be 6, 12, 18, 24 ...)
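
The two alignment waits can be sketched like this. It is an interpretation, not a verified model: it assumes the DMA clock is aligned to multiples of 8 on the master clock counter, and that "N > 0" means a full period is waited when the counter is already aligned.

```python
def dma_start_delay(master_clock):
    """Clocks to wait before DMA setup begins: 1..8, never 0."""
    return 8 - (master_clock % 8)

def dma_end_delay(dma_clocks, last_cpu_cycle):
    """Clocks to wait after the last transfer: 1..last_cpu_cycle.

    dma_clocks is the time consumed since DMA began; last_cpu_cycle is
    the length (6, 8 or 12) of the CPU cycle that preceded the DMA.
    """
    return last_cpu_cycle - (dma_clocks % last_cpu_cycle)
```

For example, 20 clocks of DMA after a 6-clock cycle would be padded by 4 to reach 24, and 24 clocks would be padded by a full 6 to reach 30.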

CPU ALU: I would highly suggest looking at my source for this one. These opcodes work over 8 or 16 CPU cycles to slowly compute the result. Reading them early returns partially computed values. The behavior becomes psychopathic if you try doing a DIV during an active MUL or vice versa.

HDMA MDR: so Speedy Gonzales tends to freeze in level 6-1. It turns out the game is bugged, and gets stuck in an infinite loop reading a non-existent register. The open bus value it reads will never satisfy the loop condition to exit. But it works on hardware. What happens is that HDMA continues to trigger, and eventually an HDMA will trigger right before that open bus read, and the HDMA read will update the MDR. This happens enough times, and eventually HDMA reads a value that can break out of the loop.
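
A toy model of that effect, with everything in it hypothetical (the addresses, the bit-7 exit condition, and one HDMA read per poll are all invented for illustration):

```python
class Bus:
    """Toy bus: unmapped reads return the last value seen (the MDR)."""
    def __init__(self, memory):
        self.memory = memory     # dict: address -> byte, mapped locations
        self.mdr = 0x00

    def read(self, addr):
        if addr in self.memory:
            self.mdr = self.memory[addr]   # a real read refreshes the MDR
        return self.mdr                    # open bus: stale MDR comes back

def poll_until_bit7(bus, addr, hdma_sources, max_polls):
    """Spin on addr until bit 7 is set; one HDMA read precedes each poll.
    Returns the number of polls needed, or None if the loop never exits."""
    for polls in range(1, max_polls + 1):
        bus.read(hdma_sources[(polls - 1) % len(hdma_sources)])
        if bus.read(addr) & 0x80:
            return polls
    return None
```

If the HDMA never puts a suitable value on the bus, the loop in this model never exits, which matches the freeze described above.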

HDMA ordering: the expected order of operations used to be:
for(channel 0 to 7): transfer, update_indirect_address
But this is wrong. It is actually:
for(channel 0 to 7): transfer
for(channel 0 to 7): update_indirect_address
Doing it the latter way ensures that all HDMA transfers happen inside Hblank, even for the theoretical worst-case HDMA of eight channels and eight indirect address fetches.
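
A small sketch of the corrected ordering, which just records the sequence of steps for a list of active channels (the function and step names are illustrative only):

```python
def run_hdma(active_channels):
    """Return the order of HDMA steps for one scanline."""
    steps = []
    for ch in active_channels:          # pass 1: all data transfers
        steps.append(("transfer", ch))
    for ch in active_channels:          # pass 2: all address updates
        steps.append(("update_indirect_address", ch))
    return steps
```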

HDMA early termination: this one's crazy. If the very last HDMA channel that terminates HDMA for the entire frame happens to perform an indirect memory address, it only performs a single fetch. So the indirect address becomes (low << 8), with the lower eight bits cleared.
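
In code form, the special case might look like the sketch below. `read_byte` is a hypothetical bus-read callback, and the flag name is invented.

```python
def fetch_indirect_address(read_byte, table_addr, last_terminating_channel):
    """Fetch a 16-bit indirect address from the HDMA table."""
    if last_terminating_channel:
        # Only one byte is fetched; it lands in the upper half and the
        # lower eight bits stay clear.
        return read_byte(table_addr) << 8
    low = read_byte(table_addr)
    high = read_byte(table_addr + 1)
    return (high << 8) | low
```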

PPU long dots: dots 322 and 326 last for 6 cycles instead of 4. Even more bizarre is that different models act differently. Sometimes it's dots 321 and 325; sometimes it's dots 323 and 327. Some systems it varies each time you test. This doesn't apply to the NTSC interlace field scanline 240 that is missing one dot. This also doesn't affect CPU IRQ timing, since the CPU has its own shadow H/V counter based off the PPU H/V signals being fed to it.
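
For reference, a sketch of how dot positions map to master clocks with the two long dots. It assumes a normal line has 340 dots and that dot 0 starts at clock 0 of the line; neither figure comes from this post.

```python
LONG_DOTS = (322, 326)   # per the post; some consoles shift these by one

def dot_length(dot):
    """Length of one dot in master clocks."""
    return 6 if dot in LONG_DOTS else 4

def dot_start_clock(dot):
    """Master clock (from start of line) at which a dot begins."""
    return sum(dot_length(d) for d in range(dot))

def line_length(dots=340):
    """Total master clocks for a line with the given number of dots."""
    return dot_start_clock(dots)
```

Under those assumptions a normal line comes out at 1364 master clocks.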

PPU VRAM writes: if you write to VRAM during the last possible cycle before writes are completely blocked and disabled, an odd thing happens: it writes the CPU MDR into VRAM instead of the actual bus value. Two cycles before that (since you can only step by 2), it writes normally.

CPU $2180: this reads WRAM in 6 clocks instead of 8 clocks. Even if you exploit the system to read from $2180 twice in a row, both reads are still valid. My guess is the CPU internally buffers the WRAM twice. Because if it could be read at 6 clocks, they would have made the system work that way everywhere. God knows the CPU needs all the speed it can get.

CPU differences: rev1 has a fixed position to trigger DRAM refresh. rev2 it alternates between two positions (uses another clock / 8 thing like DMA). rev1 and rev2 have a difference in the HDMA initialization start time computation. Obviously rev2 fixes the DMA<>HDMA conflict that crashes the rev1, but it's hard to emulate the rev1 crash since obviously, tests of it crash the system :P

SMP test register: stay far away from this one. You can control the CPU scalar for instructions, enable/disable RAM reading, RAM writing, MMIO register accesses, and do really weird things to the timers that we don't fully understand yet. This register is a rabbit hole that's broken me, anomie, and blargg.

Probably half of that is required for at least one game. The other half not so much.

Code:
send(A3h), recv(E0h)
send(addr0), recv(anything)
send(addr1), recv(anything)
send(addr2), recv(anything)
send(addr3), recv(anything)
repeat 128 times: send(80h..01h), recv(data)


Ah, I was doing it wrong, then. I was using:
send A3
send addr (little-endian order)
wait a while (for chip to get ready)
receive 128 times

On that note, my f3/f4 dumps were similar:
send F3/F4
wait a while (for chip to get ready)
receive 128K/32K bytes

I'll give this a shot if we need to know more about the memory map, thanks.

Quote:
The 32Kbyte stuff dumped from A0000000 (command F4) should be RAM, like you said.


How do you know it's from a0000000? Just curious.
Given you know the commands too, I assume you found a debug mode that prints this info out onscreen?

Quote:
Both no$gba and VisualBoyAdvance have an ARMv4, and DS emulators have an ARMv5 core.


Both the MESS team and nocash have a huge head start on me, since I haven't written an ARM CPU emulator yet. I probably won't be the first to get this game running; unless they are both uninterested in it. But then this isn't a competition.


PostPosted: Sun Feb 26, 2012 6:57 am

Joined: Fri Feb 24, 2012 12:09 pm
Posts: 1007
Quote:
allow me to share the hardest fought and strangest behaviors I've encountered

Igitt. I don't really like those effects (at least not with emulation in mind). But very interesting, thanks!

REFRESH is quite strange, too (see the "SNES Timing H/V Events" chapter). I haven't fully understood what it's doing, or why.

Long dots. It's actually 4 long dots, 5 clks each (seen via oscilloscope) (but doesn't matter, to software it's always seen as 2 dots, 6 clks) (some more notes on purpose/glitches are in "SNES Timing H/V Counters" chapter).

Didn't know that the long dots occur earlier/later on some models! Is there any relation to the version-number in STAT78 (213Fh)?

Btw. is there any list of chip versions (text seen on chip package) and corresponding 213Eh/213Fh/4210h values? Best something that covers all revisions up to latest costdown version.

Quote:
CPU $2180: this reads WRAM in 6 clocks instead of 8 clocks

Yeah, that's been puzzling me, too. I tried to mix reads & writes... and couldn't find any visible "prefetch-effects". Maybe it is doing some hidden internal prefetching, maybe not.
With my Xboo-upload circuit (see "Pinouts" chapter), I've been mirroring WRAM to ROM regions (including the "fast" area in bank 80h-FFh), and if I didn't get the test wrong, executing code in WRAM seemed to work fine even in fast mode, so Nintendo seems to HAVE made the access slower than necessary (though they might have had a reason... like older/slower chip revisions, or unstable effects at certain temperatures).

Quote:
SMP test register: stay far away from this one.

Too late to stay away. Most of it is quite straightforward (see "SNES APU SPC700 I/O Ports" chapter). But there may be some secrets left: The difference (if any) between the two timer bits isn't clear to me, don't know if the crash bit has any other use than crashing, the unstable wait-state settings are, well, unstable and/or confusing.
And the internal opcode timings might vary in some cases (for example, not tested if the RAM-like timings are based on dummy-reads from [PC], so maybe they change to I/O and ROM-like timings when executing code in ROM, or other such situations).

Quote:
How do you know it's from a0000000? Just curious.

A3, F3 and F4 commands are storing the address in [07C7]=bank, [07C9]=offs. A3 is sending the address to the ST018 chip. F3/F4 are just displaying the address in the headline of the hexdump screen (no idea if/how one is supposed to display that screen via joypad input).


PostPosted: Sun Feb 26, 2012 12:36 pm

Joined: Mon Jul 14, 2008 4:02 pm
Posts: 88
nocash wrote:
For the SFC-Box, seeing a KROM1 dump would be great to solve most missing things, it's a (socketed) EPROM, so one could just unplug & dump it - if one does have the hardware at home.


There you go:
Super Famicom Box KROM1


PostPosted: Sun Feb 26, 2012 9:44 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 21595
Location: NE Indiana, USA (NTSC)
byuu wrote:
In that case, allow me to share the hardest fought and strangest behaviors I've encountered.

Crazy stuff. But I'd bet every hardware platform like this has its own share of ridiculouSNESs.

Quote:
for any opcode that is exactly two cycles, and the second cycle is an I/O cycle (eg nop, clc, inc, asl, xce, etc.), that second I/O cycle transforms into a bus read of the PC address (without incrementing PC, of course.) If you're in a slow memory area, this will make the instruction take two cycles longer to execute.

This behavior of always fetching the second byte of an instruction before the first byte is fully decoded is not so "strange"; it's unchanged from the original 6502.

Quote:
CPU DMA synchronization: this is just psychopathic. The DMA runs on a 1/8 clock rate

And on the NES it runs on 1/24: reading on 0 and writing on 12.

Quote:
HDMA MDR: so Speedy Gonzales [depends on DMA changing the open bus]

Which is only slightly more psychopathic (teen?) than Low G Man depending on open bus or Ms. Pac-Man and a couple other games depending on NMI latency.


PostPosted: Mon Feb 27, 2012 7:42 am

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1524
Quote:
And on the NES it runs on 1/24: reading on 0 and writing on 12.


Speaking of this, nocash ... this is the trickiest part for me.

So if you read $2137, it latches the counters. But if, instead of lda $2137, you use sta $4201 to latch the counters, then the HV counter value is four clock cycles later (eg HDOT is +1 higher.)

After reading the WDC manual's painful timing section, and discussing with TRAC, the best we could come up with was this:

Cycles are 6 (fast), 8 (slow), 12 (serial) against the 315/88*6m clock (21.47MHz.)
Reads are acknowledged at time-4 {2,4,8}.
Writes are acknowledged at time-0 {6,8,12}.
(Internally, of course, the data is there for most of the cycle, it remains for several clocks. Chips can respond to reads faster, but the other chips wait to make sure it's there.)
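
The numbers above can be written down as a small sketch. It only restates the working model from this post (reads taken 4 clocks before the end of a cycle, writes at the very end); the function names are invented, and nothing here settles the propagation-delay question.

```python
from fractions import Fraction

# 315/88 * 6 MHz, kept exact; about 21.477 MHz.
MASTER_HZ = Fraction(315, 88) * 6 * 1_000_000

def read_ack(cycle_len):
    """Clock offset within a cycle at which a read is acknowledged."""
    return cycle_len - 4

def write_ack(cycle_len):
    """Clock offset within a cycle at which a write is acknowledged."""
    return cycle_len
```

With this model a write always lands 4 clocks after a read would have in the same cycle, which lines up with the sta $4201 versus lda $2137 difference.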

What I worry about is that this may not be exactly correct. Eg there may be different propagation delays based on which chips are talking to which. The problem is that I have no way of verifying that.

Without this absolutely vital information, it becomes near impossible to say with certainty that eg the PPU caches BG1HOFS at cycle # (86, for a made up example.) It's highly dependent on us having the right propagation delays.

Do you have any added theories on this subject?

