All times are UTC - 7 hours



PostPosted: Wed Aug 27, 2014 1:18 pm 

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 781
ARM9 wrote:
8bpp can be used for both mode7 and mode3/4, it depends on how you upload it to vram (port $2118 byte transfers interleaved or word to 2118/2119) and how you build your map.

You sure? How would you account for packed-pixel vs. bitplane?

Quote:
Quote:
Well, technically the GSU is wired directly to the cartridge (it could even be wired to access parts not accessible to the 65816), so in the worst case they could just wire the banks to the relevant portions... The only limitation here would be Nintendo's policies =P
The gsu sits between the cartridge rom/ram and the scpu so the address bus on the gsu is the limit here, which can only address 2MiB on all but the first version. Shouldn't be too much of a hassle to increase that on something like the sd2snes.

I'm referring to the memory maps in the manual. The GSU doesn't see anything above bank 71, but the S-CPU can access a bunch of stuff past that point, including 2 MB of LoROM from 80 to BF and 4 MB of HiROM from C0 to FF. This 6 MB ROM region is in addition to what the GSU can access, and according to the diagrams in sections 1.3 and 1.4, the CPU can access this ROM irrespective of what the GSU is doing or the status of the access control switch.
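
For what it's worth, the 6 MB figure can be checked from the bank ranges alone. A minimal sketch in Python, assuming each LoROM bank exposes ROM only at offsets 8000-FFFF (32 KiB) and each HiROM bank exposes a full 64 KiB; the constant names are made up for illustration:

```python
KIB = 1024
MIB = 1024 * KIB

# Assumption: LoROM banks map ROM only at offsets 8000-FFFF (32 KiB per bank).
LOROM_BANKS = 0xBF - 0x80 + 1          # banks 80..BF inclusive
lorom_bytes = LOROM_BANKS * 32 * KIB

# Assumption: HiROM banks map ROM at offsets 0000-FFFF (64 KiB per bank).
HIROM_BANKS = 0xFF - 0xC0 + 1          # banks C0..FF inclusive
hirom_bytes = HIROM_BANKS * 64 * KIB

total_bytes = lorom_bytes + hirom_bytes
print(lorom_bytes // MIB, hirom_bytes // MIB, total_bytes // MIB)  # prints: 2 4 6
```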


PostPosted: Wed Aug 27, 2014 2:05 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Re: 8bpp available in mode 3, mode 4, and mode 7: this is correct.
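
To make the packed-pixel vs. bitplane distinction concrete, here is a rough Python sketch of how the same 8x8 tile of 8-bit color indices would be laid out in each case. It assumes the usual SNES 8bpp planar layout (bitplanes stored in pairs, 16 bytes per pair, leftmost pixel in bit 7) and the mode 7 layout of one byte per pixel in row-major order. The function names are hypothetical.

```python
def tile_to_planar_8bpp(pixels):
    """Convert 64 color indices (row-major 8x8) to 64 bytes of planar data.

    Assumed layout: planes 0/1 interleaved per row in bytes 0-15,
    planes 2/3 in bytes 16-31, planes 4/5 in 32-47, planes 6/7 in 48-63.
    """
    if len(pixels) != 64:
        raise ValueError("expected 64 pixels")
    out = bytearray(64)
    for y in range(8):
        row = pixels[y * 8:(y + 1) * 8]
        for plane in range(8):
            byte = 0
            for x in range(8):
                bit = (row[x] >> plane) & 1
                byte |= bit << (7 - x)      # leftmost pixel lands in bit 7
            pair = plane // 2               # which 16-byte plane pair
            out[pair * 16 + y * 2 + (plane & 1)] = byte
    return bytes(out)


def tile_to_mode7(pixels):
    """Mode 7 style: one byte per pixel, row-major, no bit shuffling."""
    if len(pixels) != 64:
        raise ValueError("expected 64 pixels")
    return bytes(pixels)


tile = [0] * 64
tile[0] = 1                                  # top-left pixel uses color 1
print(tile_to_planar_8bpp(tile)[0], tile_to_mode7(tile)[0])  # prints: 128 1
```

As far as I know, mode 7 character data lives in the high byte of each VRAM word with the tilemap in the low byte, while mode 3/4 tiles occupy whole words. That would be why the upload method matters as much as the pixel data itself.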


PostPosted: Thu Aug 28, 2014 4:46 pm

Joined: Mon Mar 27, 2006 5:23 pm
Posts: 1338
I don't care what the manual says, the layout is:

[SNES cartridge connector] <-> [GSU] <-> [ROM]

Only one of them can actually read back valid ROM contents at a time, because you can't have two chips reading the same chip at different locations at the exact same time. It's not physically possible.

The SA-1 is the only coprocessor that appears to do it, but in fact it uses another logic block that controls memory accesses. It will actually stall the SA-1 CPU when the host SNES CPU is accessing the same chip at the same time, which, as you can imagine, results in the code taking longer to execute. The GSU does not have this logic, and neither does the Cx4.

Now ... if you wire up your own cart, you can easily have:

[SNES cartridge connector] <-> [GSU] <-> [ROM1]
[SNES cartridge connector] <-> [ROM2]

Where obviously the GSU won't be able to access ROM2, but the SNES CPU can continue to use ROM2 while the GSU is using ROM1.

Now, what is the max ROM the GSU can address? I'd probably go with what the docs say, depending on each revision. But that's strictly a matter of how many ROM address pins there are on the GSU chip itself.

Cheating this with the sd2snes won't do you much good, since unfortunately that chip's not emulated there (yet? may prove too demanding for the FPGA used.)


Last edited by byuu on Thu Sep 11, 2014 8:57 pm, edited 1 time in total.

PostPosted: Thu Aug 28, 2014 7:14 pm

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19083
Location: NE Indiana, USA (NTSC)
byuu wrote:
I don't care what the manual says, the layout is:

[SNES cartridge connector] <-> [GSU] <-> [ROM]
[...]
Now ... if you wire up your own cart, you can easily have:

[SNES cartridge connector] <-> [GSU] <-> [ROM1]
[SNES cartridge connector] <-> [ROM2]

I have avoided seeing the manual. But based on what's been said so far in this topic, it appears the manual mentions the latter configuration, which never ended up used in a commercial game due to cost.


PostPosted: Thu Aug 28, 2014 10:26 pm

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 781
koitsu wrote:
Re: 8bpp available in mode 3, mode 4, and mode 7: this is correct.

For the SNES, or for the Super FX?


PostPosted: Thu Aug 28, 2014 10:42 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
93143 wrote:
koitsu wrote:
Re: 8bpp available in mode 3, mode 4, and mode 7: this is correct.

For the SNES, or for the Super FX?

SNES. I know absolutely *jack squat* about the Super FX or any extension chips.


PostPosted: Thu Aug 28, 2014 11:43 pm

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 781
Okay, found a reference that isn't the dev manual (not as explicit about the circuit layout, unfortunately):

viewtopic.php?f=12&t=5964&hilit=Additional&start=45#p103957

nocash wrote:
GSU Memory Map (at SNES Side)
This is more or less as already known. The 8K at xx:6000h-xx:7FFFh is always mirroring to 700000h-701FFFh (no matter if the "xx" bank is 00h..3Fh or 80h..BFh).
Code:
  00-3F:3000-34FF  GSU I/O Ports
  00-3F:6000-7FFF  Mirror of 70:0000-1FFF (ie. FIRST 8K of Game Pak RAM)
  00-3F:8000-FFFF  Game Pak ROM in LoRom mapping (2Mbyte max)
  40-5F:0000-FFFF  Game Pak ROM in HiRom mapping (mirror of above 2Mbyte)
  70-71:0000-FFFF  Game Pak RAM       (128Kbyte max, usually 32K or 64K)
  78-79:0000-FFFF  Additional "Backup" RAM  (128Kbyte max, usually none)
  80-BF:3000-32FF  Mirror of GSU I/O Ports
  80-BF:6000-7FFF  Mirror of 70:0000-1FFF (ie. FIRST 8K of Game Pak RAM)
  80-BF:8000-FFFF  Additional "CPU" ROM LoROM (2Mbyte max, usually none)
  C0-FF:0000-FFFF  Additional "CPU" ROM HiROM (4Mbyte max, usually none)
  Other Addresses  Open Bus

The above "Additional" areas aren't installed on existing boards (=are seen as open bus).
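
A quick way to sanity-check an address against the table above is a small lookup. This is only a sketch in Python: the ranges are transcribed as quoted, while the region labels and the function names `classify` and `ram_mirror_target` are invented for illustration.

```python
# (bank_lo, bank_hi, offset_lo, offset_hi, label), transcribed from the table above.
REGIONS = [
    (0x00, 0x3F, 0x3000, 0x34FF, "GSU I/O ports"),
    (0x00, 0x3F, 0x6000, 0x7FFF, "Game Pak RAM mirror (first 8K)"),
    (0x00, 0x3F, 0x8000, 0xFFFF, "Game Pak ROM (LoROM)"),
    (0x40, 0x5F, 0x0000, 0xFFFF, "Game Pak ROM (HiROM mirror)"),
    (0x70, 0x71, 0x0000, 0xFFFF, "Game Pak RAM"),
    (0x78, 0x79, 0x0000, 0xFFFF, "Backup RAM"),
    (0x80, 0xBF, 0x3000, 0x32FF, "GSU I/O ports (mirror)"),
    (0x80, 0xBF, 0x6000, 0x7FFF, "Game Pak RAM mirror (first 8K)"),
    (0x80, 0xBF, 0x8000, 0xFFFF, "Additional CPU ROM (LoROM)"),
    (0xC0, 0xFF, 0x0000, 0xFFFF, "Additional CPU ROM (HiROM)"),
]


def classify(addr):
    """Return the table's region label for a 24-bit SNES-side address."""
    bank = (addr >> 16) & 0xFF
    offset = addr & 0xFFFF
    for bank_lo, bank_hi, off_lo, off_hi, label in REGIONS:
        if bank_lo <= bank <= bank_hi and off_lo <= offset <= off_hi:
            return label
    return "open bus"  # the table's "Other Addresses" row


def ram_mirror_target(addr):
    """Map xx:6000-7FFF (xx in 00-3F or 80-BF) to its 70:0000-1FFF target."""
    if classify(addr) != "Game Pak RAM mirror (first 8K)":
        raise ValueError("address is not in the 8K RAM mirror window")
    return 0x700000 + ((addr & 0xFFFF) - 0x6000)


print(classify(0x008000))                 # Game Pak ROM (LoROM)
print(hex(ram_mirror_target(0x807FFF)))   # 0x701fff
```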


tepples wrote:
byuu wrote:
I don't care what the manual says, the layout is:

[SNES cartridge connector] <-> [GSU] <-> [ROM]
[...]
Now ... if you wire up your own cart, you can easily have:

[SNES cartridge connector] <-> [GSU] <-> [ROM1]
[SNES cartridge connector] <-> [ROM2]

I have avoided seeing the manual. But based on what's been said so far in this topic, it appears the manual mentions the latter configuration, which never ended up used in a commercial game due to cost.

As far as I can tell, that's exactly right. The SNES is supposed to be wired straight into the "additional" ROM and RAM, parallel to the GSU and the memory behind it. But no games actually did this...


PostPosted: Mon Sep 01, 2014 1:44 pm

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 781
If it's not too late, I'd just like to reiterate that if anyone on here knows the answers to my questions, I do not want them put to an original Star Fox programmer who is reportedly very busy and might, given the time elapsed, have to look up detailed chip information like anyone else. For instance, in light of nocash's old post and byuu's newer one, I consider my question #4 answered.

Also, it turns out nocash has enough data in his fullsnes document that I don't need to reference the manual for my questions.

Revised list:

1) What are the absolute hardware bottlenecks on blitting (using PLOT with color #0 not written, or only PLOTting part of a pixel cache, so it has to read the old data from RAM before writing the new data back)?
1b) How many cycles does it take to empty the secondary pixel cache under those circumstances?
1c) How about transferring the primary cache to the secondary, once the secondary is free?

2) Apparently ROM access in high speed mode (21 MHz) is 5 cycles instead of 3. Is the same true of RAM access? For both reading and writing? And does this impact the answer(s) for (1)? Did this change at all between chip/board revisions?

3) Is the instruction cache on the latest version(s) of the GSU 256 bytes or 512 bytes? I'd like to be sure.


PostPosted: Wed Sep 03, 2014 3:42 am

Joined: Sun Aug 11, 2013 6:07 am
Posts: 57
93143 wrote:
2) Apparently ROM access in high speed mode (21 MHz) is 5 cycles instead of 3. Is the same true of RAM access? For both reading and writing? And does this impact the answer(s) for (1)? Did this change at all between chip/board revisions?

Since the RAM access is documented to be similar to ROM in most cases (other than where executing in RAM would impact RAM access) I'd think fullsnes is correct on this point.
Storing to ram (sm,st,sbk) uses a buffer so the cpu can continue executing opcodes without having to wait (except when running code in ram). If you execute other code while ram is being written you can perform 1-2 cycle writes (when running in cache). This is all documented in the PDF that I'll assume you have; you should read through the GSU chapter.
It's a bit inconsistent and just plain wrong at times, but it's the best we have at this point until somebody finds Argonaut documents.
>According to cache description (page 132), Cache-Code is 6 times faster than ROM/RAM. However, according to opcode descriptions (page 160 and up), cache is only 3 times faster than ROM/RAM. Whereas, maybe 6 times refers to 21MHz mode, and 3 times to 10MHz mode?
93143 wrote:
3) Is the instruction cache on the latest version(s) of the GSU 256 bytes or 512 bytes? I'd like to be sure.

512 bytes, all revisions, it's in the manual, fullsnes and bsnes. And you can test it yourself with $3100-$32FF. Where'd you read that it's 256 bytes?
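
The 512-byte figure is consistent with the size of that address window. A trivial check in Python (the constant names are just for illustration):

```python
CACHE_WINDOW_START = 0x3100
CACHE_WINDOW_END = 0x32FF   # inclusive

cache_bytes = CACHE_WINDOW_END - CACHE_WINDOW_START + 1
print(cache_bytes)  # prints: 512
```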

As for question 1, I'm curious about this as well, it's not documented in the manual other than plot having a worst case of 48 cycles. Generally you want to put as much general processing as possible between plot and load/store instructions. Considering the worst case it might be wise to try and put more code after a plot until you access ram.
If you want exact timings, consider profiling on hardware.


PostPosted: Thu Sep 04, 2014 1:39 am

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 781
ARM9 wrote:
93143 wrote:
2) Apparently ROM access in high speed mode (21 MHz) is 5 cycles instead of 3. Is the same true of RAM access? For both reading and writing? And does this impact the answer(s) for (1)? Did this change at all between chip/board revisions?

Since the RAM access is documented to be similar to ROM in most cases (other than where executing in RAM would impact RAM access) I'd think fullsnes is correct on this point.

I'd think so too, but it doesn't seem to be too definite on the subject, what with all the question marks and the caveat about poor documentation...

Quote:
Storing to ram (sm,st,sbk) uses a buffer so the cpu can continue executing opcodes without having to wait (except when running code in ram). If you execute other code while ram is being written you can perform 1-2 cycle writes (when running in cache).

Yeah, but that doesn't change the fundamental fact that the throughput to RAM is one byte every X cycles, which would bottleneck a sufficiently lean continuous write loop.

According to my calculations, the application I have in mind (a port of a bullet hell shooter) is pretty much right on the edge of the chip's capabilities. The difference between 24 and 40 cycles for a 4bpp cache flush with unset bit-pend flags could be the difference between being able to exactly duplicate the original bullet patterns and having to simplify them.
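
The 24 and 40 come out of a very simple model, sketched below in Python. It assumes a pixel cache row is 8 pixels, that a flush with unset bit-pend flags reads each byte back from RAM before writing it, and that every byte access costs the same 3 or 5 cycles. None of this is measured on hardware, and `flush_cycles` is a made-up name.

```python
def flush_cycles(bpp, access_cycles, read_back=True):
    """Estimate cycles to flush one 8-pixel cache row to RAM.

    Assumed model, not verified: one RAM access per byte written, plus one
    more per byte when the old data has to be read back first.
    """
    bytes_per_row = bpp                  # 8 pixels * bpp bits / 8 bits per byte
    accesses_per_byte = 2 if read_back else 1
    return bytes_per_row * accesses_per_byte * access_cycles


print(flush_cycles(4, 3))   # prints: 24
print(flush_cycles(4, 5))   # prints: 40
print(flush_cycles(8, 3))   # prints: 48
```

The last line happens to match the 48-cycle worst case for plot mentioned earlier in the thread, if that figure refers to 8bpp with 3-cycle accesses, but that is an inference rather than something the manual is known to state.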

I do not want to have to simplify the patterns, because that probably means rebalancing the game, which I don't trust myself to do.

I suppose I could leave the chip in low-speed mode and overclock it, but that's cheating (good luck getting Nintendo to agree to let you do that for a commercial release), and might result in errors with the memory used in the original games...

Quote:
93143 wrote:
3) Is the instruction cache on the latest version(s) of the GSU 256 bytes or 512 bytes? I'd like to be sure.

512 bytes, all revisions, it's in the manual, fullsnes and bsnes. And you can test it yourself with $3100-$32FF. Where'd you read that it's 256 bytes?

Well, byuu has used the number a few times. I figured there had to be a reason...

Quote:
If you want exact timings, consider profiling on hardware.

I guess that would be ideal, but I don't really have the resources (or skills) to do that right now. Ultimately I may well end up running on a real GSU, but I'd rather not have to choose between doing that up front (stalling the whole project until I can get the time and resources together) and potentially getting a nasty surprise after writing a ton of code...

I suppose I could just assume higan is close enough and test it there, but byuu has complained about Super FX timing in the past and I don't know if the current GSU code is as accurate as the core system emulation...


PostPosted: Sat Dec 03, 2016 4:43 pm

Joined: Fri Jul 04, 2014 9:31 pm
Posts: 781
Been a while since anyone posted in this thread. I've actually been worried that I hijacked the thread and prevented the original questions from being answered... Is the opportunity still open? Did I miss the resolution?

I was going to take this opportunity to ask the forum experts a question about FROM/TO/WITH, but it turns out the answer was RTFM, so... bump, I guess.

I here reproduce the list of questions, with one of mine deleted because it's been answered. (My remaining questions are basically just a more detailed version of psycopathicteen's question, possibly too detailed for the amount of time that's passed (byuu et al. may be a better source), and honestly we already know what the answers probably are, so I don't consider them high priority...)

psycopathicteen wrote:
1) How does the SuperFX compare against the DMA at filling pixels?

Sik wrote:
1) What algorithms are used to process the vertices? Both transformation and projection.

2) What algorithm is used to raster (render) the triangles?

3) [split, trim] Related, is there any special calculation [in Starfox/the SuperFX?] to discard backfacing triangles?

4) [trim] What were the biggest bottlenecks when programming [with] the SuperFX?


whicker wrote:
1) What was the development process like?

2) [paraphrase] Did you debug on a PC or on the SNES? If on the SNES, how?

3) [trim] Did/Does the SuperFX CPU itself have any sort of debugging features?

4) [trim] I realize you were working on the software, but do you recall any discussions about why the SuperFX boards had to start using a dedicated clock resonator circuit instead of the 21 MHz signal from the cartridge edge?

ARM9 wrote:
1) [paraphrase, trim] How did they handle interoperability between the scpu and gsu; how exactly did they split the tasks between the two processors and which one did what?

ccovell wrote:
1) What I'd love is a timeline about the whole Argonaut project.

2) I'd love any info about projects, both successful and cancelled. :-)

93143 wrote:
1) What are the absolute hardware bottlenecks on blitting (using PLOT with color #0 not written, or only PLOTting part of a pixel cache, so it has to read the old data from RAM before writing the new data back)?
1b) How many cycles does it take to empty the secondary pixel cache under those circumstances?
1c) How about transferring the primary cache to the secondary, once the secondary is free?

2) Apparently ROM access in high speed mode (21 MHz) is 5 cycles instead of 3. Is the same true of RAM access? For both reading and writing? And does this impact the answer(s) for (1)? Did this change at all between chip/board revisions?

