All times are UTC - 7 hours





Posted: Tue Apr 10, 2018 3:57 pm

Joined: Wed Mar 09, 2005 9:08 am
Posts: 410
lidnariq wrote:
Bananmos wrote:
* Which of the possible configurations would be your favorite? Conventional H/V mirroring trading away CHR tiles? One-screen mirroring leaving CHR alone? Or variants on the 4-screen configuration?
My hunch is that once one has 8x8 attributes, 1-screen layout becomes awfully reasonable. You get the benefits of run-time switchable nametables (like GTROM or AxROM) without really having to worry about attribute clash regardless of scrolling axis.


Yeah, I'm inclined to agree. The slightly annoying thing with 1-screen mirroring, though, is that it makes certain things like big background bosses a bit more of a hassle to deal with, when you don't have any spare nametable to scroll to horizontally with no CPU cost. And also that updating at most one 8x8 tile at a time is a little bit less convenient than being able to directly write 16x16 or even 32x32 metatiles.

But those are probably minor trade-offs in the grand scheme of things. So I'm leaning towards 1-screen too.


Posted: Tue Apr 10, 2018 4:12 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6347
Location: Canada
edit: this post was predicated on an incorrect understanding, probably not very meaningful as a result.

Bananmos wrote:
across all 8kB CHR-RAM pages.

I would say that CHR banking is probably a waste in this case, but CHR-RAM by itself is already very versatile anyway, so maybe the lack of banking isn't a big deal. (...but we might as well be mapper 2 here.)

Was considering how you'd need to get down to 512 byte banking before traditional contiguous CHR banks become useful again, but maybe if you swapped bit 9 you could do a bank size as coarse as 4k but interleaved at 512 bytes? In most cases of bankswitching you have some "fixed" banks and some busy ones, so only being able to bankswitch 50% of the CHR region is probably just as good as all of it, since it being CHR-RAM still lets you do a "slow" switch of the fixed quarter.

Bananmos wrote:
Do you even feel 8x8 attributes are worth the extra hassle, with the inconvenient memory layout and all?

I think regular attributes already have a super inconvenient memory layout, so this is no big change. The only real inconvenience IMO is losing 1/8 of available CHR.

Bananmos wrote:
Which of the possible configurations would be your favorite? Conventional H/V mirroring trading away CHR tiles? One-screen mirroring leaving CHR alone? Or variants on the 4-screen configuration?

Hmm, I had been thinking 4-screen as a given but with 8x8 attributes maybe that doesn't have so much of an advantage anymore. Especially since you could remove the imposition on half your CHR (at which point contiguous 4k banking for the untouched half is pretty viable again...).

So I guess I'd say 1-screen but only if you can bank CHR to make use of a 32k RAM. If we can't fit banking in, then 4-screen seems more sensible by default? (Might as well get some use out of the extra RAM, is what I mean.)

..but really all the mirroring variations are probably useful to somebody. Depends on your game's goals.


Last edited by rainwarrior on Tue Apr 10, 2018 5:17 pm, edited 2 times in total.

Posted: Tue Apr 10, 2018 4:53 pm

Joined: Wed Mar 09, 2005 9:08 am
Posts: 410
Quote:
I would say that CHR banking is probably a waste in this case, but CHR-RAM by itself is already very versatile anyway

Quote:
Was considering how you'd need to get down to 512 byte banking before traditional contiguous CHR banks become useful again

Quote:
So I guess I'd say 1-screen but only if you can bank CHR to make use of a 32k RAM.


Sorry, not sure I understand these points about banking and 512 banks. But I probably didn't summarize my thoughts very clearly either :)

So here's a summary of how my imagined configurations would work:

H/V mirroring with 32kB CHR: The fine X/Y attribute bits directly drive the 2-bit CHR bank page, both because that makes the logic simple, but also because you amortise the wasted CHR over all four 8kB pages.

And it is not 1/8th of your available CHR... rather, each of the four nametables would use 4 tiles of CHR from each of the switchable 8kB banks. This means 4*4 = 16 tiles "stolen" from each 8kB CHR, which you should be able to cut in half if you add gates to take advantage of mirroring mode.
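As a rough check of that arithmetic, here is a small Python sketch. It assumes each 8kB bank holds one conventional 64-byte attribute table per nametable (one per fine X/Y combination) and that a 2bpp CHR tile occupies 16 bytes; the constant and function names are made up for illustration.

```python
TILE_BYTES = 16          # one 8x8 tile at 2 bits per pixel
ATTR_TABLE_BYTES = 64    # one conventional attribute table (assumed: one per bank, per nametable)

def tiles_stolen_per_bank(nametables=4):
    """CHR tiles lost in each 8kB bank to attribute storage."""
    tiles_per_nametable = ATTR_TABLE_BYTES // TILE_BYTES   # 64 / 16 = 4 tiles
    return tiles_per_nametable * nametables

four_nametables = tiles_stolen_per_bank(4)   # all four logical nametables stored separately
two_nametables = tiles_stolen_per_bank(2)    # only two distinct nametables under H/V mirroring
print(four_nametables, two_nametables)
```

The second call corresponds to the "cut in half" case, where extra gates let mirrored nametables share their attribute storage.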

The developer would also need to switch the 8kB bank when updating the attributes.

1-screen mirroring with 32kB CHR: In this case, we actually fit the extended attribute table to the second screen, whilst leaving some nametable space there as well. Because we have enough memory in CIRAM we wouldn't have to map the extended attribute table to CHR.

The extra gates needed wouldn't be free, of course. So it might drive costs up a bit compared to the H/V mirroring. But I'm thinking the neater solution might make it worth it. I think I'll try to sketch out how much

Another benefit is that this 1-screen solution could work the same with CHR-ROM too, whereas the two other solutions are CHR-RAM only.

4-screen: With infiniteneslives mapper30 variant, we already sacrifice one of four 8kB CHR banks for use as 4kB 4-screen memory. Again, I think it makes sense to put the extra gates in to map all the attributes to 1kB of this unused 4kB area, to avoid the attributes needing to steal any tiles from the remaining CHR pages.

Of course, we don't need to be limited to mapper30. I'm only using it as an example of a low-cost implementation for a popular homebrew mapper. Plus starting to consider the entire mapper library easily goes off into too many tangents again. Mapper30 also has a few variants already, which all make some interesting trade-offs one could build on.


Posted: Tue Apr 10, 2018 5:13 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6347
Location: Canada
Bananmos wrote:
Sorry, not sure I understand these points about banking and 512 banks. But I probably didn't summarize my thoughts very clearly either :)

Sorry, my mental concept of it was muddled. You can ignore that idea then, and I'll sort out my understanding. The additional description is helpful.


Posted: Tue Apr 10, 2018 8:16 pm

Joined: Wed Sep 07, 2005 9:55 am
Posts: 319
Location: Phoenix, AZ
Bananmos wrote:
* Which of the possible configurations would be your favorite? Conventional H/V mirroring trading away CHR tiles? One-screen mirroring leaving CHR alone? Or variants on the 4-screen configuration?


I'm a big fan of one-screen mirroring. I like to use nat-a for the playing field and nat-b for the status bar, textboxes, and other effects.


Quote:
And also that updating an 8x8 tile at most is a little bit less convenient than being able to directly write 16x16 or even 32x32 metatiles.


I came across the same problem and ended up adding a data port that has two modes of operation. The first just writes the 2 bits to the set destination in the attribute table. The second takes one 8-bit write and splits it into four 2-bit writes covering a 2x2 tile region. Something like this would probably push you over your budget though.
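A minimal software model of those two port modes might look like the following. The post does not say how the 8-bit value is split, so the bit order here (bits 0-1 top-left, 2-3 top-right, 4-5 bottom-left, 6-7 bottom-right, as in a standard NES attribute byte) is an assumption, and the dictionary keyed by tile coordinate is just a stand-in for the real attribute memory.

```python
def write_single(attr, x, y, value):
    """Mode 1: store one 2-bit palette index for the 8x8 tile at (x, y)."""
    attr[(x, y)] = value & 0b11

def write_quad(attr, x, y, value):
    """Mode 2: split one byte into four 2-bit entries for a 2x2 tile region.

    Assumed bit order: top-left, top-right, bottom-left, bottom-right,
    from the least significant pair upward.
    """
    for i in range(4):
        dx, dy = i & 1, i >> 1
        attr[(x + dx, y + dy)] = (value >> (2 * i)) & 0b11

attr = {}                                  # hypothetical 8x8 attribute store
write_quad(attr, 4, 6, 0b11_10_01_00)      # one CPU write covers a 16x16 metatile
write_single(attr, 0, 0, 2)
```

With the second mode a 16x16 metatile costs one attribute write again, which is the convenience that plain 8x8 attributes give up.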


Posted: Tue Apr 10, 2018 8:29 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6347
Location: Canada
Bananmos wrote:
1-screen mirroring with 32kB CHR: In this case, we actually fit the extended attribute table to the second screen, whilst leaving some nametable space there as well. Because we have enough memory in CIRAM we wouldn't have to map the extended attribute table to CHR.

Another benefit is that this 1-screen solution could work the same with CHR-ROM too, whereas the two other solutions are CHR-RAM only.

So by 1-screen you mean only 1 screen, not "1 of 2 screens at a time", which is the usual terminology?


Posted: Tue Apr 10, 2018 9:14 pm

Joined: Sun Apr 13, 2008 11:12 am
Posts: 7222
Location: Seattle
I think it might be the variant where there are two nametables, switchable at runtime, but they share a single attribute table. (One attribute table specifies top 16x8 or left 8x16; the other the other half)

I don't think it's really an ok compromise, sadly. Those 8x16 or 16x8 attribute zones might work better with H/V layout of the nametables instead.


Posted: Wed Apr 11, 2018 7:54 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6347
Location: Canada
Is this a correct summary of the logic involved? (for 4-screen nametables at CHR $6000-7FFF, otherwise unbankable CHR tiles at $0000-1FFF)
Code:
NMT = PPU A13 ($2000)
ATT = PPU A6-9 all set (address and $3C0 = $3C0)
NY = PPU A5 ($20)
NX = PPU A0 ($1)

Latch input 0 = NX
Latch input 1 = NY
Latch write = NMT and not ATT
LX = Latch output 0
LY = Latch output 1

CHR A14 ($4000) = NMT and (not ATT or LY)
CHR A13 ($2000) = NMT and (not ATT or LX)
CHR A0-12 = PPU A0-12

Though... I suppose this also makes the latch hard to control for setting up the attribute writes? Would I need to make two extra writes to $2006 to set the latch? (Or have the CPU interface also write to the latch?)

Edit: Also the latch write signal might need to be inverted, depending on how the specific latch is controlled. lidnariq suggested it might also need to be gated by PPU RD so that it only latches when the address bus is known to be stable.

CHR A13/14 could also be inverted if it's convenient. (Doesn't matter where things end up as long as they're distinct.)
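To make the equations above easier to poke at, here is a behavioural sketch in Python. It models one PPU bus access at a time and ignores all timing (no PPU /RD gating, no propagation delay), so it only checks the address mapping. The function name and the `(LX, LY)` tuple are illustrative and not part of any real mapper.

```python
def chr_access(ppu_addr, latch):
    """Model one PPU bus access.

    latch is the (LX, LY) pair held from the last nametable tile fetch.
    Returns (chr_addr, new_latch).
    """
    nmt = (ppu_addr >> 13) & 1                    # NMT = PPU A13
    att = int((ppu_addr & 0x03C0) == 0x03C0)      # ATT = PPU A6-A9 all set
    lx, ly = latch
    if nmt and not att:
        # Latch write = NMT and not ATT: capture NX (A0) and NY (A5).
        lx, ly = ppu_addr & 1, (ppu_addr >> 5) & 1
    a14 = nmt & ((att ^ 1) | ly)                  # CHR A14 = NMT and (not ATT or LY)
    a13 = nmt & ((att ^ 1) | lx)                  # CHR A13 = NMT and (not ATT or LX)
    chr_addr = (a14 << 14) | (a13 << 13) | (ppu_addr & 0x1FFF)
    return chr_addr, (lx, ly)

# Tile fetch at x=0, y=1 of nametable 0, then the attribute fetch that follows it.
tile_addr, latch = chr_access(0x2020, (0, 0))
attr_addr, latch = chr_access(0x23C0, latch)
pattern_addr, latch = chr_access(0x0010, latch)
print(hex(tile_addr), hex(attr_addr), hex(pattern_addr))
```

In this model the tile fetch lands in the last 8kB page, the attribute fetch lands in the page selected by the latched `<LY,LX>`, and pattern fetches stay in page 0.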


Last edited by rainwarrior on Wed Apr 11, 2018 11:10 pm, edited 2 times in total.

Posted: Wed Apr 11, 2018 8:15 pm

Joined: Sun Apr 13, 2008 11:12 am
Posts: 7222
Location: Seattle
rainwarrior wrote:
LW = Latch write = NMT and not ATT
Should probably include PPU /RD also to make sure it's not accidentally triggered

Quote:
CHR A14 ($4000) = NMT and (not ATT or LY)
CHR A13 ($2000) = NMT and (not ATT or LX)
... equivalently, (NMT and not ATT) or (NMT and LY)? So it's always the last bank ($6000-$7FFF) on a nametable fetch, is bank #<LY,LX> if it's an attribute table fetch, and 0 otherwise?

Yes, I think that does what you think it should.

I'd be tempted to add "or (not NMT and MapperQ0)" to make banking more useful. (Equivalently: MUX4(<NMT,ATT>, <Q1,Q0>, <Q1,Q0>, <1,1>, <LY,LX>) )
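The equivalence of the three forms can be checked by brute force over the inputs. The sketch below is only a truth-table check in Python; `q1` and `q0` stand for the mapper register bits (MapperQ1/Q0), and the function names are invented for the example.

```python
from itertools import product

def bank_original(nmt, att, lx, ly):
    """<A14, A13> as first written: NMT and (not ATT or L)."""
    return nmt & ((att ^ 1) | ly), nmt & ((att ^ 1) | lx)

def bank_expanded(nmt, att, lx, ly):
    """<A14, A13> in expanded form: (NMT and not ATT) or (NMT and L)."""
    return (nmt & (att ^ 1)) | (nmt & ly), (nmt & (att ^ 1)) | (nmt & lx)

def bank_mux4(nmt, att, lx, ly, q1, q0):
    """MUX4 selected by <NMT, ATT>, with mapper bits used outside nametable space."""
    inputs = [(q1, q0), (q1, q0), (1, 1), (ly, lx)]
    return inputs[(nmt << 1) | att]

# With the mapper bits held at 0, all three forms should agree on every input.
all_match = all(
    bank_original(*bits) == bank_expanded(*bits) == bank_mux4(*bits, 0, 0)
    for bits in product((0, 1), repeat=4)
)
print(all_match)
```

With nonzero mapper bits the MUX4 form differs only when NMT is 0, which is exactly the extra banking for pattern fetches.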

Quote:
Though... I suppose this also makes the latch hard to control for setting up the attribute writes? Would I need to make a dummy read from a nametable to set the latch?
Yeah, vaguely. You would. Not so different from mapper 96.


Posted: Wed Apr 11, 2018 8:51 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6347
Location: Canada
lidnariq wrote:
Should probably include PPU /RD also to make sure it's not accidentally triggered

I'm wondering if that's important or not. Writing to $0000-1FFF it wouldn't matter. Writing to $2000-2FFF it will go to the nametable if not in the attribute region.

The only area subject to accidents is the attribute region, isn't it? The PPU will always fetch nametable before attribute, so the only thing left is when you're writing it, I think? Doesn't seem like a problem to me, but I might be missing something?

So, trying to consider the "worst" case, I guess it's updating a 2-screen-tall column of tiles. I think the big drawback here is just that updating attributes vertically was already a bandwidth hog because of skipping around, and we're doubling that. Counting writes to $2007 + $2006 (+ attribute latch select $2006):
Code:
32 + 2 nametable 0
32 + 2 nametable 1
8 + 16 + 2 attribute 0 y0
8 + 16 + 2 attribute 0 y1
8 + 16 + 2 attribute 1 y0
8 + 16 + 2 attribute 1 y1
= 172 bytes

A lot for a single frame, but maybe doable. Very feasible if spread over 2 frames?
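The tally can be reproduced with a few lines of Python. The split of each attribute row into 8 data writes, 16 address writes (8 address changes of 2 bytes each) and 2 latch-select writes is my reading of the "$2007 + $2006 (+ latch select)" legend, so treat the labels as an interpretation.

```python
# (label, $2007 data writes, $2006 address writes, $2006 latch-select writes)
updates = [
    ("nametable 0",     32,  2, 0),
    ("nametable 1",     32,  2, 0),
    ("attribute 0 y0",   8, 16, 2),
    ("attribute 0 y1",   8, 16, 2),
    ("attribute 1 y0",   8, 16, 2),
    ("attribute 1 y1",   8, 16, 2),
]

total_writes = sum(data + addr + select for _, data, addr, select in updates)
data_writes = sum(data for _, data, _, _ in updates)
print(total_writes, data_writes)
```

Only 96 of the writes carry tile or attribute data; the rest is address setup, which is why the vertical case is the expensive one.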

lidnariq wrote:
Quote:
CHR A14 ($4000) = NMT and (not ATT or LY)
CHR A13 ($2000) = NMT and (not ATT or LX)
... equivalently, (NMT and not ATT) or (NMT and LY)?

Yes, equivalent to that. Wasn't sure how to best express these lines.


Posted: Wed Apr 11, 2018 9:29 pm

Joined: Sun Apr 13, 2008 11:12 am
Posts: 7222
Location: Seattle
rainwarrior wrote:
I'm wondering if that's important or not. Writing to $0000-1FFF it wouldn't matter. Writing to $2000-2FFF it will go to the nametable if not in the attribute region.
It's about whether the values on the bus are valid in a continuous sense, or whether we have to wait for the PPU/RD signal (for +ALE to be false) before its contents are trustworthy.

Otherwise there could be crosstalk between the address one cares about and the one after.

It might work without caring about PPU/RD, I don't know. The Oeka Kids mapper assumes that PPU A9 and A8 are not more than one gate delay later than PPU A12 and A13, which is fair, because that's using a register and only cares about when the address bus becomes correct.

On the other hand, PPU A0-A7 have the extra delay of going through the 74'373 latch. They'll be some small number of tens of nanoseconds delayed after the upper 6 bits of the address bus.

Quote:
lidnariq wrote:
... equivalently, (NMT and not ATT) or (NMT and LY)?
Yes, equivalent to that. Wasn't sure how to best express these lines.
I think any of these are equally valid. I just personally needed to expand the equation to reason through how it worked.


Posted: Wed Apr 11, 2018 9:38 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6347
Location: Canada
Ah, that makes sense.

I guess PPU RD and WR are the only "stable address" timing signifiers we have? My estimate above would need another 4 bytes of $2007 reads to trigger the latch, but that's probably fine.


Posted: Thu Apr 12, 2018 2:36 pm

Joined: Sun Apr 13, 2008 11:12 am
Posts: 7222
Location: Seattle
rainwarrior wrote:
I guess PPU RD and WR are the only "stable address" timing signifiers we have?
Could also "just" delay a certain amount after PPU A13 rises. Not clear how long it needs to be; probably 50-100ns is fine? +ALE is high for one half pixel, or 94ns, so it should have propagated through the latch within about 30ns and is definitely not later than 120ns. Using something like the Oeka Kids latch would be nice, except that we still have to generate ATT.
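For reference, the half-pixel figure follows from the NTSC clocks. This Python sketch assumes the usual NTSC master clock of about 21.477 MHz and a PPU dot clock of master/4; the latch delay itself is not modelled.

```python
NTSC_MASTER_HZ = 21_477_272   # NTSC master clock, about 21.477 MHz
PPU_DIVIDER = 4               # NTSC PPU dot clock = master clock / 4

pixel_ns = 1e9 * PPU_DIVIDER / NTSC_MASTER_HZ   # duration of one PPU pixel
half_pixel_ns = pixel_ns / 2                    # duration of +ALE high

print(f"{pixel_ns:.1f} ns per pixel, {half_pixel_ns:.1f} ns per half pixel")
```

That comes out within a nanosecond or so of the 94ns quoted above, so a 50-100ns delay after A13 rises is on the order of the +ALE pulse itself.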

Quote:
My estimate above would need another 4 bytes of $2007 reads to trigger the latch, but that's probably fine.
That's the advantage of retaining CHR-RAM banking in combination with 4-screen layout and 8x8 attributes: you could just upload the attribute table via the CHR addresses without needing to trigger bankswitching via nametable reads.


Posted: Thu Apr 12, 2018 4:33 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6347
Location: Canada
True, but I'd be more excited just about the CHR banking itself. ;)


Posted: Wed Apr 18, 2018 5:06 pm

Joined: Wed Mar 09, 2005 9:08 am
Posts: 410
Quote:
I'd have to double check, but I'm pretty sure I had to latch bits on the rising edge for the PowerPak.


You're right, I forgot that the Everdrive I/O actually inverts a lot of the signals. I tried doing the same stuff on the PowerPak, and it indeed glitches the same way when latching on the falling edge of /PPU_OE, and is glitch-free when latching on the rising edge.

Here's the PowerPak experiment:
Attachment:
map30.v [4.11 KiB]


Quote:
I think it might be the variant where there are two nametables, switchable at runtime, but they share a single attribute table. (One attribute table specifies top 16x8 or left 8x16; the other the other half)

I don't think it's really an ok compromise, sadly. Those 8x16 or 16x8 attribute zones might work better with H/V layout of the nametables instead.


I was thinking more of the following layout in CIRAM, roughly:
- nametable#1 contains full nametable tiles, and uses bytes 768-1023 of nametable#2 for its attribute table. Its original 64-byte attribute table is unused.
- nametable#2 only has space for 512 bytes (16 rows) of nametable tiles, and uses bytes 512-767 for its attribute table. (although only bytes 512-639 would actually correspond to valid nametable tiles)

The hypothetical configuration above has some waste in nametable#2, to make the logic simpler. But it'll still be a bit more complex than the simple CHR-steal variant. The idea would be that you'd rarely use nametable#2 for much other than a non-scrolling status bar. Or alternatively, you could be lazy and limit vertical scrolling to a height of 32+16 = 48 tiles, and only do nametable updates horizontally...
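Here is a rough Python sketch of where an 8x8 attribute entry would live under that layout. The byte ranges come from the list above; the packing inside each byte (four consecutive tiles of a row per byte, 2 bits each, lowest bits first) is purely an assumption, as are the function and constant names.

```python
NT_SIZE = 1024   # each nametable occupies 1kB of the 2kB CIRAM

def attr_location(nametable, x, y):
    """Return (ciram_offset, bit_shift) of the 2-bit attribute for tile (x, y).

    nametable 0 is nametable#1 (30 rows), nametable 1 is nametable#2 (16 rows).
    """
    if nametable == 0:
        rows, base = 30, NT_SIZE + 768   # bytes 768-1023 of nametable#2
    else:
        rows, base = 16, NT_SIZE + 512   # bytes 512-767 of nametable#2
    if not (0 <= x < 32 and 0 <= y < rows):
        raise ValueError("tile outside the usable area of this nametable")
    index = y * 32 + x                   # linear tile index, same as the tile's own offset
    return base + index // 4, (index % 4) * 2   # assumed packing: 4 tiles per byte

first_nt1 = attr_location(0, 0, 0)
last_nt1 = attr_location(0, 31, 29)
last_nt2 = attr_location(1, 31, 15)
print(first_nt1, last_nt1, last_nt2)
```

Under this packing nametable#1 needs 240 attribute bytes and nametable#2 needs 128, which matches the note that only bytes 512-639 correspond to valid tiles.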

I hope to get some time to test this idea out. But on the Powerpak/Everdrive I'd have to fake it by switching address lines around, and limit CHR to 8kB CHR-RAM, because there's no way for these flash carts to override CHR address pins A9 and A8. And I should probably make a test ROM to verify it's actually working as expected.

