I would say that CHR banking is probably a waste in this case, but CHR-RAM by itself is already very versatile anyway.
Was considering how you'd need to get down to 512-byte banking before traditional contiguous CHR banks become useful again.
So I guess I'd say 1-screen but only if you can bank CHR to make use of a 32k RAM.
Sorry, not sure I understand these points about banking and 512-byte banks. But I probably didn't summarize my thoughts very clearly either.
So here's a summary of how my imagined configurations would work:
H/V mirroring with 32kB CHR: The fine X/Y attribute bits directly drive the 2-bit CHR bank page, both because that keeps the logic simple and because it amortises the wasted CHR over all four 8kB pages.
And it is not 1/8th of your available CHR... rather, each of the four nametables would use 4 tiles of CHR from each of the switchable 8kB banks. This means 4*4 = 16 tiles "stolen" from each 8kB CHR bank, which you should be able to cut in half if you add gates to take advantage of the mirroring mode.
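To put numbers on that, here's a quick sketch of the tile arithmetic. It only assumes the standard figures of 64 bytes per attribute table and 16 bytes per 2bpp tile; the function name is made up for illustration.

```python
# Rough tile budget for attribute data stored in CHR-RAM.
# Assumptions: a 64-byte attribute table per nametable in each bank,
# and 16 bytes per 8x8 2bpp tile. The helper name is hypothetical.
TILE_BYTES = 16
ATTR_TABLE_BYTES = 64
BANK_BYTES = 8 * 1024

TILES_PER_BANK = BANK_BYTES // TILE_BYTES  # tiles in one 8kB bank


def stolen_tiles_per_bank(unique_nametables: int) -> int:
    """Tiles lost in one 8kB bank to attribute data."""
    return unique_nametables * ATTR_TABLE_BYTES // TILE_BYTES


if __name__ == "__main__":
    # All four nametables mapped separately: 16 of 512 tiles per bank.
    print(stolen_tiles_per_bank(4), "of", TILES_PER_BANK)
    # Only two unique nametables (H/V mirroring exploited): 8 tiles.
    print(stolen_tiles_per_bank(2), "of", TILES_PER_BANK)
```

So the cost is 16 of 512 tiles per bank as described, or 8 of 512 if the mirroring mode is used to drop the duplicate nametables.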
The developer would also need to switch the 8kB bank when updating the attributes.
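As a rough model of which bank holds the attributes for a given tile, here's a minimal sketch. It assumes "fine X/Y attribute bits" means bit 0 of coarse X and bit 0 of coarse Y, and the ordering of those two bits in the bank number is an arbitrary choice for the example. The attribute address formula is the usual one derived from the PPU's v register; where the 64 bytes sit inside each bank is left out, since that isn't decided here.

```python
# Sketch of the bank selection during an attribute fetch.
# Assumptions (not a fixed design): bank bit 0 = coarse X bit 0,
# bank bit 1 = coarse Y bit 0. Function names are hypothetical.

def attr_fetch_address(v: int) -> int:
    """Standard PPU attribute fetch address for the 15-bit v register."""
    return 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)


def attr_bank(v: int) -> int:
    """2-bit CHR bank page driven by the low coarse X/Y bits."""
    coarse_x = v & 0x1F
    coarse_y = (v >> 5) & 0x1F
    return ((coarse_y & 1) << 1) | (coarse_x & 1)


def make_v(nametable: int, coarse_x: int, coarse_y: int) -> int:
    """Build a v value from its fields (fine Y left at 0)."""
    return (nametable << 10) | (coarse_y << 5) | coarse_x


if __name__ == "__main__":
    v = make_v(nametable=0, coarse_x=3, coarse_y=5)
    print(hex(attr_fetch_address(v)), attr_bank(v))
```

Under those assumptions, to change the attribute of the tile at column 3, row 5 you would switch to bank 3 first and then write the byte that the fetch at $23C8 gets redirected to.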
1-screen mirroring with 32kB CHR: In this case, we actually fit the extended attribute table into the second screen, whilst leaving some nametable space there as well. Because we have enough memory in CIRAM, we wouldn't have to map the extended attribute table to CHR.
The extra gates needed wouldn't be free, of course, so it might drive costs up a bit compared to the H/V mirroring option. But I'm thinking the neater solution might make it worth it. I think I'll try to sketch out how much
Another benefit is that this 1-screen solution could work the same with CHR-ROM too, whereas the two other solutions are CHR-RAM only.
4-screen: With the infiniteneslives mapper30 variant, we already sacrifice one of the four 8kB CHR banks for use as 4kB of 4-screen memory. Again, I think it makes sense to put the extra gates in to map all the attributes to 1kB of the otherwise unused 4kB left over in that bank, so the attributes don't need to steal any tiles from the remaining CHR pages.
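A quick sanity check on the 1kB figure, as a sketch. It assumes the same layout as the H/V case, i.e. four 64-byte attribute tables per nametable (one per fine X/Y combination); the variable names are just for the example.

```python
# Memory budget for the sacrificed 8kB bank in the 4-screen case.
# Assumption: 4 nametables, each with 4 attribute sub-tables of 64 bytes.
BANK_BYTES = 8 * 1024
FOUR_SCREEN_BYTES = 4 * 1024   # used as 4-screen nametable memory
ATTR_TABLE_BYTES = 64
NAMETABLES = 4
SUBTABLES_PER_NAMETABLE = 4    # one per fine X/Y combination (assumed)

ext_attr_bytes = NAMETABLES * SUBTABLES_PER_NAMETABLE * ATTR_TABLE_BYTES
spare_bytes = BANK_BYTES - FOUR_SCREEN_BYTES
leftover_bytes = spare_bytes - ext_attr_bytes

if __name__ == "__main__":
    print(ext_attr_bytes, spare_bytes, leftover_bytes)
```

That comes to 1024 bytes of attributes inside the 4096 spare bytes, with 3072 bytes still unused in that bank.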
Of course, we don't need to be limited to mapper30. I'm only using it as an example of a low-cost implementation of a popular homebrew mapper. Plus, considering the entire mapper library easily goes off into too many tangents again. Mapper30 also has a few variants already, which all make some interesting trade-offs one could build on.