First, i'd like to clarify that anything written in the OP was just to get the thread started, not a locked design proposal.
Regrettably, i won't have access to my laptop for a few days. A few mockup animations of whole screens would greatly help visualize what you could do with two or three PPU:s layered on top of each other. I'll return to that if there's at least some interest.
It also makes quoting specific subtopics a pain from the phone, so bear with me. Anyway:
InfiniteNESlives, you may very well be right that a standalone console might not be the right end form. But regarding the use of two/three layered PPU:s (which is really the core feature in my otherwise conservative proposition in the OP), i have to disagree strongly. For example, assume a scrolling, walking-left-and-right type of game. On a per level section basis, what's significantly more complicated, project-time-enlarging, or unwieldy about uploading a fixed scenery to the "background layer" PPU and letting the rest behave as usual? The amount of extra work is minimal code-side, while massively altering the visual experience. While a graphics hobbyist like me would love to see what i can do with that feature alone, the task of generating graphics can also be as simple as layering a set of graphics you would've drawn anyway, like for example in an arcade port of Moon Patrol. In this instance, parallax scrolling would actually become less time consuming to program, assuming the plain NES version wouldn't have the parallax effect and the "+" edition would.
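To make the "minimal code-side" point concrete, here's a rough sketch of what the per-frame scroll logic could look like with one scroll register per PPU. Everything in it is made up for illustration: the function name, the three speed ratios, and the assumption that each layer wraps over two side-by-side 256 pixel nametables. It's Python only because that's quick to mock up in; nothing here is a register spec.

```python
# Hypothetical sketch: per-PPU horizontal scroll values for a parallax scene.
# Assumes each layered PPU has its own scroll register (not a real spec).

NAMETABLE_SPAN = 512  # assumed: two 256-pixel nametables side by side

# Assumed speed ratios (numerator, denominator), back layer first.
LAYER_SPEEDS = [(1, 4), (1, 2), (1, 1)]


def layer_scroll_positions(camera_x, speeds=LAYER_SPEEDS, span=NAMETABLE_SPAN):
    """Return one scroll value per PPU layer for the given camera position."""
    # Integer math only, so the same idea carries over to 8/16-bit assembly.
    return [(camera_x * num // den) % span for num, den in speeds]


if __name__ == "__main__":
    for camera_x in (0, 300, 1100):
        print(camera_x, layer_scroll_positions(camera_x))
```

The back layers never need a mid-frame scroll split here; the game loop just writes three values once per frame.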
Ultimately, the choice is up to the designer/dev: Use the 2/3 PPU:s (and CPU power) to make the project more manageable/less time consuming, or make the same effort to achieve more things, or on the rare occasion, ramp up projects to make something really shimmering. We're also seeing more and more collaborations/division of labour along with tools/engines, which enables more people to participate in the homebrew scene.
Furthermore, if the layer combiner unit supported programmable PPU ordering, or even a byte's worth of steps of opacity and blend modes on top of the already present emphasis bits (which would now be on a per PPU level), a deep well of creative experimentation would be just one register write away.
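As a mockup of what such a combiner could do for a single pixel, here's a small sketch. All of it is assumption: the "normal" and "add" mode names, the 0-255 opacity byte per PPU, and the idea that a PPU outputs either an RGB triple or nothing (transparent). The emphasis bits aren't modeled at all.

```python
# Hypothetical layer combiner for one output pixel (illustration only).
# pixels[n] is PPU n's RGB output, or None where that PPU is transparent.


def blend_channel(below, above, opacity, mode):
    """Mix one 0-255 colour channel of a layer onto what is beneath it."""
    if mode == "add":
        mixed = min(255, below + above)  # additive blend, clamped
    else:
        mixed = above  # "normal": the layer simply covers what is below
    # opacity is a byte: 0 = invisible, 255 = fully applied
    return (below * (255 - opacity) + mixed * opacity) // 255


def combine(pixels, order, opacity, modes, backdrop=(0, 0, 0)):
    """Composite PPU outputs back to front in the programmed order."""
    out = backdrop
    for ppu in order:
        px = pixels[ppu]
        if px is None:
            continue  # transparent pixel: lower layers show through
        out = tuple(
            blend_channel(b, a, opacity[ppu], modes[ppu])
            for b, a in zip(out, px)
        )
    return out


if __name__ == "__main__":
    pixels = [(100, 0, 0), (0, 200, 0), None]
    print(combine(pixels, (0, 1, 2), [255, 128, 255], ["normal"] * 3))
```

Swapping the `order` tuple or changing one opacity value is the software equivalent of the "one register write" above.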
With one of the two WDC CPU:s as a base, it could also become easier to make music work. You could simply write music however you want in FamiTracker, focusing on the expression rather than the art of compromise (and on team discussions about that compromise - which may be part of the fun but add to the project scope all the same). Now there's the option to skip the art of compromise - if you want to focus on other things. Both options are there.
MMC3, by comparison, was popular due to commercial interest. It's only natural that it wouldn't have the same post-market value outside reproduction and hacks. It just doesn't offer features that are interesting or convenient enough. GTROM, on the other hand, offers more interesting and accessible features for a lower price. For example, the way its 4-screen layout and bankswitching let me write less complicated code to achieve nice looking scrolling and status bar/map features makes it much more attractive to sink my teeth into. It lets me do things i want to do without the relatively messy interface of MMC3. When planning out a project or just doodling code for the purpose of learning, i do it with either NROM, UNROM or GTROM as a template. Will it have scrolling? Then i go to GTROM directly.
I think it is important that a hypothetical console (or expansion) leaves room to grow into organically over time. But it isn't really comparable to most of the commercial-era mappers, because most of their feature sets aren't that attractive to grow into. Music chips aside.
I think Gradual Games is right that the first step should be something akin to a "fantasy console". But as for the choice of programming language, to each their own. Before reading every article and pdf i could find on 6502 assembly, all i could program was qbasic (and similar languages), terribly aged webdev, and game maker scripts. This makes me a programming novice at best, but i found the 65xx assembly terribly fun. Every little personal success is felt, somehow, directly*. Now, with a 65816 or the like, you can write the same sort of assembly, but more comfortably so - and you can also afford to make efficiency mistakes more liberally while learning/exploring the platform. Or, if you prefer C or a C-like language because that's the direction you came from, your compiler wouldn't have to be as efficient as it does for the 6502 and close derivatives. You could write a FORTH or BASIC-like compiler or interpreter just as well, as has been done for vanilla NES.
Basically, you would have more options viable to suit your programming preferences.
*Granted, i can whip up a simple NES-like single screen platformer or shoot em up in under an hour in something modern like game maker, which provides a plate to build the burger on (as does pico8 to some extent, if i've understood it right). But there's little lure in doing so in game maker. It'd most likely just be shovelware, something PC and mobile have more than enough of.
But back to the fantasy console thing. That's what this basically would be, for a long time. Meaning it would only exist as a console emulation in its first phase. As soon as you have emulation, there's nothing to stop an individual from disabling the desktop on an RPi and autostarting the emulator on power on, to approximate a console experience. More conveniently still if distributed as an image. That'd potentially gather interest from the RPi tinkerer and leisure user crowd if well presented. An actual, functioning console/full expansion release would likely be the final step and a luxury item with a small run, that's almost guaranteed, but the install base is wider than that.