So I finally managed to build an MMC1 from a set of 2 Atmel v750c chips. Those chips actually have only 10 "hidden pins" each, for a total of 20 registers per chip (I previously thought they had more - the datasheet is extremely unclear and documents the chip's possibilities very poorly). Both chips are used to 100% of their capacity: all 40 registers, all I/O pins of the 1st chip and all output pins of the 2nd chip are used, and making an MMC1 that way was barely possible. This is not yet tested on hardware, so it's likely still imperfect.
It's also possible to use a single Atmel v2500 chip, but then the chip is used to barely half of its total capacity.
How it works internally:
The shift register is designed so that it is initialized to '11110' whenever a "reset" write or a "5th" write is made. When a 5th write is made, the upper bit is '0', so the shift register knows it can reinitialize itself to the value '11110'.
Internal registers are loaded with the 4 low bits of the shift register and the D0 line directly. They are stored inverted internally, because the Atmel chip guarantees that all registers are reset to '0' on power-on if some conditions are met (monotonic voltage increase and no clock oscillation before the final voltage is reached). I hope those conditions are met on the NES, so that we can know all MMC1 registers are initialised with '11111', which means the last bank is always switched in at $c000-$ffff (and also at $8000-$bfff, but that's less important). SRAM is also disabled on power-on. I believe this is how actual MMC1 chips behave (at least the most common revisions). If it didn't behave like that, it would be difficult to have a reliable reset vector in ROM.
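To make the inverted storage concrete, here is a minimal Python sketch (a behavioural model, not the actual PLD equations). The class and function names are made up for illustration, and the bit layout used in prg_banks (PRG mode in bits 2-3 of reg0, bank number in bits 0-3 of reg3, PRG-RAM disable in bit 4 of reg3) is the commonly documented MMC1 layout rather than something stated in this post. The default of 16 banks (256 KB of PRG-ROM) is also just an assumption.

```python
class InvertedReg5:
    """5-bit register stored inverted, so cleared flip-flops read as '11111'."""

    def __init__(self):
        self.raw = 0b00000  # what the PLD flip-flops hold after power-on reset

    def load(self, value):
        self.raw = ~value & 0x1F  # invert on the way in

    @property
    def value(self):
        return ~self.raw & 0x1F  # invert again on the way out


def prg_banks(control, prg, num_banks=16):
    """Return the 16 KB banks mapped at ($8000, $C000).

    Assumes the commonly documented MMC1 layout; num_banks is hypothetical.
    """
    mode = (control >> 2) & 3
    bank = prg & 0x0F
    if mode in (0, 1):  # 32 KB switching, low bank bit ignored
        return ((bank & 0x0E) % num_banks, (bank | 1) % num_banks)
    if mode == 2:       # first bank fixed at $8000
        return (0, bank % num_banks)
    return (bank % num_banks, num_banks - 1)  # last bank fixed at $C000


reg0, reg3 = InvertedReg5(), InvertedReg5()
print(prg_banks(reg0.value, reg3.value))  # banks at power-on
print(bool(reg3.value & 0x10))            # PRG-RAM disable bit at power-on
```

With these assumptions the model prints (15, 15) and True: the last bank sits at both $8000 and $c000 and the SRAM is disabled, which is the power-on state described above.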
I did not design the shift register so that the first write after power-on counts as an actual "first" write. Instead, the shift register resets to '00000', so the 1st write after power-on is counted as a 5th write, using '0' for the 4 lower bits and D0 for the upper bit. Writing a value with D7=1 first is thus necessary in practice - I am unsure how a real MMC1 behaves. If it is required that the first write after power-on counts as the 1st write, it should be possible to do that by reversing the polarity of some bits in the shift register in a smart way - but this would be yet another headache and an additional source of bugs.
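Here is a small Python model of the shift register behaviour described above, including the power-on quirk. It is only a sketch under assumptions: the names are hypothetical, I assume the marker '0' starts in the lowest bit and is pushed towards the upper bit as each D0 is shifted in, and I assume the first bit written ends up as bit 0 of the loaded value (the MMC1 is written LSB first), with the bit order being a matter of wiring in the real chip. The register is selected with A13/A14 as on a standard MMC1.

```python
SR_INIT = 0b11110  # marker '0' in the lowest bit


def reverse4(x):
    """Reverse the order of the 4 low bits (pure wiring in hardware)."""
    return int(f"{x & 0xF:04b}"[::-1], 2)


class ShiftRegisterModel:
    def __init__(self):
        self.sr = 0b00000  # all flip-flops cleared at power-on
        self.loads = []    # (register index, 5-bit value) per completed load

    def write(self, addr, data):
        if not 0x8000 <= addr <= 0xFFFF:
            return
        if data & 0x80:                 # "reset" write
            self.sr = SR_INIT
            return
        d0 = data & 1
        if (self.sr & 0b10000) == 0:    # upper bit is '0': this is the 5th write
            value = (d0 << 4) | reverse4(self.sr)
            self.loads.append(((addr >> 13) & 3, value))
            self.sr = SR_INIT
        else:                           # writes 1 to 4: shift D0 in
            self.sr = ((self.sr << 1) | d0) & 0b11111


def write_value(model, addr, value):
    """Serial write of a 5-bit value, LSB first."""
    for i in range(5):
        model.write(addr, (value >> i) & 1)


m = ShiftRegisterModel()
m.write(0x8000, 0x01)            # very first write after power-on
print(m.loads)                   # already counted as a 5th write
m.write(0x8000, 0x80)            # reset write (harmless here)
write_value(m, 0xE000, 0b01101)
print(m.loads)
```

In this model the very first write after power-on immediately loads reg0 with D0 as the upper bit and zeros below it, which is why a write with D7=1 should come first.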
Both the internal registers and the shift register are clocked with a partially decoded signal, called Write_to_MMC1. It's enabled when $8000-$ffff is written to and when this write is not blocked because it directly follows another write. The remaining decoding for the registers is then done on the D pin. The reason the decoding is not done solely on the D pin is that there weren't enough product terms to do that for all pins - and that's also the reason the decoding is not done entirely on the clock pins. Normally the address lines should be stable when ROMSEL is active, so glitchy clocks on either the shift register or the internal registers shouldn't be possible - but that could be a potential problem.
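The split between clock decoding and D-side decoding can be sketched in Python as follows. This is a behavioural illustration only: the function names and arguments are made up, and the register select on A13/A14 is the standard MMC1 layout rather than something taken from my equations. Every register sees the same Write_to_MMC1 clock, and a register that is not selected simply reloads its own value.

```python
def write_to_mmc1(addr, is_write, prev_cycle_was_write):
    """Shared clock: a write to $8000-$FFFF not directly following another write."""
    return is_write and 0x8000 <= addr <= 0xFFFF and not prev_cycle_was_write


def d_input(current, new_value, reg_index, addr, fifth_write):
    """Remaining decoding on the D side: load only the selected register on a 5th write."""
    selected = ((addr >> 13) & 3) == reg_index
    return new_value if (selected and fifth_write) else current


def clock_registers(regs, addr, is_write, prev_cycle_was_write, new_value, fifth_write):
    """Return the register contents after one CPU cycle."""
    if not write_to_mmc1(addr, is_write, prev_cycle_was_write):
        return list(regs)  # no clock edge, nothing changes
    return [d_input(r, new_value, i, addr, fifth_write) for i, r in enumerate(regs)]


regs = [0b11111] * 4
regs = clock_registers(regs, 0xA000, True, False, 0b00101, True)
print(regs)  # only reg1 picks up the new value
```

A write that directly follows another write never produces a clock edge in this model, so neither the registers nor the shift register would see it.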
For the 750c version, the logic is cut in the following manner:
- 1st chip handles Reg0 and Reg3, the PRG-ROM switching, the PRG-RAM enable, the mirroring control and the Write_to_MMC1 clock decoding.
- 2nd chip handles Reg1 and Reg2, the CHR-ROM switching and the shift register.
Notice that the split boundaries are not very logical: some signals go back and forth between the 2 chips. For example, when writing to reg0 to set the CHR-ROM switching mode, the write logic is decoded in the 1st chip, the data passes through the shift register in the 2nd chip, is registered in reg0 on the 1st chip, and is then output to be used by the 2nd chip. But that's the only way it could be made to fit in 2 chips; splitting it along more logical lines (PRG half and CHR half) would not fit. This also implies that both chips must be present even if CHR-ROM switching is unused (typically for CHR-RAM boards).
Since the V750c chips are basically like 2 22V10 PALs tied together, it might be possible to make an MMC1 out of only 4 PAL22V10 chips and not 5 like I proposed in the original post. However, doing so would be a major headache and might split the functionality along even more illogical boundaries.
PS: To people who are going to ask me "Why didn't you add feature X, Y or whatever?" - my goal was never to create new mappers, but to replicate the existing Nintendo MMC1 exactly as it is (including its shortcomings) using a reasonable number of ordinary, orderable chips. Doing it with 74xxx logic would require way too many chips - getting it done in one or 2 chips is nice - and that without requiring any surface-mount component, regulator or level shifter.