Porting mmc5 PPU cycle counter from mister_nes
Re: Porting mmc5 PPU cycle counter from mister_nes
Ben Boldt wrote: ↑Sun Feb 21, 2021 8:28 pm I don't follow what you mean -- which bit(s) doesn't the PPU use? My memory isn't that great but I think I remember testing each and every PPU address bit A0-A13 (total 14 bits) to see if each mattered for the matching address and I found that they all did matter to the MMC5.
A13 must be 1 and A12 must be 0; not "A13 and A12 must be the same as in the previous two fetches".
Re: Porting mmc5 PPU cycle counter from mister_nes
Oh I see what you are saying now. Either it keeps enforcing A13=1, A12=0 and just compares A0-A11, or it only enforces it the first time and then has to compare all 14 bits (which would produce the same function). And you are saying that the former is more likely to be how it is actually implemented inside the chip. I totally agree with that. I think the latter is simpler to explain and probably more efficient in an emulator so we might want to go with that but I do not disagree with your point.
Re: Porting mmc5 PPU cycle counter from mister_nes
Not only more likely; that was the entire point of what krzysiobal said:
krzysiobal wrote: ↑Tue Oct 02, 2018 3:34 pm No, you are right, only $2000-$2fff can set it. I haven't checked that before with so many details.
Re: Porting mmc5 PPU cycle counter from mister_nes
I'm sorry, I still don't follow how we can know any difference beyond just likelihood. Maybe I am not understanding what you are trying to say.
I see these 2 scenarios about the 3 consecutive reads:
- PPU reads from a nametable (A13=1, A12=0), then reads from the same 14-bit address 2 more times
- PPU reads from a nametable (A13=1, A12=0), then reads from the same bottom 12 bits 2 more times with (A13=1, A12=0)
Edit:
I also can't see how what krzysiobal said indicates a choice between these two. I think we're on different wavelengths on this somehow, we are using the same words but we mean something different.
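To make the two scenarios concrete, here is a minimal Verilog sketch of both comparisons side by side. The module and signal names (`nt_match_sketch`, `last_addr`, `match_full`, `match_masked`) are made up for illustration and are not taken from the MMC5 or from any existing core. It only assumes that the address of the previous PPU read was latched somewhere.

```verilog
// Hypothetical sketch: both ways of deciding "this read repeats the
// previous nametable read". Not derived from the real chip.
module nt_match_sketch (
    input  wire [13:0] ppu_addr,     // address of the current PPU read
    input  wire [13:0] last_addr,    // address latched on the previous read
    output wire        match_full,   // scenario 1: all 14 bits repeat
    output wire        match_masked  // scenario 2: A13/A12 re-checked, A0-A11 repeat
);
    wire cur_is_nt  = (ppu_addr[13:12]  == 2'b10);  // A13=1, A12=0 now
    wire last_is_nt = (last_addr[13:12] == 2'b10);  // A13=1, A12=0 before

    // Scenario 1: the previous read was a nametable read and the
    // full 14-bit address is the same again.
    assign match_full = last_is_nt & (ppu_addr == last_addr);

    // Scenario 2: A13=1, A12=0 is enforced on every read and only the
    // bottom 12 bits are compared. Real hardware would then only need
    // to store last_addr[11:0] plus the one-bit last_is_nt flag.
    assign match_masked = last_is_nt & cur_is_nt &
                          (ppu_addr[11:0] == last_addr[11:0]);
endmodule
```

As far as I can tell the two outputs are equal for every input, which is why a test on the bus cannot tell the scenarios apart. The only difference is whether 14 or 12 address bits have to be stored.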
Re: Porting mmc5 PPU cycle counter from mister_nes
lidnariq wrote: ↑Sun Feb 21, 2021 8:22 pm What's hard about these? We know the MMC5 must keep track of the current sliver # for the scanline, so it can just return (name+pattern table #1) for the first N, (name+pattern table #2) for the next 34-N, and (pattern table #3) for the final 8 sliver fetches. The only obnoxious part is that the "stutter" that the MMC5 looks for is between background sliver fetches #2 and #3. But that doesn't matter if you are breaking the abstraction between the emulated PPU and the emulated MMC5.
3. ex-attr mode
The MMC5 has to be doing the fetch from the internal EXRAM from the same address and same time as the nametable fetch is happening. After that, the rest is straight-forward: it has to route the relevant bits from that extra 8 bits of nametable to the PPU's data bus during the attribute fetch, and to the CHR's address bus during the pattern fetches.
For ex-attr mode it needs to generate nametable dynamically and switch pattern bank dynamically (with 4K granularity of single screen bg tile). The former needs to cheat the ppu_data lines, which also needs a hardware redesign to add 16 IOs to isolate the bi-directional data bus. That has exceeded the IO usage of the qfp-100 package, so it should be the last mmc5 function supported in priority.
Re: Porting mmc5 PPU cycle counter from mister_nes
Just to be clear: I'm not in any way disagreeing with you. I just forgot about A12 being involved in the mask. And was bad at reading you mentioning A12 in this post just a few moments prior: viewtopic.php?p=265486#p265486
After that, it was confusion about definitions of "use" and "latch". We know it "uses" all the address lines. But "use" is different from "latch". The 2A03 and 2C02 dice show they mostly remembered to remove silicon that was always in a fixed state (compare visual2a03 pcm_lc0 through pcm_lc3 to higher bits pcm_lc4 through pcm_lc11), although they didn't always (pcm_a14 exists).
Re: Porting mmc5 PPU cycle counter from mister_nes
OK that makes sense lidnariq.
Sorry I hijacked your thread aquasnake, that got away from me a little.
Re: Porting mmc5 PPU cycle counter from mister_nes
aquasnake wrote: ↑Sun Feb 21, 2021 9:10 pm For ex-attr mode it needs to generate nametable dynamically and switch pattern bank dynamically (with 4K granularity of single screen bg tile). The former needs to cheat ppu_data line, which also needs to redesign the hardware, to add 16 IOs to isolate the bi-directional data bus. That has exceeded the IO usage of qfp-100 package
So it's a resource utilization problem instead of an implementation problem? Getting data out of the emulated EXRAM to the PPU when it's needed?
It looks like 8x16 sprites should work fine despite that constraint? Because there you just have to use PPUA[10..12] during the 8 sprite slivers to select the bank. Right?
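As a rough sketch of what that bank selection could look like in Verilog: this assumes 1 KiB CHR mode with 8x16 sprites already detected, and it assumes a `spr_fetch` signal that is high during the 8 sprite slivers already exists. The module name, port names and the `bank_reg_index` encoding are hypothetical. The mapping of background fetches to $5128-$512B via A11..A10 is also my assumption, not something measured here.

```verilog
// Hypothetical sketch: pick which CHR bank register drives the CHR
// address lines in 8x16 sprite mode, 1 KiB banking.
module chr_bank_index_8x16_sketch (
    input  wire         spr_fetch,      // 1 during the 8 sprite sliver fetches
    input  wire [12:10] ppu_a,          // PPU A12..A10
    output wire [3:0]   bank_reg_index  // 0-7 -> $5120-$5127, 8-11 -> $5128-$512B
);
    // Sprite fetches: A12..A10 select one of the eight sprite registers.
    // Background fetches: A11..A10 select one of the four background
    // registers (assumed; A12 is ignored for them in this sketch).
    assign bank_reg_index = spr_fetch ? {1'b0,  ppu_a[12:10]}
                                      : {2'b10, ppu_a[11:10]};
endmodule
```

Nothing in this needs the PPU data bus, only the address lines and the bank registers, so it should not run into the IO limit mentioned above.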
Re: Porting mmc5 PPU cycle counter from mister_nes
Any detail involved is welcome
Last edited by aquasnake on Tue Feb 23, 2021 10:54 pm, edited 1 time in total.
Re: Porting mmc5 PPU cycle counter from mister_nes
aquasnake wrote: ↑Sun Feb 21, 2021 9:10 pm For ex-attr mode it needs to generate nametable dynamically and switch pattern bank dynamically (with 4K granularity of single screen bg tile). The former needs to cheat ppu_data line, which also needs to redesign the hardware, to add 16 IOs to isolate the bi-directional data bus. That has exceeded the IO usage of qfp-100 package
lidnariq wrote: ↑Sun Feb 21, 2021 9:23 pm So it's a resource utilization problem instead of an implementation problem? Getting data out of the emulated EXRAM to the PPU when it's needed?
It looks like 8x16 sprites should work fine despite that constraint? Because there you just have to use PPUA[10..12] during the 8 sprite slivers to select the bank. Right?
Code: Select all
-SPR Mode-  -PRG Mode-  -CHR Mode-  -Ext RAM Mode-  -Name----------------------
8x16        4x8         8x1         ex-attr*        - Just Breed
Re: Porting mmc5 PPU cycle counter from mister_nes
Ben Boldt wrote: ↑Sun Feb 21, 2021 8:41 pm Oh I see what you are saying now. Either it keeps enforcing A13=1, A12=0 and just compares A0-A11, or it only enforces it the first time and then has to compare all 14 bits (which would produce the same function). And you are saying that the former is more likely to be how it is actually implemented inside the chip. I totally agree with that. I think the latter is simpler to explain and probably more efficient in an emulator so we might want to go with that but I do not disagree with your point.
I think it is enough to detect 3 consecutive read operations (including the 2 dummy reads) with ppu_addr[13:12] == 2'b10 to mark the start of a new scanline. It is not necessary to monitor all PPU address lines.
Re: Porting mmc5 PPU cycle counter from mister_nes
krzysiobal wrote: ↑Sun Nov 25, 2018 4:58 pm Ok, I did further research into how the MMC5 detects 8x8/8x16 sprites and on which scanline cycles it uses $5120-$5127 and on which $5128-$512B. For this, I made a special test case for KrzysioKazzo that simulates CPU/PPU cycles (after each PPU cycle there is a CPU cycle, so the MMC5 never thinks that a reset state occurred)
I set it into CHR 8x1k mode, write $FF to $5120-$5127 (sprites) and $00 to $5128-$512B (tiles) and observe CHR-A10, so that it is easy to distinguish when it uses sprite/tile banks.
1. MMC5 switches to 8x16 mode when
* $2000.5 is written with 1
AND
* at least one of those bits ($2001.3 or $2001.4) is written with 1
MMC5 sniffs writes to $2000/$2001 (it only checks for those two addresses, no mirrors are taken into account).
2. I will count the PPU reads in every scanline as #1, #2, #3, .. #170 (so that cycle 0->idle, cycles 1-2 -> #1, cycles 3-4 -> #2, .., 339-340 -> #170)
3. During the pre-render scanline it never uses sprites banks (logical, cause it needs three consecutive reads from $2000-$2fff to detect scanline)
4. During the scanline 0, it uses sprite banks for reads: #2, #3, #4 and #130-#161
For further scanlines - only #130-#161
Note that CHR-A10 changes on both edges of PPU_!RD, which might reveal how the MMC5 PPU cycle counter is implemented (if I did that in VHDL, everything would change on the falling edge)
5. The counter does not only count edges of PPU_!RD; it also looks at the addresses. So if the PPU fetches from $0000, $0001, $0002, $0003, etc., it will not work. There must be some more logic underneath that.
6. The counter won't count passing scanlines - if the frame has even 400 scanlines, the above logic will work for all of them.
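Point 1 of the list above could be sketched in Verilog roughly as follows. The module and signal names are hypothetical. It reads the two "at least one of those bits" as bits 3 and 4 of $2001 (the background and sprite enable bits), since the chip is said to sniff writes to both $2000 and $2001; that reading is an assumption. Sampling on the falling edge of M2 is also an assumption.

```verilog
// Hypothetical sketch of 8x16 sprite mode detection by sniffing CPU
// writes to $2000/$2001 (exact addresses only, no mirrors).
module spr_size_sniff_sketch (
    input  wire        m2,        // CPU M2 clock
    input  wire        cpu_rw,    // 1 = read, 0 = write
    input  wire [15:0] cpu_addr,
    input  wire [7:0]  cpu_data,
    output wire        spr_8x16   // 1 when 8x16 mode is considered active
);
    reg ctrl_bit5  = 1'b0;  // last value written to $2000 bit 5
    reg render_any = 1'b0;  // OR of the last values written to $2001 bits 3 and 4

    // Capture write data at the end of the CPU cycle (assumed timing).
    always @(negedge m2) begin
        if (!cpu_rw) begin
            if (cpu_addr == 16'h2000)
                ctrl_bit5 <= cpu_data[5];
            if (cpu_addr == 16'h2001)
                render_any <= cpu_data[3] | cpu_data[4];
        end
    end

    assign spr_8x16 = ctrl_bit5 & render_any;
endmodule
```

Because the compare is on the full 16-bit address, a write to a mirror such as $2008 or $3FF0 would not change the sniffed state, which matches the "no mirrors" observation.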
Guest wrote: ↑Sun Oct 16, 2005 9:25 am I think the way the MMC5 detects a new scanline is by looking for three consecutive nametable fetches. This only happens once per scanline, with the third fetch coming at PPU cycle 1 of a new scanline (numbering cycles from 0-340). I'm thinking that when the MMC5 sees three straight NT fetches, it checks the in-frame flag and, if clear, sets it and clears the scanline counter. If the in-frame flag is set, the scanline counter would be incremented and the IRQ flag set if the value matches what was written in $5203. The in-frame flag remains set until at least three PPU cycles pass without a VRAM fetch, at which point the flag is cleared. That's my theory, anyway - I'm sure there are other ways to do it.
I would be particularly interested in how the MMC5 knows if 8x8 or 8x16 sprites are in use. The only way I can think of is to monitor writes to $2000. Maybe someone can try writing to $3FF0 to try to trick it?
So here is the code:
Code: Select all
// detects a new scanline by looking for three consecutive nametable fetches
// with the third fetch coming at PPU cycle 1 of a new scanline
always @ (negedge ppu_rd)
begin
    ppu_rd_counter <= ppu_rd_counter + 1;
    if (ppu_addr_in[13:12] == 2'b10)
    begin
        if (ppu_nt_rd_counter < 3)
            ppu_nt_rd_counter <= ppu_nt_rd_counter + 1;
        else
            ppu_rd_counter <= 0;
    end else
        ppu_nt_rd_counter <= 0;
end
Code: Select all
wire [8:0] ppu_cycle = {ppu_rd_counter[7:0], 1'b1};
wire spr_fetch = (ppu_rd_counter >= 129) & (ppu_rd_counter <= 160);
Last edited by aquasnake on Mon Feb 22, 2021 7:20 pm, edited 1 time in total.
Re: Porting mmc5 PPU cycle counter from mister_nes
You could get a false positive here from CPU-triggered reads; I don't know if any games will rely on this but if so you may need to add some extra logic to protect against that.
Re: Porting mmc5 PPU cycle counter from mister_nes
Would counting M2s between PPU /RDs work?
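One way that idea could look in Verilog is sketched below. Everything in it is an assumption: the module and signal names are hypothetical, and the threshold of 3 M2 edges is a placeholder, not a measured value. The point is only that during rendering the PPU reads more often than once per CPU cycle, so a counter that restarts on every PPU /RD should stay near zero until the PPU goes idle.

```verilog
// Hypothetical sketch: flag the PPU as idle once several M2 rising
// edges pass without any PPU read.
module ppu_idle_sketch (
    input  wire m2,        // CPU M2 clock
    input  wire ppu_rd_n,  // PPU /RD, active low
    output wire ppu_idle   // 1 once IDLE_M2S M2 edges pass with no PPU read
);
    localparam [1:0] IDLE_M2S = 2'd3;  // assumed threshold

    reg [1:0] m2_count = 2'd0;

    // PPU /RD acts as an asynchronous reset for the counter; M2 rising
    // edges advance it until it saturates at IDLE_M2S.
    always @(posedge m2 or negedge ppu_rd_n) begin
        if (!ppu_rd_n)
            m2_count <= 2'd0;
        else if (m2_count != IDLE_M2S)
            m2_count <= m2_count + 2'd1;
    end

    assign ppu_idle = (m2_count == IDLE_M2S);
endmodule
```

A signal like `ppu_idle` could then be used to clear `ppu_nt_rd_counter`, so that isolated CPU-triggered reads outside rendering do not accumulate into a false scanline detection. Whether that is enough for every game is untested.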