RAMBO-1 Mapper Investigation


lidnariq
Posts: 9008
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: RAMBO-1 Mapper Investigation

Post by lidnariq » Fri Dec 06, 2019 1:10 pm

NewRisingSun wrote:
Fri Dec 06, 2019 12:27 pm
I have made a cheap-looking diagram to indicate my understanding of the timings. Is this correct?

Yes, as far as I can tell.

I mean, there are extra details about how there are actually 12 different M2 alignments here (three each at pixel quanta, which shifts from scanline to scanline, and four each subpixel), but I don't think that changes any behavior in the Bg=0 Spr=1 case.

Even the behavior of the idle pixel won't affect things here because there aren't enough M2 cycles after the sprite fetches finish before the idle pixel.

The only funny thing – true for MMC3 too – is what happens if Bg=0 Spr=0 due to the idle pixel. (i.e. under what conditions is A12 high during the idle pixel? always? PPU-internal noise? A function of PPUCTRL?)

Ben Boldt
Posts: 486
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: RAMBO-1 Mapper Investigation

Post by Ben Boldt » Fri Dec 06, 2019 1:15 pm

Remember this:
- You can send the $C001 with no fewer than 4 M2 cycles with A12 = 0 preceding it.

Could the $C001 be sent in fewer than 4 M2 cycles after the last sprite fetch? If so, my observation was that it will cause a +1 to the scanline that triggers IRQ.

NewRisingSun
Posts: 1123
Joined: Thu May 19, 2005 11:30 am

Re: RAMBO-1 Mapper Investigation

Post by NewRisingSun » Fri Dec 06, 2019 1:30 pm

"You can send the $C001 with no fewer than 4 M2 cycles with A12 = 0 preceding it." What does that mean? A game can write to $C001 at any time it wants to.

I definitely agree that the timing of the $C001 write matters to get games to "look right". In my visually hardware-accurate-seeming emulation, I distinguish the case "$C001 written to with 16 M2 cycles or fewer since the last time PA12 was detected high" (N+1 in nocash's terminology) and "$C001 written to with more than 16 M2 since the last time PA12 was detected high" (N+2 in nocash's terminology). But I don't understand why the $C001 write timing would matter.

Ben Boldt
Posts: 486
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: RAMBO-1 Mapper Investigation

Post by Ben Boldt » Fri Dec 06, 2019 1:37 pm

Sorry I didn't word that very well, yes you can obviously send that write at any time, there is nothing literally preventing you from writing to any memory location at any time...

What I meant is that if there have not been at least 4 M2 toggles with PA12 low before sending it, the NEXT time PA12 goes high, it won't detect that one. It doesn't matter how many M2 toggles with PA12 low or high come after that; it will skip detecting the next scanline. The scanline after that will then get detected as usual, resulting in one more scanline than expected (+1).

NewRisingSun
Posts: 1123
Joined: Thu May 19, 2005 11:30 am

Re: RAMBO-1 Mapper Investigation

Post by NewRisingSun » Fri Dec 06, 2019 1:42 pm

Okay, but that observation is the opposite of what we need. What we need is a replication and explanation of N+2 behavior when $C001 is written to any time except with PA12 recently having been high.

Ben Boldt
Posts: 486
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: RAMBO-1 Mapper Investigation

Post by Ben Boldt » Fri Dec 06, 2019 1:48 pm

I think it could explain it.

Always by default you get +1. If you set it to 32, per your example, it will happen every 33 scanlines. That's the normal condition. The first one in your example could be getting an extra +1 (i.e. +2) due to having been set in fewer than 4 M2s since the last sprite fetch.

NewRisingSun
Posts: 1123
Joined: Thu May 19, 2005 11:30 am

Re: RAMBO-1 Mapper Investigation

Post by NewRisingSun » Fri Dec 06, 2019 1:58 pm

The problem remains that exactly the reverse behavior is needed for the problematic games.

Ben Boldt
Posts: 486
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: RAMBO-1 Mapper Investigation

Post by Ben Boldt » Fri Dec 06, 2019 2:49 pm

I am not aware where these games are writing $C001 in relation to sprite fetches. Are you referring to information that you know about this? Could you please explain this better because I don't understand.

Actually, better yet, if you give me a step-by-step test case, I will do it and tell you the result, with scope shots if you like.

NewRisingSun
Posts: 1123
Joined: Thu May 19, 2005 11:30 am

Re: RAMBO-1 Mapper Investigation

Post by NewRisingSun » Fri Dec 06, 2019 3:01 pm

Take one sample of Skull&Crossbones, for example:

C001 @scanline =247, PPU cycle=5: 21 M2 cycles with PA12 low since PA12 was high => game wants value (C000=0) +2
C001 @scanline =18, PPU cycle=330: 3 M2 cycles with PA12 low since PA12 was high => game wants value (C000=191)+1
C001 @scanline =212, PPU cycle=52: 24 M2 cycles with PA12 low since PA12 was high => game wants value (C000=14) +2

Hard Drivin' on the other hand during menus does all of its C001 writes between PPU cycle 14 and 25, and manages to achieve a constant value of 5 M2 cycles with PA12 low since PA12 was high, and always wants value +1 instead of +2. During racing while any road objects are visible, it does its C001 writes between PPU cycle 338 and 4, with 6-8 M2 cycles with PA12 low since PA12 was high, and still always wants a value +1 instead of +2.

One sample of Klax during normal gameplay:
C001 @scanline =251, cycle=322: 1364 M2 cycles with PA12 low since PA12 was high => game wants value (125) +2
C001 @scanline =127, cycle=54: 25 M2 cycles with PA12 low since PA12 was high => game wants value (6) +2

Klax, it must be said, is complicated because it uses 8x16 sprites (with BG@$0000), making the PA12 behavior more difficult to predict. The other two games use the normal BG @$0000, SPR @$1000 configuration. It also causes more IRQs by only writing to $C000, but not to $C001, expecting these IRQs after N+1 instead of N+2 scanlines.
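As a rough cross-check, the logged samples above can be compared against the 16-M2-cycle rule proposed earlier in the thread. This is only a sketch: `expected_extra`, `struct sample` and `count_mismatches` are hypothetical names, and the rule itself (16 or fewer M2 cycles since PA12 was high gives N+1, more gives N+2) is an assumption being tested, not a confirmed description of the chip.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: extra scanlines beyond the $C000 value, ASSUMING a
 * $C001 write 16 or fewer M2 cycles after PA12 was last high gives N+1 and
 * a later write gives N+2. */
static int expected_extra(unsigned m2_since_pa12_high)
{
    return (m2_since_pa12_high <= 16) ? 1 : 2;
}

struct sample {
    const char *game;
    unsigned m2_since_pa12_high;  /* as logged in the post above */
    int wanted_extra;             /* what the game needs: 1 or 2 */
};

static const struct sample samples[] = {
    { "Skull & Crossbones",     21, 2 },
    { "Skull & Crossbones",      3, 1 },
    { "Skull & Crossbones",     24, 2 },
    { "Hard Drivin' (menus)",    5, 1 },
    { "Hard Drivin' (racing)",   6, 1 },
    { "Hard Drivin' (racing)",   8, 1 },
    { "Klax",                 1364, 2 },
    { "Klax",                   25, 2 },
};

/* Returns how many logged samples disagree with the assumed rule. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        int got = expected_extra(samples[i].m2_since_pa12_high);
        if (got != samples[i].wanted_extra) {
            printf("%s: %u M2 cycles, rule says +%d, game wants +%d\n",
                   samples[i].game, samples[i].m2_since_pa12_high,
                   got, samples[i].wanted_extra);
            mismatches++;
        }
    }
    return mismatches;
}
```

None of these samples sits at exactly 15 or 16 M2 cycles, so agreement here cannot pin down the precise boundary; it only shows the samples are consistent with a threshold somewhere between 8 and 21.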

Edit:
Ben Boldt wrote: Actually, better yet, if you give me a step-by-step test case, I will do it and tell you the result, with scope shots if you like.

Try the following experiment then:
  • Wait 64 M2 cycles with PA12=0
  • Write 00 to $C001, 32 to $C000, then write to $E001
  • Count scanlines until IRQ
  • Write to $E000 to acknowledge IRQ
  • Wait 6 M2 cycles with PA12=0
  • Write 00 to $C001, 32 to $C000, then write to $E001
  • Count scanlines until IRQ
Basically, do the same write sequence but the first time with a long PA12 pause, the second time immediately following a PA12 rise. By comparing 64 versus 6 M2 cycles with PA12=0 before writing to $C001, we can check the 16 M2 boundary that I think might be meaningful while avoiding the 4 M2 cycles boundary that you have found.

Ben Boldt
Posts: 486
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: RAMBO-1 Mapper Investigation

Post by Ben Boldt » Fri Dec 06, 2019 4:45 pm

Here is my test code, sorry I had already done this before seeing your complete message, so it may be a little different. The RAMBO-1 would never issue IRQ unless I continued toggling PA12 through v-blank, note the commented out portion. Is that right? Does PA12 really do that? If not, please take a look for errors in my code, I have been staring at it for a long time and not finding it.

Code:

void __attribute__((interrupt, no_auto_psv)) _T1Interrupt( void )
{
    #define TEST_SCANLINE   18
    #define TEST_M2_CYCLES  3
    #define TEST_C000_VALUE 191

    static uint16_t scanline_count = 0;
    static uint16_t m2_count = 0;
    static uint16_t cycles_since_a12_high = 0;
    static uint16_t initialize = 0;
    
    if( 0 == initialize )
    {
        DEBUG_PIN = 1;
        writeOnCpuBus(0xE000, 0x00);  // Disable/Ack IRQ.
        initialize++;
    }
    else if( 1 == initialize )
    {
        writeOnCpuBus(0xC000, TEST_C000_VALUE);  // Set IRQ counter reload value.
        initialize++;
    }
    else
    {
        cycles_since_a12_high++;

//        if( (scanline_count < 240) || (261 == scanline_count) )  // 'Normal' scanlines
        {
            if( m2_count < 192 )  // Background PT0/NT/AT fetches, PA12 = 0.
            {
                PPU_A12 = 0;
                asm(" repeat #12 \n\t nop ");  // Setup Delay
                readOnCpuBus(0x8000);  // Dummy read.
                m2_count++;
            }
            else if( m2_count < 240 )  // Sprite PT1 fetches, PA12 = 1.
            {
                PPU_A12 = 1;
                cycles_since_a12_high = 0;
                asm(" repeat #12 \n\t nop ");  // Setup Delay
                readOnCpuBus(0x8000);  // Dummy read.
                m2_count++;
            }
            else  // Remainder of scanline, Background PT0/NT/AT fetches, PA12 = 0.
            {
                PPU_A12 = 0;
                asm(" repeat #12 \n\t nop ");  // Setup Delay
                readOnCpuBus(0x8000);  // Dummy read.
                m2_count++;
            }
        }
//        else  // V-Blank scanlines.
//        {
//            PPU_A12 = 0;
//            asm(" repeat #12 \n\t nop ");  // Setup Delay
//            readOnCpuBus(0x8000);  // Dummy read.
//            m2_count++;
//        }

        // TEST CASE:
        if( TEST_SCANLINE == scanline_count )
        {
            if( TEST_M2_CYCLES == cycles_since_a12_high )
            {
                DEBUG_PIN = 0;
                asm(" repeat #12 \n\t nop ");  // Setup Delay
                writeOnCpuBus(0xC001, 0x00);  // Set IRQ mode to scanline counter mode, reset counter.
                asm(" repeat #12 \n\t nop ");  // Setup Delay
                writeOnCpuBus(0xE001, 0x00);  // Enable IRQ.
                m2_count += 2;
                cycles_since_a12_high += 2;
            }
        }

        // Check for end of scanline, wrapping to next scanline,
        // and also wrapping to next frame after last scanline.
        if( m2_count >= 256 )
        {
            m2_count -= 256;
            scanline_count++;
            if( scanline_count > 261 )
            {
                scanline_count = 0;  // Go to next frame.
            }
        }
    }
    IFS0bits.T1IF = 0;  // Clear Timer1 interrupt flag.
}

Anyway, this test code definitely replicates the 3 test cases:
C001 @scanline =247, PPU cycle=5: 21 M2 cycles with PA12 low since PA12 was high => game wants value (C000=0) +2
C001 @scanline =18, PPU cycle=330: 3 M2 cycles with PA12 low since PA12 was high => game wants value (C000=191)+1
C001 @scanline =212, PPU cycle=52: 24 M2 cycles with PA12 low since PA12 was high => game wants value (C000=14) +2
I find that in all 3 cases, as I moved the "M2 cycles with PA12 low since PA12 was high" value around, the result was always +1 with values 0-15 and always +2 with values of 16 or more. The threshold was 16 in all 3 cases. I have no idea why this test is different now, whereas before I was seeing +2 with values 0-3 and +1 with 4+. It would be good to dig in and figure that out; there is clearly something more going on here to explain that.

I ran the test at power-on and triggered my scope on it. This scope shot is the 247/21/0 example, showing +2.
[Attachment: tek00034.png]

Edit:

Just to clarify, the 2nd test with $C000 value 191, I did not count the scanlines, but I did see that 3 vs 15 M2 cycles was the same and there was 1 more with 16 M2s. The other 2 test cases I did count them.


Edit 2:

Here is 247/15/0 for comparison, showing +1.
[Attachment: tek00035.png]

NewRisingSun
Posts: 1123
Joined: Thu May 19, 2005 11:30 am

Re: RAMBO-1 Mapper Investigation

Post by NewRisingSun » Sat Dec 07, 2019 12:13 am

I have trouble understanding the code. What is "m2_count"? "m2_count -= 256" --- there are neither 256 pixels (that would be 341) nor 256 M2 cycles (that would be 113+2/3) per scanline?

Ben Boldt wrote: The RAMBO-1 would never issue IRQ unless I continued toggling PA12 through v-blank, note the commented out portion. Is that right?

The PPU itself never toggles PA12 throughout vertical blanking, though the game may write to NTRAM and palette RAM which will cause a manual toggle of PA12. But that does not matter, as no game sets up a scanline counter sequence before vertical blanking and expects it to continue after vertical blanking. They will do their NTRAM and palette writes in vertical blanking, then set up the scanline counter sequence at the end of vertical blanking, expecting the counter to only be clocked starting with the pre-render scanline.

I understand your results as the following:
  • If $C001 is written to within 16 M2 cycles since the last time PA12 was high, the next high PA12 will cause the internal counter to be reloaded, and all subsequent high PA12 (after 16 M2 cycles have passed) will cause the counter to decrease/increase, yielding a scanline count of N+1.
  • If $C001 is written to not within 16 M2 cycles since the last time PA12 was high, the next high PA12 will be ignored, then the next high PA12 will cause the internal counter to be reloaded, and all subsequent high PA12 (after 16 M2 cycles have passed) will cause the counter to decrease/increase, yielding a scanline count of N+2.
That matches what the games are expecting. What puzzles me is what the $C001 write does within the chip that would cause the next PA12 being high to be ignored.

Another thing to check for is the behavior when a value of 0 is written to $C000. Hard Drivin' does that all the time on the race track while other objects are visible in order to get an IRQ every scanline. Please try:
  • Wait 64 M2 cycles with PA12=0
  • Write 00 to $C001, 0 to $C000, to $E001
  • Count scanlines until IRQ (result 1)
  • Write to $E000 to acknowledge IRQ, then immediately to $E001 to re-enable it without changing the reload value
  • Count scanlines until IRQ (result 2)
  • Wait 6 M2 cycles with PA12=0
  • Write 00 to $C001, 32 to $C000, to $E001
  • Count scanlines until IRQ (result 3)
  • Write to $E000 to acknowledge IRQ, then immediately to $E001 to re-enable it without changing the reload value
  • Count scanlines until IRQ (result 4)
According to nocash's old documentation, you should get no IRQ at all at results 2 and 4.
Ben Boldt wrote: I have no idea why this test is different now, where I was seeing +2 with values 0-3 and +1 with 4+.

Are you sure you did not accidentally use the M2 counter instead of the scanline counter before?

Edit:
The only explanation that I could come up with for the scanline count being N+2 instead of N+1 when $C001 is written outside the 16-M2-cycle interval is that writing to $C001 does not directly set the "reload" flag as it did on the MMC3, but instead sets a second flag, "forceReload". "forceReload" causes the actual "reload" flag to be set at the moment the sixteenth M2 cycle with PA12 low is encountered and the PA12->Counter Clock is no longer inhibited. When $C001 is written within the 16-M2-cycle interval, forceReload->reload happens within the next 16 M2 cycles, so the counter is reloaded during the next cycle 260 or so, hence N+1. When $C001 is written outside the 16-M2-cycle interval, no reload happens at the next sprite fetch, because the reload flag is only set at the end of the 16-M2 interval begun by that sprite fetch; the counter is then reloaded at cycle 260 of the following scanline, and counting only begins on the scanline after that, hence N+2.
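The forceReload idea can be sketched as a small M2-level model. Everything here is an assumption made for illustration: the names (`irq_model`, `m2_tick`, `scanlines_until_irq`), the scanline shape (94 M2 cycles with PA12 low followed by one solid 20-cycle high period instead of the real per-sprite toggling), and in particular the choice that a counter clock arriving while forceReload is still pending does nothing. It is not a description of what the RAMBO-1 has been shown to do.

```c
#include <stdbool.h>

#define FILTER_LEN        16  /* M2 cycles with PA12 low before the filter re-arms */
#define LOW_PER_SCANLINE  94  /* assumption: M2 cycles per scanline with PA12 low */
#define HIGH_PER_SCANLINE 20  /* assumption: sprite fetches as one solid high period */

struct irq_model {
    unsigned latch;       /* value written to $C000 */
    unsigned counter;
    unsigned low_cycles;  /* M2 cycles with PA12 low since it was high, saturating */
    bool force_reload;    /* set by the $C001 write */
    bool reload;          /* consumed by the next counter clock */
    bool irq;
};

static void clock_counter(struct irq_model *s)
{
    if (s->force_reload)
        return;  /* assumption: a clock while forceReload is pending does nothing */
    if (s->reload) {
        s->counter = s->latch;
        s->reload = false;
    } else if (s->counter > 0) {
        s->counter--;
    }
    if (s->counter == 0)
        s->irq = true;
}

static void m2_tick(struct irq_model *s, bool pa12)
{
    if (pa12) {
        if (s->low_cycles >= FILTER_LEN)
            clock_counter(s);         /* filtered PA12 rise */
        s->low_cycles = 0;
    } else if (s->low_cycles < FILTER_LEN) {
        s->low_cycles++;
        if (s->low_cycles == FILTER_LEN && s->force_reload) {
            s->force_reload = false;  /* forceReload -> reload on the 16th low cycle */
            s->reload = true;
        }
    }
}

/* Start just after PA12 was high, hold PA12 low for write_delay M2 cycles,
 * write $C001/$C000, then run scanlines. Returns how many PA12 rises pass
 * until the IRQ flag is set, or -1 if it never is. */
static int scanlines_until_irq(unsigned latch, unsigned write_delay)
{
    struct irq_model s = {0};
    for (unsigned i = 0; i < write_delay; i++)
        m2_tick(&s, false);
    s.latch = latch;
    s.force_reload = true;
    for (int line = 1; line <= 1000; line++) {
        for (int i = 0; i < LOW_PER_SCANLINE; i++)
            m2_tick(&s, false);
        for (int i = 0; i < HIGH_PER_SCANLINE; i++)
            m2_tick(&s, true);
        if (s.irq)
            return line;
    }
    return -1;
}
```

Under these assumptions, `scanlines_until_irq(32, 6)` gives 33 and `scanlines_until_irq(32, 64)` gives 34, which are the N+1 and N+2 cases of the 6-versus-64 experiment, and the boundary falls between a delay of 15 and 16 M2 cycles.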

Edit 2:
Yes, that seems to work almost nicely in emulation. "Almost", because I still have some source of jitter that I don't fully understand when compared to my previous "higher-level" emulation that just uses a normal MMC3-like reload flag, but just adds one to the reload value if the $C001 occurs during the 16-M2-cycle PA12 filter ("reloadExtra = pa12Filter ? 0 : 1;"). The jitter seems to be related to whether the M2 prescaler and the PA12 filter are halted while the other counting mode is selected, when exactly the prescaler is reset and to what value (0 or 3), and things like that. Skull&Crossbones is particularly sensitive, because it constantly switches between the M2 and PA12 counting modes. Investigating those minutiae will be quite a chore.
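For comparison, the "higher-level" approach might look roughly like the following. This is a guess at the shape of such code, not the actual emulator source: the struct, the function names, and the MMC3-style clocking are assumptions, and only the `reloadExtra` expression comes from the post.

```c
#include <stdbool.h>

struct irq_counter {
    unsigned latch;         /* value written to $C000 */
    unsigned counter;
    unsigned reload_extra;  /* 0 or 1, decided at the $C001 write */
    bool reload;            /* MMC3-like reload flag */
};

/* $C001 write: request a reload and remember whether the PA12 filter
 * (the 16-M2-cycle window after PA12 was high) was still running. */
static void write_c001(struct irq_counter *s, bool pa12_filter_active)
{
    s->reload = true;
    s->reload_extra = pa12_filter_active ? 0 : 1;
}

/* One filtered PA12 rise; returns true when the counter reaches zero. */
static bool clock_counter(struct irq_counter *s)
{
    if (s->reload) {
        s->counter = s->latch + s->reload_extra;
        s->reload = false;
    } else if (s->counter > 0) {
        s->counter--;
    }
    return s->counter == 0;
}

/* Hypothetical test helper: number of counter clocks from a $C001 write
 * until the counter first reaches zero. */
static int clocks_until_irq(unsigned latch, bool pa12_filter_active)
{
    struct irq_counter s = {0};
    s.latch = latch;
    write_c001(&s, pa12_filter_active);
    int clocks = 1;
    while (!clock_counter(&s))
        clocks++;
    return clocks;
}
```

With a latch of 32 this yields 33 clocks when the write lands inside the filter window and 34 when it lands outside, so it produces the same N+1/N+2 counts without modeling when the reload flag is actually set.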

Ben Boldt
Posts: 486
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: RAMBO-1 Mapper Investigation

Post by Ben Boldt » Sat Dec 07, 2019 12:10 pm

I think I did
256 PPU cycles * (3/4) = 192 M2s
320 PPU * (3/4) = 240 M2s
340 PPU * (3/4) = 255 M2s

My bad, I should have used 2/3. Thanks for finding that. What is really bizarre though, even with that error, I should have gotten an IRQ eventually at the wrong spot without the V-Blank toggling PA12, since this code sets everything up for a 1-shot and doesn't ever attempt to ack, stop, or restart the counter/IRQ on subsequent frames. Nothing we know would prevent the IRQ in that way.
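For reference, on NTSC there are three PPU dots per M2 cycle, which is where the 113+2/3 M2 cycles per 341-dot scanline figure quoted above comes from. A minimal sketch of the conversion (the helper name `dots_to_m2` is made up here, and it only counts whole M2 cycles):

```c
#define DOTS_PER_M2        3    /* NTSC: three PPU dots per M2 (CPU) cycle */
#define DOTS_PER_SCANLINE  341

/* Whole M2 cycles that fit in the given number of PPU dots. */
static unsigned dots_to_m2(unsigned dots)
{
    return dots / DOTS_PER_M2;
}

/* PPU dots left over after the last whole M2 cycle. */
static unsigned dots_remainder(unsigned dots)
{
    return dots % DOTS_PER_M2;
}
```

With this ratio, 256 dots come to 85 whole M2 cycles, 320 dots to 106, and a full 341-dot scanline to 113 with 2 dots left over, rather than the 192, 240 and 255 used in the test code.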

I think I will probably make the trek into the lab today and run your test and poke at things some more. As far as 'did I goof up a previous test', not unlikely. Will have to revisit it.

Ben Boldt
Posts: 486
Joined: Tue Mar 22, 2016 8:27 pm
Location: Minnesota, USA

Re: RAMBO-1 Mapper Investigation

Post by Ben Boldt » Sat Dec 07, 2019 4:07 pm

Okay, I think I know why it didn't work when I didn't toggle PA12 during the V-Blank. Either I am not interpreting this right or your test condition doesn't make sense. Let's look at this one:

C001 @scanline =247, PPU cycle=5: 21 M2 cycles with PA12 low since PA12 was high => game wants value (C000=0) +2

What my test code does is keep a running count of M2 cycles since PA12 was last high during the M2 cycle, variable "cycles_since_a12_high". So, for the test condition, it waits for scanline 247, then when cycles_since_a12_high == 21, it writes $00 to $C001 at that time, followed by $00 to $E001. BUT, since scanline 247 is in the middle of V-blank, the test code has already accumulated 7 scanlines worth of M2 cycles where PA12 was never high, somewhere on the order of 1500-1600. I won't hit the number 21 that way, so it won't send $C001/$E001, IRQ never gets turned on, so it never goes low, problem identified.

What do you mean by 21 -- is that 21 since the beginning of the scanline? I think an easy quick fix is to set my cycles_since_a12_high = 0 at the beginning of each scanline. Let me know if this sounds right or if you meant something else with 21, for example, if I should be sending PA12=1 for the idle cycle or something, per what lidnariq was talking about.

NewRisingSun
Posts: 1123
Joined: Thu May 19, 2005 11:30 am

Re: RAMBO-1 Mapper Investigation

Post by NewRisingSun » Sat Dec 07, 2019 4:12 pm

Ben Boldt wrote: What do you mean by 21 -- is that 21 since the beginning of the scanline?

No, it means 21 M2 cycles since the last time that PA12 was high, whatever the reason for that was. With that particular write, the game just wants an IRQ to occur at around pixel 288 of scanline 0.

tepples
Posts: 21839
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: RAMBO-1 Mapper Investigation

Post by tepples » Sat Dec 07, 2019 4:19 pm

On NTSC and Dendy, 21 M2 cycles happen to be 63 dots, or close to the entire duration of sprite fetch time.
