All times are UTC - 7 hours





PostPosted: Mon Sep 10, 2018 4:37 pm 
Joined: Sun Jun 12, 2011 12:06 pm
Posts: 375
Location: Poland
I am in the process of fixing a blob-Rinco famiclone. The console works fine, but there is a problem with joypad 1: even with no pad connected, every game thinks that all buttons are pressed at once.
Image

1. Unfortunately, the strobe and clock lines are dead, and shorting joypad-data to VCC/GND does not stop games from detecting all buttons as pressed.
2. I tried the old trick of forcing 0/1 onto the data bus during a read cycle from $4016, but that does not change anything either. So this clone probably has its internal joypad port separated from the CPU bus.

Normally I would throw it in the garbage, but it has a unique PAL/NTSC region switch feature, so..

I tried my recent code-injection trick: http://forums.nesdev.com/viewtopic.php? ... 15#p224858

but this time, after a read cycle from $4016 happens, I inject code that loads the value of the joypad-data line into the accumulator (and I also generate the clock/strobe signals). The FPGA reads this value from the 'new' joypad port.

Code:
RnW ADDRESS     DATA        STATE     ;fix broken joypad
-------------------------------------------------------
1   xxxx        *           IDLE
1   $4016       *           IDLE
1   xxxx + 1    $A9         LDA1_1    ;simulate LDA #V, where V=%0100000d
1   xxxx + 2    V           LDA1_2    ;(d = current value of joypad data line; the 1 in bit 6 mimics open bus behaviour for some games)
1   xxxx + 3    $4C         JMP_1     ;jmp xxxx+1
1   xxxx + 4    LO(xxxx+1)  JMP_2
1   xxxx + 5    HI(xxxx+1)  JMP_3
1   xxxx + 1    *           IDLE
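
To make the table above easier to follow, here is a rough behavioural model in Python. It is only a sketch, not the actual FPGA code: the function name `run_injector` and the list-based bus trace are made up for illustration, and it assumes exactly one bus cycle per state (no stalled cycles).

```python
def run_injector(bus_reads, joy_bit):
    """Model of the cycle-stepped injector from the table above.

    bus_reads: CPU read addresses, one per cycle (hypothetical trace format).
    joy_bit:   value d that ends up in bit 0 of the injected operand.
    Returns a list of (address, byte); byte is None when the cartridge
    (not the injector) drives the data bus.
    """
    out = []
    step = None        # None means IDLE, 0..4 index the injected bytes
    prev_addr = None   # xxxx: address seen just before the $4016 read
    resume = 0
    for addr in bus_reads:
        if step is None:
            out.append((addr, None))
            if addr == 0x4016 and prev_addr is not None:
                resume = (prev_addr + 1) & 0xFFFF   # jmp target: xxxx + 1
                step = 0
            else:
                prev_addr = addr
        else:
            injected = [0xA9, 0x40 | joy_bit, 0x4C, resume & 0xFF, resume >> 8]
            out.append((addr, injected[step]))
            step += 1
            if step == len(injected):
                step = None   # back to IDLE after JMP_3
    return out
```

With a hypothetical `LDA $4016` at $8010, the trace `[0x8010, 0x8011, 0x8012, 0x4016, 0x8013, 0x8014, 0x8015, 0x8016, 0x8017, 0x8013]` gets the bytes $A9, V, $4C, $13, $80 injected on the five cycles after the $4016 read, and the final fetch from $8013 comes from the cartridge again.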


I checked a few games, and all of them read $4016 into the accumulator and not into any other register. All this code is executed from ROM, not RAM, so it should work. And it works! I managed to get the joypad port working. I tested some NROM/MMC1/UNROM games, and all of them work.
However, MMC3 games (for example Doki Doki Yuuenchi, SMB3) hang about a second after start.

So I thought:
1. Does adding those cycles really make a difference to the timing of the code? -> No. I modified the ROM in an emulator, changing the lda $4016 into a jump to some free region + a few nops + a jump back, and the game worked.

2. Maybe an IRQ is triggered during the opcode injection, while the cartridge ROM is disabled (MMC3 is the only mapper of the above that uses IRQs). But looking at the scope, at the moment when things start going wrong, there is no IRQ pending.
(I checked this on a second, working console too, and it is exactly the same. I also checked with both a true hardware MMC3 cartridge and a flash cartridge.)

The Doki Doki joypad read routine looks like:
Image

But looking at the waveforms, there are tens of correct sequences of 8 reads from $4016, and then suddenly, during some of the injected code (here, during the fifth read from $4016), the CPU address bus stays the same for three cycles! Even if the CPU read invalid data and treated it as an opcode, it wouldn't stay there for 3 cycles! Does anyone have an idea what I am missing? Maybe some weird DMA/APU thing is taking place at this moment?
Image


PostPosted: Mon Sep 10, 2018 5:05 pm 
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20556
Location: NE Indiana, USA (NTSC)
That pattern of three reads in a row from $4016 sounds like the APU DMC DMA glitch. See "APU DMC" section "Conflict with controller and PPU read" on the wiki. If you also get freezes in Contra or Kung Fu or Hello Kitty World or LAN Master or other games that use a discrete mapper and sampled audio, or games like The Legend of Zelda or Dr. Mario that use MMC1 and sampled audio, then it's probably that.

Workaround: If you see two accesses to $4016 within 8 CPU cycles or so, ignore the later one.
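
As a minimal sketch of that workaround, written in Python for readability rather than as HDL: the function name `filter_joypad_reads`, the `(cycle, address)` trace format and the default window of 8 cycles are assumptions for illustration only.

```python
def filter_joypad_reads(accesses, window=8):
    """Drop $4016/$4017 reads that follow an accepted read of the same
    register by fewer than `window` CPU cycles.

    accesses: iterable of (cycle, address) read accesses (hypothetical format).
    Returns the accesses that should be treated as real reads.
    """
    accepted = []
    last_accepted = {}   # register address -> cycle of the last accepted read
    for cycle, addr in accesses:
        if addr in (0x4016, 0x4017):
            prev = last_accepted.get(addr)
            if prev is not None and cycle - prev < window:
                continue  # assumed to be a repeat caused by the DMC DMA glitch
            last_accepted[addr] = cycle
        accepted.append((cycle, addr))
    return accepted
```

For example, three back-to-back reads of $4016 at cycles 100, 101 and 102 collapse into the single read at cycle 100, while a read at cycle 120 is accepted as a new one.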


PostPosted: Mon Sep 10, 2018 5:44 pm 
Joined: Sun Jun 12, 2011 12:06 pm
Posts: 375
Location: Poland
Yes! Hello Kitty World freezes too, thanks!

I think the problem is harder than I thought, because this APU cycle stealing does not only affect reads from $4016; it can happen at any time, during a read of any memory cell.

This makes it hard to determine how the CPU is behaving. If it happens during the injection process (for example, when I am injecting the $A9 byte), I cannot go to the next state on the next cycle, because on the next clock cycle the CPU address will be the same and I still have to feed the $A9.

So I need to watch the addresses on the CPU bus and switch to the next state only when the address is incremented.


PostPosted: Fri Sep 14, 2018 10:11 am 
Joined: Sun Jun 12, 2011 12:06 pm
Posts: 375
Location: Poland
Ha, I did it! I thought overnight about how anything can work at all if the CPU halts randomly. But from the memory's perspective, it does not matter whether the executed sequence is xxxx, xxxx + 1, xxxx + 2 or xxxx, xxxx, xxxx, xxxx + 1, xxxx + 1. It just returns data from the cell that the program counter asks for!

So, abandoning the state-machine approach and any attempt to predict how many CPU cycles the whole sequence takes, the FPGA must behave exactly like memory: just return data from the appropriate cell. So now there are only 2 states: IDLE (normal mode) and INJECTION (during injection).

However, there are a few quirks:
* How to know when to start data injection: a read cycle from $4016 is a good point to switch to INJECTION (even if there are two or more such reads in a row, it does not matter, we are already in the INJECTION state)
* Which memory cells to return? They are not fixed, because LDA $4016 is at a different location in every game, so afterwards the CPU will try to fetch from yyyy, yyyy + 1, yyyy + 2 etc. Subtracting xxxx from those addresses (where xxxx is the program counter address just before the read cycle from $4016) gives a fixed offset that selects which byte to return
* How to know when to end injection: if we force the CPU to make a bogus write cycle, the presence of this write cycle is a good point to switch back to IDLE (because write cycles cannot be repeated or blocked by APU DMC). So now the table should look like:

Code:
RnW ADDRESS     DATA        state    comment
-------------------------------------------------------------------------------------------------------------------------
1   xxxx        *           IDLE     remember this address
1   $4016       *           IDLE     read from $4016 - switch to inject mode (at end of this cycle)
1   xxxx + 1    $A9         INJECT   simulate LDA #V, where V=%0100000d
1   xxxx + 2    V           INJECT   (d-current value of joypad data line, 1 to mimic open bus behaviour for some games)
1   xxxx + 3    $4C         INJECT   jmp xxxx-2
1   xxxx + 4    LO(xxxx-2)  INJECT               
1   xxxx + 5    HI(xxxx-2)  INJECT   
1   xxxx - 2    $8D         INJECT   simulate sta $8000     
1   xxxx - 1    $00         INJECT   
1   xxxx        $80         INJECT   
0   $8000       *           INJECT   when any write cycle appears - switch to idle mode (at end of this cycle)
1   xxxx + 1    *           IDLE     now we come back to execution of normal code
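
A rough Python model of the two-state idea in the table above may make it easier to see why repeated bus cycles no longer matter. It is only a sketch and not the CPLD code: `make_injector` and `cycle` are made-up names, and like the table it simply answers every read in INJECT mode according to the low 3 bits of (address - xxxx).

```python
def make_injector(joy_bit):
    """Return a per-cycle function modelling the IDLE/INJECT behaviour.

    joy_bit: value d placed in bit 0 of the injected LDA operand.
    """
    state = {"mode": "IDLE", "last": None}   # last = xxxx

    def cycle(rw, addr):
        """rw: 1 = read, 0 = write. Returns the injected byte, or None
        when the injector leaves the data bus alone."""
        data = None
        if state["mode"] == "INJECT" and rw == 1:
            target = (state["last"] - 2) & 0xFFFF        # jmp xxxx-2
            table = {1: 0xA9, 2: 0x40 | joy_bit, 3: 0x4C,
                     4: target & 0xFF, 5: target >> 8,
                     6: 0x8D, 7: 0x00, 0: 0x80}          # 6,7,0 = sta $8000
            data = table[(addr - state["last"]) & 7]
        # State changes take effect at the end of the cycle.
        if state["mode"] == "IDLE":
            in_rom = state["last"] is not None and state["last"] >= 0x8000
            if rw == 1 and addr in (0x4016, 0x4017) and in_rom:
                state["mode"] = "INJECT"
            else:
                state["last"] = addr
        elif rw == 0:
            state["mode"] = "IDLE"                        # bogus write seen
        return data

    return cycle
```

Feeding it a hypothetical trace in which the $4016 read and the fetch from xxxx + 1 are both repeated still yields $A9 on every cycle that addresses xxxx + 1, which is the property that makes the stalled cycles harmless. Reads of $4016 that repeat while already in INJECT mode also get a table byte in this model, so the usage should only look at the bytes returned for ROM addresses.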


I implemented it, and it worked even in the games that use APU DMC! But I had weird hangs in some games. The reason was that the
Code:
CART_ROMSEL <= CPU_ROMSEL when current_state = IDLE else '1'

was causing a short pulse on the ROMSEL line right at the INJECTION->IDLE transition, before the next opcode fetch. Changing it to
Code:
CART_ROMSEL <= CPU_ROMSEL when current_state = IDLE and CPU_M2 = '1' else '1'

solved it entirely!

I used an EPM3064 CPLD: the code took barely 47 macrocells (out of 64), and this chip has 5V-tolerant inputs. I designed a small PCB, soldered it underneath the original cartridge slot (this model had 2) and cut one trace.
Image Image Image

Unfortunately, I realised that not only the joypad 1 port signals were missing, but those for joypad 2 too! And all 30 pins of this CPLD were already in use, so I was unable to generate signals for JOY2-CLK / JOY2-D0.

I thought of 3 approaches:
* Try to use its JTAG pins too (that would give 4 additional pins). It is possible to program the chip so that the JTAG pins become regular I/Os afterwards, but then future reprogramming is not possible without a high-voltage programmer! I did find that one guy confirmed that applying +12V to the OE pin and programming with a regular programmer works. Unfortunately, the OE pin was connected to one of the address pins, so applying 12V to it would be dangerous, so maybe..

* Are all connected pins necessary? A0..14, CPU_ROMSEL, CPU_R/W, M2, CART_ROMSEL, D0..D7, JOY-STROBE, JOY-CLOCK, JOY-D gives 30, but are all address lines necessary?
I need to decode $4016/$4017 = 10000000001011X, so maybe just OR A13..A5 using diodes and feed that ORed sum into the FPGA? That would save a lot of pins, but what about the JMP opcode, which needs to know the precise address? I can replace it with a branch, which uses relative addressing!

* Let the CPLD just generate JOY-CLK and read JOY-D0, and use some kind of mux for J1-CLK/J2-CLK and demux for J1-D0/J2-D0. What would control those mux/demuxes? A0! This is the only bit that distinguishes $4016 from $4017, and all reading takes place in this cycle! The logic tables, with the corresponding circuits that implement them:
Code:
   IN           OUT                 IN          OUT
--------+--------------        ---------------+-----
A0  CLK | J1-CLK J2-CLK        A0  J1-D  J2-D |  D
 0    0 |   0       1           0    0     X  |  0
 0    1 |   1       1           0    1     X  |  1
 1    0 |   1       0           1    X     0  |  0
 1    1 |   1       1           1    X     1  |  1
 
                                          ______
 A0 -|>|------+                 A0   ----| NAND )o--+   ______
              |                 J2-D ----|______)   +--| NAND )o-- D
        +--R--+-- J1-CLK                  ______    +--|______)
CLK ----+                       ~A0  ----| NAND )o--+
        +--R--+-- J2-CLK        J1-D ----|______)
              |
~A0 -|>|------+
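
The two truth tables above can be checked against the circuits with a few lines of Python. This is just a logic-level sketch: the function names are made up, the diode/resistor network is modelled as an ideal OR, and the "X" entries are treated as don't-care inputs.

```python
def nand(a, b):
    """2-input NAND on 0/1 values."""
    return 1 - (a & b)

def clock_split(a0, clk):
    """Diode/resistor network: each output is CLK ORed with A0 or ~A0,
    so only the selected port sees the active-low clock pulse."""
    j1_clk = clk | a0          # active only when A0 = 0 ($4016)
    j2_clk = clk | (1 - a0)    # active only when A0 = 1 ($4017)
    return j1_clk, j2_clk

def data_select(a0, j1_d, j2_d):
    """Three NAND gates: D = (A0 and J2-D) or (~A0 and J1-D)."""
    return nand(nand(a0, j2_d), nand(1 - a0, j1_d))

# Print both tables in the same column order as above.
for a0 in (0, 1):
    for clk in (0, 1):
        print(a0, clk, clock_split(a0, clk))
for a0 in (0, 1):
    for j1_d in (0, 1):
        for j2_d in (0, 1):
            print(a0, j1_d, j2_d, data_select(a0, j1_d, j2_d))
```

With A0 = 0 the data output follows J1-D regardless of J2-D, and with A0 = 1 it follows J2-D, which matches the X entries in the right-hand table.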


I took a 7400 from the shelf and wired it up, and it works (I just needed to add 56p caps from J1-CLK/J2-CLK to ground because of some oscillations). Both ports of this console work like a charm!


Code:
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;

entity joypad_fix_epm3064 is
   port (
      cpu_a : in std_logic_vector(14 downto 0);
      cpu_d : inout std_logic_vector(7 downto 0);
      cpu_r_nw : in std_logic;
      cpu_nromsel : in std_logic;
      cart_nromsel : out std_logic;
      cpu_m2 : in std_logic;
      
      joy_data : in std_logic; --data
      joy_clk : out std_logic; --clock
      joy_strobe : out std_logic --strobe
      
      
   );
end;

architecture beh of joypad_fix_epm3064 is
   type T_DUTY_STATE is (IDLE, INJECT);

   signal duty_state : T_DUTY_STATE := IDLE;
   signal last_cpu_addr : std_logic_vector(15 downto 0);
   signal last_cpu_addr_minus2 : std_logic_vector(15 downto 0);
   
   signal dout_val : std_logic_vector(7 downto 0);
   signal joy_data_latched : std_logic;
begin
   
   cart_nromsel <= cpu_nromsel when duty_state = IDLE and cpu_m2 = '1' else
                   '1';
                  
   last_cpu_addr_minus2 <= last_cpu_addr - 2;
   
            
   cpu_d <= dout_val when cpu_r_nw = '1' and cpu_m2 = '1' and duty_state = INJECT else
            (others => 'Z');
            
   joy_strobe <= cpu_d(0) when cpu_r_nw = '0' and cpu_m2 = '1' and cpu_nromsel = '1' and cpu_a = "100000000010110"; --$4016
   joy_clk    <= '0'      when cpu_r_nw = '1' and cpu_m2 = '1' and cpu_nromsel = '1' and (cpu_a = "100000000010110" or cpu_a = "100000000010111") else '1'; --$4016/$4017
   joy_data_latched <= joy_data when cpu_r_nw = '1' and cpu_m2 = '1' and cpu_nromsel = '1' and (cpu_a = "100000000010110" or cpu_a = "100000000010111"); --$4016/$4017
   
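   -- combinational process: the sensitivity list names every signal that is read
   process (cpu_m2, duty_state, cpu_a, last_cpu_addr, last_cpu_addr_minus2, joy_data_latched) is begin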
      dout_val <= (others => '0');
      
      if cpu_m2 = '1' then
         if duty_state = INJECT then
            case cpu_a(2 downto 0) - last_cpu_addr(2 downto 0) is
            when "001" => -- xxxx + 1
               dout_val <= x"a9";
            when "010" => -- xxxx + 2
               dout_val <= "0100000" & (not joy_data_latched); --simulate open bus on upper bits
            when "011" => -- xxxx + 3
               dout_val <= x"4c";
            when "100" => -- xxxx + 4
               dout_val <= last_cpu_addr_minus2(7 downto 0);
            when "101" => -- xxxx + 5
               dout_val <= last_cpu_addr_minus2(15 downto 8);
            when "110" => -- xxxx - 2
               dout_val <= x"8d";
            when "111" => -- xxxx - 1
               dout_val <= x"00";
            when "000" => -- xxxx
               dout_val <= x"80";
            when others =>
            end case;
         end if;
      end if;
   end process;
               
   process(cpu_m2) is begin
      if falling_edge(cpu_m2) then
         case duty_state is
         when IDLE =>
            if cpu_r_nw = '1' and cpu_nromsel = '1' and last_cpu_addr(15) = '1' and (cpu_a = "100000000010110" or cpu_a = "100000000010111") then
               --only if current program execution is in $8000-$ffff, it can be altered
               duty_state <= INJECT;
            else
               last_cpu_addr <= (not cpu_nromsel) & cpu_a;
            end if;
         when INJECT =>
            if cpu_r_nw = '0' then
               duty_state <= IDLE;
            end if;
         when others =>
         end case;
      end if;
   end process;
   
end;

