3DS reverse engineering

Discussion of development of software for any "obsolete" computer or video game system.
zoogie
Posts: 10
Joined: Sat Nov 10, 2018 5:38 pm

Re: 3DS reverse engineering

Post by zoogie » Tue Feb 25, 2020 9:30 pm

nocash wrote:
Tue Feb 25, 2020 7:06 pm
I have finally brewed up some code for loading/decrypting/decompressing .code files from eMMC, the current purpose would be loading the original MCU firmware before patching. It's working okay, but I could imagine a few possible problems:

1) I am ignoring the .tmd/.cmd files, and just load "ncsd:\title\00040130\?0001f02\content\000000??.app" (the first "?" wildcard for Old3DS-vs-New3DS, and the other two "??" for the version number). I guess that should be safe, assuming that the console won't ever have more than one MCU version installed?

2) The MCU file is loaded from eMMC, whilst hacked consoles might actually load it from SD card, so there might be a version mismatch. I can compare the version numbers from registers MCU[00h..01h], and check if they are same as in the MCU file. If they don't match, then I could retry loading the file from SD card, and check if that does contain a matching version...
The thing that could crash the MCU would be a false-match between Old3DS and New3DS files with same MCU version number. I don't know if Old3DS+New3DS MCU's do ever share the same MCU version numbers (and if yes, if there's a way to distinguish them)?

3) The NCCH/.code files are originally encrypted via AES key 2Ch. But firmware 7.0.0 and up are said to additionally use AES key 25h/18h/1Bh, I don't have those newer keys implemented... and I got told that my console has firmware 9.1.0... nevertheless, the decryption does mysteriously work okay with key 2Ch.
My current theory is that key 25h/18h/1Bh are maybe used only for shop-titles, but never for pre-installed system files. Could that be right?
1. Right, installed ctrnand titles don't have tmd checked at all or cmd file's MACs checked (save a Download Play child .cia). DSiWare twlnand app (00030004) titles do have cmd MACs checked for some reason. I can't tell you if swapping out various .app files is safe or not.
2. There doesn't appear to be any version collisions between 3ds models: https://www.3dbrew.org/wiki/MCU_Service ... e_versions
Note that 3.65 is the latest for all new3ds models (the wiki is wrong about that detail).
3. Certain titles may have 0x25, 0x1b, 0x18, or 0x2c -- it really doesn't seem to follow any logical pattern. It does appear that new3ds mcu module changed from 0x2c -> 0x1b at some point (probably the new2xl update version 9216, or firm 11.4/11.5). Old3ds has always been 0x2c.
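
A loader doesn't have to guess the keyslot per title, though. As far as I recall from the 3dbrew NCCH page, the header byte flags[3] (offset 0x18B) holds a "crypto method" value that names the secondary keyslot. The sketch below assumes that mapping; the function names and the -1 fallback are made up for illustration, so check the table against the wiki before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from any real library): map the NCCH
 * "crypto method" byte, flags[3], to the secondary AES keyslot.
 * The table is recalled from the 3dbrew NCCH page; treat it as an assumption. */
static int ncch_secondary_keyslot(uint8_t crypto_method)
{
    switch (crypto_method) {
    case 0x00: return 0x2C; /* original method, no second keyslot */
    case 0x01: return 0x25; /* said to be used from 7.0.0 */
    case 0x0A: return 0x18; /* said to be used from 9.3.0 */
    case 0x0B: return 0x1B; /* said to be used from 9.6.0 */
    default:   return -1;   /* unknown value: refuse rather than guess */
    }
}

/* Usage example. */
static void ncch_keyslot_selfcheck(void)
{
    assert(ncch_secondary_keyslot(0x00) == 0x2C);
    assert(ncch_secondary_keyslot(0x0B) == 0x1B);
    assert(ncch_secondary_keyslot(0x7F) == -1);
}
```

Returning -1 for unknown values keeps a loader from silently decrypting with the wrong key.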

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm

Re: 3DS reverse engineering

Post by nocash » Thu Feb 27, 2020 11:57 am

zoogie wrote:
Tue Feb 25, 2020 9:30 pm
3. Certain titles may have 0x25, 0x1b, 0x18, or 0x2c -- it really doesn't seem to follow any logical pattern. It does appear that new3ds mcu module changed from 0x2c -> 0x1b at some point (probably the new2xl update version 9216, or firm 11.4/11.5). Old3ds has always been 0x2c.
Good to know. Hmmm, I don't have that newer mcu file for supporting/testing that case, and also don't have a SEEDDB file (if that is also used).
zoogie wrote:
Tue Feb 25, 2020 9:30 pm
2. There doesn't appear to be any version collisions between 3ds models: https://www.3dbrew.org/wiki/MCU_Service ... e_versions
Yeah, looks like so.
Installing a wrong MCU firmware isn't so good: I've tried installing an Old2DS MCU firmware on New3DS-XL, and that did merely show blue power led for a brief moment on power-up, and then died.
The other way around, the New3DS-XL MCU disassembly (v3.56) looks as if it contains support for four different consoles (New3DS, New3DS-XL, Old3DS, Old2DS) (i.e. everything except Old3DS-XL and New2DS-XL). Don't know if it is really possible to install v3.56 on Old3DS/Old2DS though, maybe it does just contain some code relics with partial support for older consoles.
zoogie wrote:
Tue Feb 25, 2020 9:30 pm
1. Right, installed ctrnand titles don't have tmd checked at all or cmd file's MACs checked (save a Download Play child .cia). DSiWare twlnand app (00030004) titles do have cmd MACs checked for some reason.
Okay, then I won't need cmd/tmd, except perhaps for files that have separate executable+manual files. Anyways, I've looked into the .cmd file, and found some docs for it:
http://www.3dbrew.org/wiki/Category:File_formats - says that .cmd files are encrypted (???)
http://www.3dbrew.org/wiki/Title_Data_Structure - says that there are MAC's and a "usually (always?) 1" value (???)
That doesn't really match up with what I have in my .cmd files:

Code: Select all

Content Metadata (.cmd) file format
  00h     4    File number for 000000nn.cmd file (eg. 3=00000003.cmd)
  04h     4    Number of Contents (N)
  08h     4    Number of Contents (N) (same as above)
  0Ch     4h   Zerofilled  ;reportedly always 1  ;\maybe shop titles??
  10h     10h  Zerofilled  ;reportedly AES-MAC   ;/or SD card exports???
  20h     N*4  List of file numbers for 000000nn.app files
  20h+N*4 N*4  List of file numbers for 000000nn.app files (same as above)
  20h+N*8 -    Nothing     ;reportedly more AES-MAC's ???
The .cmd file is reportedly "encrypted with a console unique keyslot"??? (actually, it isn't encrypted at all).
Maybe all that stuff about encryption, and MAC's, and always 1's does apply only for... SD card titles... or shop titles?
EDIT: Yes, the wiki does somewhat say so, too: "The below AES-CMACs(including the last 0x10-bytes of the header) are only used for SD titles, for NAND download-play titles, and non-system DSiWare titles. For other titles, these MACs are set to all-zero."
I guess the "always 1" might indicate whether the MAC's (and/or encryption) are there or not.
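
The layout above is simple enough to parse directly. A minimal sketch, assuming little-endian fields and assuming the duplicated count and duplicated ID list always match (as they do in my files); cmd_info, cmd_parse and rd32le are names invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit word from a byte buffer. */
static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

typedef struct {
    uint32_t cmd_id;      /* 00h: nn of 000000nn.cmd */
    uint32_t count;       /* 04h: number of contents (N) */
    uint32_t flag;        /* 0Ch: zero here, reportedly 1 elsewhere */
    int      mac_present; /* 1 if any byte of 10h..1Fh is nonzero */
} cmd_info;

/* Parse a .cmd image. Returns 0 on success, -1 if the buffer is too
 * short, N exceeds max_ids, or the duplicated fields disagree. */
static int cmd_parse(const uint8_t *buf, size_t len,
                     cmd_info *out, uint32_t *ids, size_t max_ids)
{
    assert(buf != NULL && out != NULL && ids != NULL);
    if (len < 0x20) return -1;

    uint32_t n = rd32le(buf + 0x04);
    if (rd32le(buf + 0x08) != n) return -1;    /* second copy of N */
    if (n > max_ids) return -1;
    if (n > (len - 0x20) / 8) return -1;       /* two lists of N words */

    out->cmd_id = rd32le(buf + 0x00);
    out->count = n;
    out->flag = rd32le(buf + 0x0C);
    out->mac_present = 0;
    for (size_t i = 0x10; i < 0x20; i++)
        if (buf[i] != 0) out->mac_present = 1;

    for (uint32_t i = 0; i < n; i++) {
        ids[i] = rd32le(buf + 0x20 + 4 * (size_t)i);
        /* the second list must repeat the first one */
        if (rd32le(buf + 0x20 + 4 * ((size_t)n + i)) != ids[i]) return -1;
    }
    return 0;
}
```

Anything after 20h+N*8 (the reported per-content MACs) is ignored here.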
homepage - patreon - you can think of a bit as a bottle that is either half full or half empty

zoogie
Posts: 10
Joined: Sat Nov 10, 2018 5:38 pm

Re: 3DS reverse engineering

Post by zoogie » Thu Feb 27, 2020 3:05 pm

nocash wrote:
Thu Feb 27, 2020 11:57 am
The .cmd file is reportedly "encrypted with a console unique keyslot"??? (actually, it isn't encrypted at all).
Maybe all that stuff about encryption, and MAC's, and always 1's does apply only for... SD card titles... or shop titles?
EDIT: Yes, the wiki does somewhat say so, too: "The below AES-CMACs(including the last 0x10-bytes of the header) are only used for SD titles, for NAND download-play titles, and non-system DSiWare titles. For other titles, these MACs are set to all-zero."
I guess the "always 1" might indicate whether the MAC's (and/or encryption) are there or not.
Here's where the actual .cmd CMAC crypto is described:
https://www.3dbrew.org/wiki/Titles (search "<ContentID>.cmd")

Code example:
https://github.com/ihaveamac/cmd-gen

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm

Re: 3DS reverse engineering

Post by nocash » Thu Mar 05, 2020 7:05 am

I am looking into ROM cartridge registers. I have found some info here:
http://www.3dbrew.org/wiki/Gamecards - info on cartridge protocol
http://www.3dbrew.org/wiki/CTRCARD_Registers - info on cartridge registers
http://www.3dbrew.org/wiki/NCSD - info on CTRCARD_SECSEED
Going by that, it is or was known how to send cartridge commands, but it needs some guessing to reproduce. Is there more info or sample code somewhere?

What I can do:
- Use NTRCARD registers to send 8-byte commands (and enter 16-byte command mode)
- Use CTRCARD registers to send the first/unencrypted 16-byte command (and get the "cart header")
- Use AES-CCM to decrypt the 16-byte CTRCARD_SECSEED value in the cart header (that's working, with correct AES-MAC result)
And then I am stuck, I assume that I should somehow apply the CTRCARD_SECSEED, and then send encrypted 16-byte commands, but I am getting only garbage in the decrypted responses.

CTRCARD_SECCNT
This reportedly allows selecting a key and latching a seed. The seed is probably what was written to CTRCARD_SECSEED. And the four keys might be built-in constants. But there is no info on when to use which key(s).
NB. there is also a ready flag (bit14, gets cleared for a moment when setting both bit2 and bit15). And some unknown bits (bit0,1).

CTRCARD_CMD
This is reportedly a 16-byte command, with little-endian word order, and big-endian byte order. But that's apparently wrong, or not always true. Sending command 82000000000000000000000000000000 requires writing 82h to byte[15], meaning that the byte-order and word-order are both little endian.
Further commands like 8300000000000000708DF1A731717D0B aren't working yet, maybe the lower bytes do actually require big endian adjustments? I've no idea where those lower bytes come from anyways, they might be fixed, or random, or related to one of the 8-byte commands (that with 71C93FE9BB0A3B18), and/or related to the cart header?
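
To make the byte order concrete: if the whole 16-byte command is simply byte-reversed (the 82h case only proves this for the first byte, the rest is an assumption), the four 32-bit words for CTRCARD_CMD could be built like this. ctrcard_pack_cmd is a made-up name and the function only computes values, it doesn't touch hardware:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* wire[0] is the opcode as written in the hex strings above (e.g. 82h).
 * regs[0..3] are the words for CTRCARD_CMD+0, +4, +8, +0Ch.
 * Assumption: register byte k holds wire byte (15 - k). */
static void ctrcard_pack_cmd(const uint8_t wire[16], uint32_t regs[4])
{
    assert(wire != NULL && regs != NULL);
    for (int i = 0; i < 4; i++) {
        regs[i] = (uint32_t)wire[15 - 4 * i]
                | (uint32_t)wire[14 - 4 * i] << 8
                | (uint32_t)wire[13 - 4 * i] << 16
                | (uint32_t)wire[12 - 4 * i] << 24;
    }
}
```

With this layout the opcode lands in the top byte of regs[3], which is byte[15] of the register block on a little-endian CPU.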

CTRCARD_CNT
That is more or less well described, apart from some less important undocumented bits. Bit8 seems to be an error flag (gets set when Reset wasn't released). And caution: Releasing reset isn't working when simultaneously setting bit31.

Oh, and was the actual encryption hacked? I.e. is it possible to build 3DS cartridges that work on real hardware (without needing exploits)? Asking just out of curiosity, I guess it would be only useful for pirate copies, not for homebrew (which would work better from SD card, and which needs exploits to bypass RSA anyways).

EDIT: I've found that "godmode9" can dump 3ds cartridges, that seems to contain everything needed to send commands and initialize key/seeds.

profi200
Posts: 46
Joined: Fri May 10, 2019 4:48 am

Re: 3DS reverse engineering

Post by profi200 » Fri Mar 06, 2020 5:53 am

See this, which implements the whole init almost identically to Process9. Some delays were made smaller or removed, but it's working for all of my cards, and I have at least 10 different ones including "CARD2" (uses NAND flash for the ROM internally and has a writable ROM area for the savegame):

Note: Some of the cmd names are guesses.
https://gist.github.com/profi200/ad0e88 ... 153d8edb60

And no, it has not been publicly broken. A Chinese company sells "Sky3DS" cards which do understand the CTR card protocol crypto and pass secure init (works on stock firmware).

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm
Contact:

Re: 3DS reverse engineering

Post by nocash » Fri Mar 06, 2020 11:40 am

Code: Select all

#pragma once

#include "types.h"
#include "arm9/ncch.h"



#define CARD_ENABLE     (1u<<15)
#define CARD_SPI_ENABLE (1u<<13)
#define CARD_SPI_BUSY   (1u<<7)
#define CARD_SPI_HOLD   (1u<<6)


void resetCardslot(void);
bool gamecardInit(void);
u32  getChipId(void);
u32  getCardType(void);


//////////////////////////////////
//            NTRCARD           //
//////////////////////////////////

#define NTRCARD_PAGESIZE_0         (0u<<24)
#define NTRCARD_PAGESIZE_4         (7u<<24)
#define NTRCARD_PAGESIZE_512       (1u<<24)
#define NTRCARD_PAGESIZE_1K        (2u<<24)
#define NTRCARD_PAGESIZE_2K        (3u<<24)
#define NTRCARD_PAGESIZE_4K        (4u<<24)
#define NTRCARD_PAGESIZE_8K        (5u<<24)
#define NTRCARD_PAGESIZE_16K       (6u<<24)

#define NTRCARD_ACTIVATE           (1u<<31)           // when writing, get the ball rolling
#define NTRCARD_WR                 (1u<<30)           // Card write enable
#define NTRCARD_nRESET             (1u<<29)           // value on the /reset pin (1 = high out, not a reset state, 0 = low out = in reset)
#define NTRCARD_SEC_LARGE          (1u<<28)           // Use "other" secure area mode, which transfers blocks of 0x1000 bytes at a time
#define NTRCARD_CLK_SLOW           (1u<<27)           // Transfer clock rate (0 = 6.7MHz, 1 = 4.2MHz)
#define NTRCARD_BLK_SIZE(n)        ((n & 0x7u)<<24)   // Transfer block size, (0 = None, 1..6 = (0x100 << n) bytes, 7 = 4 bytes)
#define NTRCARD_SEC_CMD            (1u<<22)           // The command transfer will be hardware encrypted (KEY2)
#define NTRCARD_DELAY2(n)          ((n & 0x3Fu)<<16)  // Transfer delay length part 2
#define NTRCARD_SEC_SEED           (1u<<15)           // Apply encryption (KEY2) seed to hardware registers
#define NTRCARD_SEC_EN             (1u<<14)           // Security enable
#define NTRCARD_SEC_DAT            (1u<<13)           // The data transfer will be hardware encrypted (KEY2)
#define NTRCARD_DELAY1(n)          (n & 0x1FFFu)      // Transfer delay length part 1

// 3 bits in b10..b8 indicate something
// read bits
#define NTRCARD_BUSY               (1u<<31)           // when reading, still expecting incoming data?
#define NTRCARD_DATA_READY         (1u<<23)           // when reading, REG_NTRCARDFIFO has another word of data and is good to go

// Card commands
#define NTRCARD_CMD_DUMMY          (0x9Fu)
#define NTRCARD_CMD_HEADER_READ    (0x00u)
#define NTRCARD_CMD_HEADER_CHIPID  (0x90u)
#define NTRCARD_CMD_ACTIVATE_BF    (0x3Cu)  // Go into blowfish (KEY1) encryption mode
#define NTRCARD_CMD_ACTIVATE_SEC   (0x40u)  // Go into hardware (KEY2) encryption mode
#define NTRCARD_CMD_SECURE_CHIPID  (0x10u)
#define NTRCARD_CMD_SECURE_READ    (0x20u)
#define NTRCARD_CMD_DISABLE_SEC    (0x60u)  // Leave hardware (KEY2) encryption mode
#define NTRCARD_CMD_DATA_MODE      (0xA0u)
#define NTRCARD_CMD_DATA_READ      (0xB7u)
#define NTRCARD_CMD_DATA_CHIPID    (0xB8u)

#define NTRCARD_CR1_ENABLE         (0x8000u)
#define NTRCARD_CR1_IRQ            (0x4000u)

#define NTRKEY_PARAM               (0x3F1FFFu)


void ntrcardCommand(const u32 command[2], u32 pageSize, u32 latency, void* buffer);


//////////////////////////////////
//            CTRCARD           //
//////////////////////////////////

#define CTRCARD_PAGESIZE_0   (0u<<16)
#define CTRCARD_PAGESIZE_4   (1u<<16)
#define CTRCARD_PAGESIZE_16  (2u<<16)
#define CTRCARD_PAGESIZE_64  (3u<<16)
#define CTRCARD_PAGESIZE_512 (4u<<16)
#define CTRCARD_PAGESIZE_1K  (5u<<16)
#define CTRCARD_PAGESIZE_2K  (6u<<16)
#define CTRCARD_PAGESIZE_4K  (7u<<16)
#define CTRCARD_PAGESIZE_16K (8u<<16)
#define CTRCARD_PAGESIZE_64K (9u<<16)

#define CTRCARD_CRC_ERROR    (1u<<4)
#define CTRCARD_ACTIVATE     (1u<<31)           // when writing, get the ball rolling
#define CTRCARD_IE           (1u<<30)           // Interrupt enable
#define CTRCARD_WR           (1u<<29)           // Card write enable
#define CTRCARD_nRESET       (1u<<28)           // value on the /reset pin (1 = high out, not a reset state, 0 = low out = in reset)
#define CTRCARD_BLK_SIZE(n)  ((n & 0xFu)<<16)   // Transfer block size

#define CTRCARD_BUSY         (1u<<31)           // when reading, still expecting incoming data?
#define CTRCARD_DATA_READY   (1u<<27)           // when reading, REG_CTRCARDFIFO has another word of data and is good to go

#define CTRKEY_PARAM         (0x1000000u)


void ctrcardCommand(const u32 command[4], u32 pageSize, u32 blocks, u32 latency, void* buffer);
void ctrcardReadData(u32 sector, u32 length, u32 blocks, void* buffer);
NCCH_header* ctrcardGetHeader(void);
void ctrcardGetUniqueId(u32 *buf);
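
The NTRCARD_BLK_SIZE comment above packs the transfer size into a 3-bit field. A minimal sketch of the reverse mapping, going only by that comment (0 = none, 1..6 = 0x100 << n bytes, 7 = 4 bytes); the function names are made up here and are not part of the gist:

```c
#include <assert.h>
#include <stdint.h>

/* Decode bits 24..26 of an NTRCARD ROMCNT value into a byte count. */
static uint32_t ntrcard_blk_bytes(uint32_t romcnt)
{
    uint32_t n = (romcnt >> 24) & 0x7u;
    if (n == 0) return 0;  /* no data phase */
    if (n == 7) return 4;  /* special 4-byte transfer */
    return 0x100u << n;    /* 512 bytes .. 16 KiB */
}

/* Usage example, using the same shifts as the NTRCARD_PAGESIZE_* defines. */
static void ntrcard_blk_selfcheck(void)
{
    assert(ntrcard_blk_bytes(1u << 24) == 512);   /* NTRCARD_PAGESIZE_512 */
    assert(ntrcard_blk_bytes(4u << 24) == 4096);  /* NTRCARD_PAGESIZE_4K  */
    assert(ntrcard_blk_bytes(7u << 24) == 4);     /* NTRCARD_PAGESIZE_4   */
}
```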

Code: Select all

#include <string.h>
#include "mem_map.h"
#include "types.h"
#include "util.h"
#include "arm9/hardware/gamecard.h"
#include "arm9/ncch.h"
#include "arm9/hardware/crypto.h"
#include "arm9/hardware/timer.h"


// TODO: This belongs in cfg9.h.
#define CFG_REGS_BASE         (IO_MEM_ARM9_ONLY)
#define REG_CFG9_CARDCTL      *((vu16*)(CFG_REGS_BASE + 0x0000C))
#define REG_CFG9_CARDSTATUS   *((vu8* )(CFG_REGS_BASE + 0x00010))
#define REG_CFG9_CARDCYCLES0  *((vu16*)(CFG_REGS_BASE + 0x00012))
#define REG_CFG9_CARDCYCLES1  *((vu16*)(CFG_REGS_BASE + 0x00014))

#define CTRCARD_REGS_BASE     (IO_MEM_ARM9_ONLY + 0x4000)
#define REG_CTRCARDCNT        *((vu32*)(CTRCARD_REGS_BASE + 0x00))
#define REG_CTRCARDBLKCNT     *((vu32*)(CTRCARD_REGS_BASE + 0x04))
#define REG_CTRCARDSECCNT     *((vu32*)(CTRCARD_REGS_BASE + 0x08))
#define REG_CTRCARDSECSEED    *((vu32*)(CTRCARD_REGS_BASE + 0x10))
#define REGs_CTRCARDCMD        ((vu32*)(CTRCARD_REGS_BASE + 0x20))
#define REG_CTRCARDFIFO       *((vu32*)(CTRCARD_REGS_BASE + 0x30))

#define NTRCARD_REGS_BASE     (IO_MEM_ARM9_ARM11 + 0x64000)
#define REG_NTRCARDMCNT       *((vu16*)(NTRCARD_REGS_BASE + 0x00))
#define REG_NTRCARDMDATA      *((vu16*)(NTRCARD_REGS_BASE + 0x02))
#define REG_NTRCARDROMCNT     *((vu32*)(NTRCARD_REGS_BASE + 0x04))
#define REGs_NTRCARDCMD        ((vu32*)(NTRCARD_REGS_BASE + 0x08))
#define REG_NTRCARDSEEDX_L    *((vu32*)(NTRCARD_REGS_BASE + 0x10))
#define REG_NTRCARDSEEDY_L    *((vu32*)(NTRCARD_REGS_BASE + 0x14))
#define REG_NTRCARDSEEDX_H    *((vu16*)(NTRCARD_REGS_BASE + 0x18))
#define REG_NTRCARDSEEDY_H    *((vu16*)(NTRCARD_REGS_BASE + 0x1A))
#define REG_NTRCARDFIFO       *((vu32*)(NTRCARD_REGS_BASE + 0x1C))


static u32 chipId, cardType;
static u32 cmdRand1, cmdRand2;
static NCCH_header ctrcardHeader;



static bool ctrcardSecureInit(u32 cmdBuf[4]);

void resetCardslot(void)
{
	REG_CFG9_CARDCYCLES0 = 0x1988;
	REG_CFG9_CARDCYCLES1 = 0x264C;
	// boot9 waits here. Unnecessary?

	REG_CFG9_CARDSTATUS = 3u<<2;     // Request power off
	while(REG_CFG9_CARDSTATUS != 0); // Automatically changes to 0 (off)
	TIMER_sleep(1);

	REG_CFG9_CARDSTATUS = 1u<<2;     // Prepare power on
	TIMER_sleep(10);

	REG_CFG9_CARDSTATUS = 2u<<2;     // Power on
	TIMER_sleep(27);

	// Switch to NTRCARD controller.
	REG_CFG9_CARDCTL = 0;          // Select NTRCARD controller, eject IRQ off?
	REG_NTRCARDMCNT = NTRCARD_CR1_ENABLE | NTRCARD_CR1_IRQ;
	REG_NTRCARDROMCNT = 0x20000000;
	TIMER_sleep(120);
}

bool gamecardInit(void)
{
	// No gamecard inserted.
	if(REG_CFG9_CARDSTATUS & 1) return false;

	resetCardslot();

	u32 cmdBuf[4] = {0};
	cmdBuf[0] = 0x9F000000; // Reset cmd
	ntrcardCommand(cmdBuf, 0x2000, NTRCARD_CLK_SLOW | NTRCARD_DELAY1(0x1FFF) | NTRCARD_DELAY2(0x18), NULL);

	// No idea what this is. Hardcoded in Process9.
	static const u32 unkGarbageCmd[2] = {0x71C93FE9, 0xBB0A3B18};
	ntrcardCommand(unkGarbageCmd, 0, NTRCARD_CLK_SLOW | NTRCARD_DELAY1(0x1FFF) | NTRCARD_DELAY2(0x18), NULL);

	// Send the get chip ID cmd twice like Process9.
	cmdBuf[0] = 0x90000000; // Get chip ID cmd
	ntrcardCommand(cmdBuf, 4, NTRCARD_CLK_SLOW | NTRCARD_DELAY1(0x1FFF) | NTRCARD_DELAY2(0x18), &chipId);
	ntrcardCommand(cmdBuf, 4, NTRCARD_CLK_SLOW | NTRCARD_DELAY1(0x1FFF) | NTRCARD_DELAY2(0x18), NULL);

	if(chipId & 0x10000000)
	{
		cmdBuf[0] = 0xA0000000; // Get card type cmd
		ntrcardCommand(cmdBuf, 4, 0, &cardType);

		cmdBuf[0] = 0x3E000000; // Enter 16 byte mode cmd
		ntrcardCommand(cmdBuf, 0, 0, NULL);

		// Switch to CTRCARD controller.
		REG_CTRCARDCNT = CTRCARD_nRESET;
		REG_CFG9_CARDCTL = 2;

		cmdBuf[0] = 0x82000000; // Read header cmd
		ctrcardCommand(cmdBuf, 0x200, 1, 0x704802C, &ctrcardHeader);

		// Check if the header is ok.
		if(memcmp(&ctrcardHeader.magic, "NCCH", 4) != 0) return false;

		// The secure init function sets the cmdRand* words in cmdBuf for us.
		if(!ctrcardSecureInit(cmdBuf)) return false;

		u32 chipIdTest, cardTypeTest;
		cmdBuf[0] = 0xA2000000; // Get secure chip ID cmd
		ctrcardCommand(cmdBuf, 4, 1, 0x701002C, &chipIdTest);
		cmdBuf[0] = 0xA3000000; // Get secure card type cmd
		ctrcardCommand(cmdBuf, 4, 1, 0x701002C, &cardTypeTest);

		if(chipIdTest == chipId && cardTypeTest == cardType)
		{
			cmdBuf[0] = 0xC5000000; // Check status cmd
			ctrcardCommand(cmdBuf, 0, 1, 0x100002C, NULL);
		}

		cmdBuf[0] = 0xA2000000; // Get secure chip ID cmd
		for(u32 i = 0; i < 5; i++)
		{
			ctrcardCommand(cmdBuf, 4, 1, 0x701002C, NULL);
		}
	}

	return true;
}

u32 getChipId(void)
{
	return chipId;
}

u32 getCardType(void)
{
	return cardType;
}



//////////////////////////////////
//            NTRCARD           //
//////////////////////////////////

void ntrcardCommand(const u32 command[2], u32 pageSize, u32 latency, void* buffer)
{
	REG_NTRCARDMCNT = NTRCARD_CR1_ENABLE;

	REGs_NTRCARDCMD[0] = swap32(command[0]);
	REGs_NTRCARDCMD[1] = swap32(command[1]);

	pageSize -= pageSize & 3; // align to 4 byte

	u32 pageParam = NTRCARD_PAGESIZE_4K;
	u32 transferLength = 4096;

	// make zero read and 4 byte read a little special for timing optimization(and 512 too)
	switch (pageSize) {
		case 0:
			transferLength = 0;
			pageParam = NTRCARD_PAGESIZE_0;
			break;
		case 4:
			transferLength = 4;
			pageParam = NTRCARD_PAGESIZE_4;
			break;
		case 512:
			transferLength = 512;
			pageParam = NTRCARD_PAGESIZE_512;
			break;
		case 8192:
			transferLength = 8192;
			pageParam = NTRCARD_PAGESIZE_8K;
			break;
		default:
			break; //Using 4K pagesize and transfer length by default
	}

	// go
	REG_NTRCARDROMCNT = 0x10000000;
	REG_NTRCARDROMCNT = NTRKEY_PARAM | NTRCARD_ACTIVATE | NTRCARD_nRESET | pageParam | latency;

	u8 * pbuf = (u8 *)buffer;
	u32 * pbuf32 = (u32 * )buffer;
	bool useBuf = ( NULL != pbuf );
	bool useBuf32 = (useBuf && (0 == (3 & ((u32)buffer))));

	u32 count = 0;
	u32 cardCtrl = REG_NTRCARDROMCNT;

	if(useBuf32)
	{
		while( (cardCtrl & NTRCARD_BUSY) && count < pageSize)
		{
			cardCtrl = REG_NTRCARDROMCNT;
			if( cardCtrl & NTRCARD_DATA_READY  ) {
				u32 data = REG_NTRCARDFIFO;
				*pbuf32++ = data;
				count += 4;
			}
		}
	}
	else if(useBuf)
	{
		while( (cardCtrl & NTRCARD_BUSY) && count < pageSize)
		{
			cardCtrl = REG_NTRCARDROMCNT;
			if( cardCtrl & NTRCARD_DATA_READY  ) {
				u32 data = REG_NTRCARDFIFO;
				pbuf[0] = (unsigned char) (data >>  0);
				pbuf[1] = (unsigned char) (data >>  8);
				pbuf[2] = (unsigned char) (data >> 16);
				pbuf[3] = (unsigned char) (data >> 24);
				pbuf += sizeof (unsigned int);
				count += 4;
			}
		}
	}
	else
	{
		while( (cardCtrl & NTRCARD_BUSY) && count < pageSize)
		{
			cardCtrl = REG_NTRCARDROMCNT;
			if( cardCtrl & NTRCARD_DATA_READY  ) {
				u32 data = REG_NTRCARDFIFO;
				(void)data;
				count += 4;
			}
		}
	}

	// if read is not finished, ds will not pull ROM CS to high, we pull it high manually
	if( count != transferLength ) {
		// MUST wait for next data ready,
		// if ds pull ROM CS to high during 4 byte data transfer, something will mess up
		// so we have to wait next data ready
		do { cardCtrl = REG_NTRCARDROMCNT; } while(!(cardCtrl & NTRCARD_DATA_READY));
		// and this tiny delay is necessary
		//ioAK2Delay(33);
		// pull ROM CS high
		REG_NTRCARDROMCNT = 0x10000000;
		REG_NTRCARDROMCNT = NTRKEY_PARAM | NTRCARD_ACTIVATE | NTRCARD_nRESET; // | 0 | 0x0000;
	}
	// wait rom cs high
	do { cardCtrl = REG_NTRCARDROMCNT; } while( cardCtrl & NTRCARD_BUSY );
	//lastCmd[0] = command[0];lastCmd[1] = command[1];
}



//////////////////////////////////
//            CTRCARD           //
//////////////////////////////////

void ctrcardCommand(const u32 command[4], u32 pageSize, u32 blocks, u32 latency, void* buffer)
{
	REGs_CTRCARDCMD[0] = command[3];
	REGs_CTRCARDCMD[1] = command[2];
	REGs_CTRCARDCMD[2] = command[1];
	REGs_CTRCARDCMD[3] = command[0];

	//Make sure this never happens
	if(blocks == 0) blocks = 1;

	pageSize -= pageSize & 3; // align to 4 byte
	u32 pageParam = CTRCARD_PAGESIZE_4K;
	u32 transferLength = 4096;
	// make zero read and 4 byte read a little special for timing optimization(and 512 too)
	switch(pageSize) {
		case 0:
			transferLength = 0;
			pageParam = CTRCARD_PAGESIZE_0;
			break;
		case 4:
			transferLength = 4;
			pageParam = CTRCARD_PAGESIZE_4;
			break;
		case 64:
			transferLength = 64;
			pageParam = CTRCARD_PAGESIZE_64;
			break;
		case 512:
			transferLength = 512;
			pageParam = CTRCARD_PAGESIZE_512;
			break;
		case 1024:
			transferLength = 1024;
			pageParam = CTRCARD_PAGESIZE_1K;
			break;
		case 2048:
			transferLength = 2048;
			pageParam = CTRCARD_PAGESIZE_2K;
			break;
		case 4096:
			transferLength = 4096;
			pageParam = CTRCARD_PAGESIZE_4K;
			break;
		default:
			break; //Defaults already set
	}

	REG_CTRCARDBLKCNT = blocks - 1;
	transferLength *= blocks;

	// go
	REG_CTRCARDCNT = 0x10000000;
	REG_CTRCARDCNT = CTRCARD_ACTIVATE | CTRCARD_nRESET | pageParam | latency;

	u8 * pbuf = (u8 *)buffer;
	u32 * pbuf32 = (u32 * )buffer;
	bool useBuf = ( NULL != pbuf );
	bool useBuf32 = (useBuf && (0 == (3 & ((u32)buffer))));

	u32 count = 0;
	u32 cardCtrl = REG_CTRCARDCNT;

	if(useBuf32)
	{
		while( (cardCtrl & CTRCARD_BUSY) && count < transferLength)
		{
			cardCtrl = REG_CTRCARDCNT;
			if( cardCtrl & CTRCARD_DATA_READY  ) {
				u32 data = REG_CTRCARDFIFO;
				*pbuf32++ = data;
				count += 4;
			}
		}
	}
	else if(useBuf)
	{
		while( (cardCtrl & CTRCARD_BUSY) && count < transferLength)
		{
			cardCtrl = REG_CTRCARDCNT;
			if( cardCtrl & CTRCARD_DATA_READY  ) {
				u32 data = REG_CTRCARDFIFO;
				pbuf[0] = (unsigned char) (data >>  0);
				pbuf[1] = (unsigned char) (data >>  8);
				pbuf[2] = (unsigned char) (data >> 16);
				pbuf[3] = (unsigned char) (data >> 24);
				pbuf += sizeof (unsigned int);
				count += 4;
			}
		}
	}
	else
	{
		while( (cardCtrl & CTRCARD_BUSY) && count < transferLength)
		{
			cardCtrl = REG_CTRCARDCNT;
			if( cardCtrl & CTRCARD_DATA_READY  ) {
				u32 data = REG_CTRCARDFIFO;
				(void)data;
				count += 4;
			}
		}
	}

	// if read is not finished, ds will not pull ROM CS to high, we pull it high manually
	if( count != transferLength ) {
		// MUST wait for next data ready,
		// if ds pull ROM CS to high during 4 byte data transfer, something will mess up
		// so we have to wait next data ready
		do { cardCtrl = REG_CTRCARDCNT; } while(!(cardCtrl & CTRCARD_DATA_READY));
		// and this tiny delay is necessary
		wait(66);
		// pull ROM CS high
		REG_CTRCARDCNT = 0x10000000;
		REG_CTRCARDCNT = CTRKEY_PARAM | CTRCARD_ACTIVATE | CTRCARD_nRESET;
	}
	// wait rom cs high
	do { cardCtrl = REG_CTRCARDCNT; } while( cardCtrl & CTRCARD_BUSY );
	//lastCmd[0] = command[0];lastCmd[1] = command[1];
}

void ctrcardReadData(u32 sector, u32 length, u32 blocks, void* buffer)
{
	static u32 readCount = 0;
	u32 cmdBuf[4] = {0};

	if(readCount++ >= 10000)
	{
		cmdBuf[0] = 0xC5000000; // Check status cmd
		cmdBuf[2] = cmdRand1;
		cmdBuf[3] = cmdRand2;
		ctrcardCommand(cmdBuf, 0, 1, 0x100002C, NULL);
		cmdBuf[2] = 0;
		cmdBuf[3] = 0;
		readCount = 0;
	}

	// This may not be needed if gamecards >4 GiB don't exist.
	cmdBuf[0] = 0xBF000000 | sector>>23; // Read ROM cmd
	cmdBuf[1] = sector<<9;
	ctrcardCommand(cmdBuf, length, blocks, 0x704822C, buffer);

	// Card dummy
	/*cmdBuf[0] = 0xA2000000; // Get secure chip ID cmd
	cmdBuf[2] = cmdRand1;
	cmdBuf[3] = cmdRand2;
	ctrcardCommand(cmdBuf, 4, 1, 0x701002C, NULL);*/
}

NCCH_header* ctrcardGetHeader(void)
{
	return &ctrcardHeader;
}

void ctrcardGetUniqueId(u32 *buf)
{
	// TODO: Do we need cmdRand* here?
	const u32 uniqueIdCmd[4] = {0xC6000000, 0x00000000, 0x00000000, 0x00000000};
	ctrcardCommand(uniqueIdCmd, 0x40, 1, 0x701002C, buf);
}

static void ctrcardSetSecSeed(const u32 seed[4], bool flag)
{
	// Select secure crypto key?
	if(flag) REG_CTRCARDSECCNT = ((cardType & 3u) << 8) | 4;

	REG_CTRCARDSECSEED = seed[0];
	REG_CTRCARDSECSEED = seed[1];
	REG_CTRCARDSECSEED = seed[2];
	REG_CTRCARDSECSEED = seed[3];
	REG_CTRCARDSECCNT |= 0x8000;

	while(!(REG_CTRCARDSECCNT & 0x4000));

	if(flag) (*(vu32*)0x1000400C) = 0x00000001; // Enable cart command encryption?
}

static bool ctrcardSecureInit(u32 cmdBuf[4])
{
	if((cardType & 3u) == 3) // Dev card
	{
		static const u32 devCardKey[4] = {0};
		AES_setKey(0x11, AES_KEY_NORMAL, AES_INPUT_BIG | AES_INPUT_NORMAL, false, devCardKey);
		AES_selectKeyslot(0x11);
	}
	else // Retail card
	{
		AES_setKey(0x3B, AES_KEY_Y, AES_INPUT_BIG | AES_INPUT_NORMAL, false, ctrcardHeader.seedKeyY);
		AES_selectKeyslot(0x3B);
	}

	u32 seed[4];
	AES_ctx ctx;
	AES_setNonce(&ctx, AES_INPUT_BIG | AES_INPUT_NORMAL, ctrcardHeader.seedNonce);
	AES_setCryptParams(&ctx, AES_INPUT_BIG | AES_INPUT_NORMAL, AES_OUTPUT_LITTLE | AES_OUTPUT_REVERSED);
	if(!AES_ccm(&ctx, ctrcardHeader.encryptedSeed, seed, 16, ctrcardHeader.seedAesMac, 1, false)) return false;

	ctrcardSetSecSeed(seed, true);

	cmdRand1 = REG_PRNG[0];
	cmdRand2 = REG_PRNG[4];
	cmdBuf[0] = 0x83000000; // Seed cmd
	cmdBuf[2] = cmdRand1;
	cmdBuf[3] = cmdRand2;
	ctrcardCommand(cmdBuf, 0, 1, 0x700822C, NULL);

	seed[0] = cmdRand2;
	seed[1] = cmdRand1;
	ctrcardSetSecSeed(seed, false);

	return true;
}
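
The read command in ctrcardReadData() turns a 512-byte sector number into a byte address spread over two command words. A small host-side sketch of the same arithmetic with a 64-bit intermediate; ctrcard_read_cmd is a hypothetical name, while the BFh opcode and the word layout are taken from the code above:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build the 4-word command buffer for "read ROM" at a given sector.
 * Equivalent to: cmd[0] = 0xBF000000 | sector>>23; cmd[1] = sector<<9; */
static void ctrcard_read_cmd(uint32_t sector, uint32_t cmd[4])
{
    assert(cmd != NULL);
    uint64_t byte_addr = (uint64_t)sector << 9;        /* sector * 512 */
    cmd[0] = 0xBF000000u | (uint32_t)(byte_addr >> 32); /* address bits 32+ */
    cmd[1] = (uint32_t)byte_addr;                       /* address bits 0..31 */
    cmd[2] = 0;
    cmd[3] = 0;
}
```

The upper address bits only become nonzero from sector 0x800000 onward, i.e. for offsets of 4 GiB and above.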

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm

Re: 3DS reverse engineering

Post by nocash » Sat Mar 07, 2020 7:42 am

profi200 wrote:
Fri Mar 06, 2020 5:53 am
See this which implements the whole init almost identical to Process9. https://gist.github.com/profi200/ad0e88 ... 153d8edb60
Thanks, that's much easier to read than the godmode9 stuff, which has the cart functions scattered across at least 4 different source files.
I have tested some hardware behaviour, and found some additional details (and some questions):

Code: Select all

#define CTRCARD_PAGESIZE_16K (8u<<16)
#define CTRCARD_PAGESIZE_64K (9u<<16)
Those two are wrong, value 8 is only 8Kbytes, not 16Kbytes.
And everything in range 9..15 is only 8Kbytes, too.
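
Put as code, the mapping I am seeing would be roughly the following. Values 0..7 are the ones from the existing PAGESIZE defines, 8..15 are my observation on one console, and ctrcard_blk_bytes is just a name I made up for this sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Decode the 4-bit CTRCARD block size field (CNT bits 16..19) to bytes. */
static uint32_t ctrcard_blk_bytes(uint32_t n)
{
    static const uint32_t sizes[8] = {0, 4, 16, 64, 512, 1024, 2048, 4096};
    n &= 0xFu;
    return (n < 8) ? sizes[n] : 8192; /* 8..15 all behave as 8 KiB */
}

/* Usage example. */
static void ctrcard_blk_selfcheck(void)
{
    assert(ctrcard_blk_bytes(4) == 512);
    assert(ctrcard_blk_bytes(8) == 8192);
    assert(ctrcard_blk_bytes(9) == 8192);
}
```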

Code: Select all

#define CTRCARD_CRC_ERROR    (1u<<4)
Everyone says so, but I really can't reproduce that.
I think that the CRC error flag is in bit8, not bit4. And CRC error checking is enabled in bit9. And, the error flag can be cleared/acknowledged by writing 0 to bit8.
Or is there anything indicating that bit4 would be CRC related?
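
For reference, this is how I would write down the bit8/bit9 behaviour. It is a sketch of my own observation, not confirmed documentation: the macro and function names are invented, and whether other CNT bits must be preserved or masked when acknowledging is not covered.

```c
#include <assert.h>
#include <stdint.h>

#define CTRCARD_CRC_ERR_FLAG (1u << 8) /* assumed: error flag, write 0 to clear */
#define CTRCARD_CRC_CHECK_EN (1u << 9) /* assumed: enables CRC checking */

/* Returns 1 if checking is enabled and the error flag is set in cnt. */
static int ctrcard_crc_failed(uint32_t cnt)
{
    return (cnt & CTRCARD_CRC_CHECK_EN) && (cnt & CTRCARD_CRC_ERR_FLAG);
}

/* Value to write back to acknowledge the error: same bits, bit8 cleared. */
static uint32_t ctrcard_crc_ack_value(uint32_t cnt)
{
    return cnt & ~CTRCARD_CRC_ERR_FLAG;
}

/* Usage example. */
static void ctrcard_crc_selfcheck(void)
{
    assert(ctrcard_crc_failed(0x300u) == 1);
    assert(ctrcard_crc_failed(0x100u) == 0); /* flag set, checking disabled */
    assert(ctrcard_crc_ack_value(0x300u) == 0x200u);
}
```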

I don't fully understand the CRC stuff...
Some commands do always trigger the error flag in bit8 (if enabled in bit9). I guess nintendo just didn't (properly) implement CRC's for all commands.
For command 83h (with 0 byte response), the crc error flag remains zero (unless specifying a response size bigger than 0 bytes, which triggers error).
Some of the old 8-byte NTRCARD commands seem to be also appending a 32bit CRC to the response data (only if there is any response). That value doesn't seem to be regular CRC32 though... I don't know if it's supposed to be calculated on CMD+Data, or only on Data, or something... and maybe it's garbage, or it's something different than CRC32, or the checksum is encrypted, too...?
So far, I haven't spotted a case where I could reproduce the checksum. Something like, if the data is "this", then the checksum must be "that".

Code: Select all

#define CTRKEY_PARAM         (0x1000000u)
Going by the wiki, that bit(s) should be transfer speed. I haven't verified that yet though.

Code: Select all

                ctrcardCommand(cmdBuf, 4, 1, 0x701002C, &chipIdTest);
The LSBs set to 2Ch in that command (and other commands)... what is that good for? Is that just what nintendo is doing?
Writing 00h to LSBs does work, too. Maybe there is some difference related to timings, or maybe DMA requests.
Bit5 isn't R/W, so writing 2Ch looks a bit weird. Unless bit5 is some write-only flag used to reset or apply something?
EDIT: Bit5 seems to get set on timeout, maybe writing 1 does reset the bit, and maybe some of the other bits select the timeout duration?

Code: Select all

                if(chipIdTest == chipId && cardTypeTest == cardType)
                {
                        cmdBuf[0] = 0xC5000000; // Check status cmd
                        ctrcardCommand(cmdBuf, 0, 1, 0x100002C, NULL);
                }
Why is that called "check status cmd"? It doesn't seem to check (=test) anything. Or is it meant to check (=confirm) something?
As far as I understand, command C5h must be sent once every 10,000 data read commands... and otherwise the cart returns wrong data?
Then I would call it "reset watchdog" or "reset crypto timebomb" or so.

Executing command C5h only if chipId/cardType do match... I guess that isn't really required for normal operation (assuming that chipId/cardType should always match).

Code: Select all

REG_CTRCARDCNT = 0x10000000;
Those two writes would look nicer if they were called "REG_CTRCARDCNT = CTRCARD_nRESET;"
But that bit should be already set anyway, so the writes could be removed. Unless they are supposed to clear the other bits (which normally shouldn't be required either). Anyway, doing the writes shouldn't hurt either.

Code: Select all

        // make zero read and 4 byte read a little special for timing optimization(and 512 too)
What is that optimization (or is it a planned optimization on the todo list)?

Code: Select all

        // if read is not finished, ds will not pull ROM CS to high, we pull it high manually
Is there a situation where that could happen for NTRCARD, or CTRCARD registers?
I guess the transfer should always transfer the expected number of words, even if the cartridge is ejected mid-transfer.
Or is there some "abort upon timeout" feature, or some "abort further blocks(s) upon crc-error" feature?
Hmmm, I guess NDS couldn't even detect cart eject at all... but DSi/3DS do have an eject switch... is the code related to that switch?

Code: Select all

        static u32 readCount = 0;
        u32 cmdBuf[4] = {0};
        if(readCount++ >= 10000)
        {
                cmdBuf[0] = 0xC5000000; // Check status cmd
When resetting/ejecting/inserting cartridges, readCount should be probably reset to zero, too.
Or well, if it is bigger than zero, then command C5h would be thrown a bit earlier, which might be no problem at all(?)
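
A minimal sketch of the counter handling suggested here, assuming the counter is moved out of the function-local static so that it can be cleared on slot events. The names `cart_state`, `cart_read_needs_c5` and `cart_slot_changed` are made up for illustration, and resetting the counter after C5h is an assumption, since the quoted snippet is cut off before that point.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-slot state; the quoted code uses a function-local static. */
typedef struct {
    uint32_t read_count; /* data reads since the last C5h command */
} cart_state;

/* Returns true when the caller should send command C5h before this read.
   Mirrors the quoted `readCount++ >= 10000` test; clearing the counter
   afterwards is an assumption. */
static inline bool cart_read_needs_c5(cart_state *s)
{
    if (s->read_count++ >= 10000u) {
        s->read_count = 0;
        return true;
    }
    return false;
}

/* Call on cartridge reset, eject, or insert so a new card starts from zero. */
static inline void cart_slot_changed(cart_state *s)
{
    s->read_count = 0;
}
```

With this layout the read function would call `cart_read_needs_c5()` once per data read, and the eject/insert handler would call `cart_slot_changed()`.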

Code: Select all

        if(flag) (*(vu32*)0x1000400C) = 0x00000001; // Enable cart command encryption?
No, that's something else: Writing 1 does disable writes to CTRCARD_CNT.bit28 (reset), and CTRCARD_SECCNT.bit0,1,2 (some of the encryption bits).
For now, I've called the 1000400Ch register CTRCARD_LOCK. Once set, the bit cannot be changed back from 1 to 0.
The main purpose seems to be to prevent disabling the encryption (assuming that encryption was enabled in SECCNT.bit2).

Code: Select all

        // TODO: Do we need cmdRand* here?
        const u32 uniqueIdCmd[4] = {0xC6000000, 0x00000000, 0x00000000, 0x00000000};
        ctrcardCommand(uniqueIdCmd, 0x40, 1, 0x701002C, buf);
What is that unique ID used for?
And how unique is it? Different for each game? Or even different for each cartridge, ie. containing some PROM with serial number?

PS.
The wiki says that there is also an 8-byte command with value 71C93FE9BB0A3B18 being sent to 3DS carts? Is that really done?
It seems to have no function, and seems to work without sending that command.

PPS.
And there are those CTRCARD1 registers at 10005000h. My current theory would be that they might be related to "NAND" carts with "CARD2" entry in the NCSD header? Could that be right?
The CTRCARD_SECCNT registers are different for CTRCARD0 and CTRCARD1, so the latter might be encrypted differently (or maybe completely unencrypted) (though I wouldn't know how the cartridge could distinguish between differently encrypted commands).

PPPS.
I think I have solved the meaning of the CGC and CGC_DET interrupt bits in REG_IE/IF registers.
CGC seems to be short for change-gamecart, CGC=Eject, and CGC_DET=Insert(ed).

profi200
Posts: 46
Joined: Fri May 10, 2019 4:48 am

Re: 3DS reverse engineering

Post by profi200 » Sat Mar 07, 2020 1:26 pm

Keep in mind this code is based on old code from Normmatt. And he got it from Wood/Akaio it seems (Flashcard stuff). The CTRCARD cmd code is basically ported from the NTRCARD code. I wanted to rewrite it at some point to use DMA. Never found the motivation/time to do it.
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

#define CTRCARD_PAGESIZE_16K (8u<<16)
#define CTRCARD_PAGESIZE_64K (9u<<16)
Those two are wrong, value 8 is only 8Kbytes, not 16Kbytes.
And everything in range 9..15 is only 8Kbytes, too.
Seems to be right. Looks like page size is only 4 bits (16-19). And then there are another 3 bits after (24-26).
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

#define CTRCARD_CRC_ERROR    (1u<<4)
Everyone says so, but I really can't reproduce that.
I think that the CRC error flag is in bit8, not bit4. And CRC error checking is enabled in bit9. And, the error flag can be cleared/acknowledged by writing 0 to bit8.
Or is there anything indicating that bit4 would be CRC related?
No, that seems to be correct.
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

                ctrcardCommand(cmdBuf, 4, 1, 0x701002C, &chipIdTest);
The LSBs set to 2Ch in that command (and other commands)... what is that good for? Is that just what nintendo is doing?
Writing 00h to LSBs does work, too. Maybe there is some difference related to timings, or maybe DMA requests.
Bit5 isn't R/W, so writing 2Ch looks a bit weird. Unless bit5 is some write-only flag used to reset or apply something?
EDIT: Bit5 seems to get set on timeout, maybe writing 1 does reset the bit, and maybe some of the other bits select the timeout duration?
It's the exact CNT values Nintendo uses in P9.
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

                if(chipIdTest == chipId && cardTypeTest == cardType)
                {
                        cmdBuf[0] = 0xC5000000; // Check status cmd
                        ctrcardCommand(cmdBuf, 0, 1, 0x100002C, NULL);
                }
Why is that called "check status cmd"? It doesn't seem to check (=test) anything. Or is it meant to check (=confirm) something?
As far as I understand, command C5h must be sent once every 10,000 data read commands... and otherwise the cart returns wrong data?
Then I would call it "reset watchdog" or "reset crypto timebomb" or so.

Executing command C5h only if chipId/cardType do match... I guess that isn't really required for normal operation (assuming that chipId/cardType should always match).
The idea behind the name is this is probably the equivalent of a NTRCARD cmd i can't find on gbatek right now. If i recall correctly Normmatt mentioned it and it is sent every X reads just like this CTRCARD cmd. I don't know what its purpose really is to be honest.
And again it's trying to mimic P9 to make sure dumping works flawlessly. You won't believe how picky these gamecards are. If you look at them wrong they will return garbage or corrupted data.
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

REG_CTRCARDCNT = 0x10000000;
Those two writes would look nicer if they were called "REG_CTRCARDCNT = CTRCARD_nRESET;"
But that bit should be already set anyway, so the writes could be removed. Unless they are supposed to clear the other bits (which normally shouldn't be required either). Anyway, doing the writes shouldn't hurt either.

Code: Select all

        // make zero read and 4 byte read a little special for timing optimization(and 512 too)
What is that optimization (or is it a planned optimization on the todo list)?

Code: Select all

        // if read is not finished, ds will not pull ROM CS to high, we pull it high manually
Is there a situation where that could happen for NTRCARD, or CTRCARD registers?
I guess the transfer should always transfer the expected number of words, even if the cartridge is ejected mid-transfer.
Or is there some "abort upon timeout" feature, or some "abort further blocks(s) upon crc-error" feature?
Hmmm, I guess NDS couldn't even detect cart eject at all... but DSi/3DS do have an eject switch... is the code related to that switch?
Dunno. Mostly code from Normmatt.
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

        static u32 readCount = 0;
        u32 cmdBuf[4] = {0};
        if(readCount++ >= 10000)
        {
                cmdBuf[0] = 0xC5000000; // Check status cmd
When resetting/ejecting/inserting cartridges, readCount should be probably reset to zero, too.
Or well, if it is bigger than zero, then command C5h would be thrown a bit earlier, which might be no problem at all(?)
Good catch. I don't know if this will cause any issues. If i recall correctly it doesn't break protocol encryption if sent at the wrong time.
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

        if(flag) (*(vu32*)0x1000400C) = 0x00000001; // Enable cart command encryption?
No, that's something else: Writing 1 does disable writes to CTRCARD_CNT.bit28 (reset), and CTRCARD_SECCNT.bit0,1,2 (some of the encryption bits).
For now, I've called the 1000400Ch register CTRCARD_LOCK. Once set, the bit cannot be changed back from 1 to 0.
The main purpose seems to be to prevent disabling the encryption (assuming that encryption was enabled in SECCNT.bit2).
Seems to be correct.
nocash wrote:
Sat Mar 07, 2020 7:42 am

Code: Select all

        // TODO: Do we need cmdRand* here?
        const u32 uniqueIdCmd[4] = {0xC6000000, 0x00000000, 0x00000000, 0x00000000};
        ctrcardCommand(uniqueIdCmd, 0x40, 1, 0x701002C, buf);
What is that unique ID used for?
And how unique is it? Different for each game? Or even different for each cartridge, ie. containing some PROM with serial number?
It's per card and it's required for online gaming.
nocash wrote:
Sat Mar 07, 2020 7:42 am
PS.
The wiki says that there is also an 8-byte command with value 71C93FE9BB0A3B18 being sent to 3DS carts? Is that really done?
It seems to have no function, and seems to work without sending that command.
Line 82 in gamecard.c. Possibly just some garbage to "fix" timing problems.
nocash wrote:
Sat Mar 07, 2020 7:42 am
PPS.
And there are those CTRCARD1 registers at 10005000h. My current theory would be that they might be related to "NAND" carts with "CARD2" entry in the NCSD header? Could that be right?
The CTRCARD_SECCNT registers are different for CTRCARD0 and CTRCARD1, so the latter might be encrypted differently (or maybe completely unencrypted) (though I wouldn't know how the cartridge could distinguish between differently encrypted commands).
The second CTRCARD registers are maybe leftovers just like on DSi. They are never used by anything as far as i know. And CARD2 is handled the same as CARD1 with the only difference being a savegame area inside the ROM. Bit 29 in the CNT reg makes the FIFO writable instead of readable if i recall correctly. Cmds for writing this area are unknown.

Another note: CNT bits 0-3 seem to be for timeout. 4 is timeout error and 5 timeout enable.

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm
Contact:

Re: 3DS reverse engineering

Post by nocash » Sun Mar 08, 2020 9:48 am

profi200 wrote:
Sat Mar 07, 2020 1:26 pm
(...command C5h...) The idea behind the name is this is probably the equivalent of a NTRCARD cmd i can't find on gbatek right now. If i recall correctly Normmatt mentioned it and it is sent every X reads just like this CTRCARD cmd. I don't know what its purpose really is to be honest.
I don't think that there is any such thing in gbatek. There might be a similar anti-piracy(?) feature in some NDS/DSi carts, but I've never heard anything about such carts. Some NDS games are occasionally sending the GetChipID command to detect cart eject, but that's rather something else.
profi200 wrote:
Sat Mar 07, 2020 1:26 pm
(...unique ID...) It's per card and it's required for online gaming.
Okay. I've dumped the ID from my cartridge. It contains a 10h byte ID, plus 30h bytes filled with FFh. The ID looks "random" (no obvious ascii, timestamp, barcode or so). Hmmm, with only 16 bytes used it looks too small for an ECDSA signature (which might have been useful for blacklisting known pirate copies). For online gaming they could as well store an ID in SPI-FLASH, no idea why they needed the PROM for that.
profi200 wrote:
Sat Mar 07, 2020 1:26 pm
(...71C93FE9BB0A3B18...) Line 82 in gamecard.c. Possibly just some garbage to "fix" timing problems.
Ah, I didn't see that line/comment. Yeah, looks like garbage, or the exact number is needed for some special cartridges.
profi200 wrote:
Sat Mar 07, 2020 1:26 pm
The second CTRCARD registers are maybe leftovers just like on DSi. They are never used by anything as far as i know. And CARD2 is handled the same as CARD1 with the only difference being a savegame area inside the ROM. Bit 29 in the CNT reg makes the FIFO writable instead of readable if i recall correctly. Cmds for writing this area are unknown.
No, the DSi's second cart slot is quite different: The DSi did have separate "NTRCARD" registers at 40001A0h and 40021A0h, and separate power/eject bits in SCFG_MC, but that's all gone in 3DS. And, 3DS doesn't have a second SPI_CARD bus. And, the CTRCARD1 registers are actually wired to the normal cart slot (the unencrypted command 82h works okay; CTRCARD1 just behaves differently upon encrypted commands).
profi200 wrote:
Sat Mar 07, 2020 1:26 pm
Another note: CNT bits 0-3 seem to be for timeout. 4 is timeout error and 5 timeout enable.
Not on my hardware. On my New3DS-XL it is: Timeout duration is in bit0-4, timeout error in bit5, and timeout enable in bit6.

I have tracked down most bits in CTRCARD_CNT register:

Code: Select all

10004000h - CTRCARD_CNT
  0-4   Timeout (0-16=1ms,2ms,4ms,8ms,..,64s; 17-31=64s, too; def=12=4s)  (R/W)
  5     Timeout Error      (0=Okay, 1=Error) (write 0 to ack)           (R/ack)
  6     Timeout Enable     (0=Disable, 1=Enable)                          (R/W)
  7     Unused (0)
  8     CRC Error          (0=Okay, 1=Error) (write 0 to ack)           (R/ack)
  9     CRC Enable         (0=Disable, 1=Enable) (fails on some cmd's?)   (R/W)
  10-14 Unused (0)
  15    unknown... ?                                                      (R/W)
  16-19 Data Block size    (0-8=0,4,16,64,512,1K,2K,4K,8K; 9-15=8K, too)  (R/W)
  20-23 Unused (0)
  24-26 Tiny delay... per chip select or so? (0-3=Fast, 4=Med, 5-7=Slow)  (R/W)
  27    Data-Word status   (0=Busy, 1=Ready/DRQ)                            (R)
  28    Reset Pin          (0=Low/Reset, 1=High/Release) (SET-ONCE)       (R/W)
  29    Transfer Direction (0=Read, 1=Write)                              (R/W)
  30    Interrupt Enable   (0=Disable, 1=Enable) (ARM9 IF.bit23/24)       (R/W)
  31    Start              (0=Idle, 1=Start/Busy)                         (R/W)
Any idea what bit15 is good for?

I am not sure what the delay in bit24-26 is doing. There seem to be only 3 different settings encoded in that 3bits. The slow delay adds about 150h cycles (at 67MHz) to the transfer time, and the medium delay adds only 30h cycles. That is for the total transfer (regardless of the number of bytes), so the delay appears to occur somewhere before/after chip select, or before data block(s). At the moment I don't have an oscilloscope at hand for checking what is going on there.

There seems to be no setting to change the transfer clock (at least not in the CTRCARD_CNT register). That's a bit weird because even old NDS consoles did support two different clocks, I would have thought that 3DS would support that, too (and perhaps more/faster clocks).

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm
Contact:

Re: 3DS reverse engineering

Post by nocash » Mon Mar 09, 2020 3:31 pm

I have rev-engineered some more of the cartridge related 3DS CONFIG9 registers (and the equivalent DSi SCFG registers). Especially the two 16bit registers with cartridge timings are now known, one is a cart-insert-detection-delay, and the other is a power-off-delay.
Alongside, I have also updated the power-on and power-off sequences. State=1 (what I had originally called "PreparePowerOn") is now called "PowerOn+Reset". That means, one can just reset the cartridge by using state=1 then state=2 (without needing the nintendo-style code that first goes through the power-off sequence). And, when going through the power-off sequence for whatever reason, one could speed up the ending wait (either by changing the PWROFF_DELAY register, or by just not executing the wait).
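
A rough sketch of that shortcut reset, assuming the power state sits in bits 2-3 of SCFG_MC as in the listing in this post. The helper names, the register pointer argument and the `delay_ms` hook are made up for illustration, and the 1ms waits are the short "also works" values rather than the official ones. Releasing the reset pin afterwards is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* SCFG_MC bits 2-3 (1st slot power state), values as listed in this post. */
enum {
    MC_PWR_OFF      = 0,
    MC_PWR_ON_RESET = 1, /* must be changed to state 2 manually */
    MC_PWR_ON       = 2,
    MC_PWR_REQ_OFF  = 3  /* hardware changes this to state 0 by itself */
};

static inline unsigned mc_get_power(uint16_t mc)
{
    return (mc >> 2) & 3u;
}

static inline uint16_t mc_with_power(uint16_t mc, unsigned state)
{
    return (uint16_t)((mc & ~(3u << 2)) | ((state & 3u) << 2));
}

/* Reset the cart via state=1 then state=2, skipping the power-off sequence.
   `mc` points at the SCFG_MC register (or a fake one on a host build);
   `delay_ms` is a caller-supplied delay hook and may be NULL. */
static inline void cart_quick_reset(volatile uint16_t *mc,
                                    void (*delay_ms)(unsigned))
{
    while (mc_get_power(*mc) == MC_PWR_REQ_OFF) { } /* wait if power-off is busy */
    if (delay_ms) delay_ms(1);
    *mc = mc_with_power(*mc, MC_PWR_ON_RESET);      /* power on + force reset */
    if (delay_ms) delay_ms(1);
    *mc = mc_with_power(*mc, MC_PWR_ON);            /* power on, normal state */
}
```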

Code: Select all

4004010h - DSi9 - SCFG_MC - NDS Slot Memory Card Interface Status (R)
4004010h - DSi7 - SCFG_MC - NDS Slot Memory Card Interface Control (R/W)
  0     1st NDS Slot Game Cartridge (0=Inserted, 1=Ejected)               (R)
  1     1st NDS Slot Unknown/Unused (0)
  2-3   1st NDS Slot Power State (0=Off, 1=On+Reset, 2=On, 3=RequestOff)  (R/W)
  4     2nd NDS Slot Game Cartridge (always 1=Ejected) ;\DSi              (R)
  5     2nd NDS Slot Unknown/Unused (0)                ; prototype
  6-7   2nd NDS Slot Power State    (always 0=Off)     ;/relict           (R/W)
  8-14  Unknown/Undocumented (0)
  15    Swap NDS Slots (0=Normal, 1=Swap)                                 (R/W)
  16-31 ARM7: See Port 4004012h, ARM9: Unspecified (0)
Note: Additionally, the NDS slot Reset pin can be toggled (via ROMCTRL.Bit29;
that bit is writeable on ARM7 side on DSi; which wasn't supported on NDS).
Power state values:
  0=Power is Off
  1=Power On and force Reset (shall be MANUALLY changed to state=2)
  2=Power On
  3=Request Power Off (will be AUTOMATICALLY changed to state=0)
cart_power_on: (official/insane 1+10+27+120ms, but also works with 1+1+0+1ms)
  wait until state<>3                   ;wait if pwr off busy
  exit if state<>0 AND no_reset_wanted  ;exit if already on & no reset wanted
  wait 1ms, then set state=1            ;pwr on & force reset
  wait 10ms, then set state=2           ;pwr on normal state   ;better: 1ms
  wait 27ms, then set ROMCTRL=20000000h ;release reset pin     ;better: 0ms
  wait 120ms                            ;more insane delay?    ;better: 1ms
cart_power_off: (official 150ms, when using default [4004014h]=264Ch)
  wait until state<>3                   ;wait if pwr off busy
  exit if state<>2                      ;exit if already off
  set state=3                           ;request pwr off
  exit unless you want to know when below pointless delay has elapsed
  wait until state=0    ;default=150ms  ;wait until pwr off    ;better: skip
Power Off is also done automatically by hardware when ejecting the cartridge.
The Power On sequence does reset ROMCTRL.bit29=0 (reset signal).
Bit15 swaps ports 40001A0h-40001BFh and 4100010h with 40021A0h-40021BFh? and
4102010h?, the primary purpose is mapping the 2nd Slot to the 4xx0xxxh
registers (for running carts in 2nd slot in NDS mode; which of course doesn't
work because the 2nd slot connector isn't installed), theoretically it would
also allow to access the 1st slot via 4xx2xxxh registers (however, that doesn't
seem to be fully implemented, cart reading does merely reply FFh's (cart
inserted) or 00h's (no cart)). 4102010h can be read by manually polling DRQ in
40021A4h.bit23, and probably by NDMA (but not by old DMA which has no known DRQ
mode for 2nd slot).

4004012h - DSi7 - SCFG_CARD_INSERT_DELAY (usually 1988h = 100ms) (R/W)
4004014h - DSi7 - SCFG_CARD_PWROFF_DELAY (usually 264Ch = 150ms) (R/W)
  0-15  Delay in 400h cycle units (at 67.027964MHz)  ;max FFFFh=ca. 1 second
Usually set to 1988h/264Ch by firmware. Power up default is FFFFh/FFFFh, with
that setting it takes about 1 second to sense inserted carts (after power-up,
or after cart-insert).
Insert delay defines the time until SCFG_MC.bit0 reacts upon cart insert, and
until cart access works (cart eject works instantly regardless of the delay
setting). The 3DS bootrom uses 051Dh=20ms, which may be good to avoid switch
bounce. The DSi/3DS firmwares use 1988h=100ms, which might be reasonable if the
switch triggers too early.
Power Off delay defines how fast SCFG_MC.bit2-3 are changing from state=3
(request power off) to state=0 (power off). The 3DS bootrom uses 051Dh=20ms,
the DSi/3DS firmwares use 264Ch=150ms, which is both kinda pointless. If the
VCC pin is kept powered during that time (?) then the delay might help to
finish FLASH writes (which would make sense only if there was data written to
cartridge).
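
For checking the delay values, here is a small conversion sketch. It assumes only what the listing states: 400h cycles per unit at 67.027964MHz. The function names are hypothetical.

```c
#include <stdint.h>

#define SCFG_DELAY_CLOCK_HZ 67027964u /* clock quoted in the register notes */
#define SCFG_DELAY_UNIT     0x400u    /* cycles per register unit */

/* Convert a raw INSERT_DELAY / PWROFF_DELAY value to microseconds (rounded down). */
static inline uint32_t scfg_delay_to_us(uint16_t units)
{
    uint64_t cycles = (uint64_t)units * SCFG_DELAY_UNIT;
    return (uint32_t)(cycles * 1000000u / SCFG_DELAY_CLOCK_HZ);
}

/* Convert milliseconds to the nearest register value, saturating at FFFFh. */
static inline uint16_t scfg_delay_from_ms(uint32_t ms)
{
    uint64_t cycles = (uint64_t)ms * SCFG_DELAY_CLOCK_HZ / 1000u;
    uint64_t units = (cycles + SCFG_DELAY_UNIT / 2u) / SCFG_DELAY_UNIT;
    return (uint16_t)(units > 0xFFFFu ? 0xFFFFu : units);
}
```

Under this formula 051Dh comes out just under 20ms and FFFFh just over 1 second. The firmware values are not exact round numbers: 1988h works out to roughly 99.85ms rather than exactly 100ms.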
Some things that might be useful for fine-tuning the github code:

The INSERT_DELAY should be initialized before checking if the cart is inserted, with default value of FFFFh it would take 1 second until cart-insert is sensed. As far as I remember one must still wait a few hundred cycles after initializing INSERT_DELAY (probably 400h ARM9 cycles), and before checking the cart-insert flag. Ie. ideally init INSERT_DELAY early in your boot code, or otherwise wait 400h cycles.

The cart power on function doesn't need to request power-off. The following line should be kept there, but changed to wait until state<>3 (instead of waiting for state=0), so it will wait only if power-off was busy. Most of the official millisecond delays are way too long. For example, the reset pin would probably react within microseconds, not milliseconds. Of course, the long 120ms delay is worst of all... I got away with 1ms with my NDS/DSi carts... but I don't know if there are any crappy flashcarts or some odd official carts that might require longer delays before responding to commands...?

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm
Contact:

Re: 3DS reverse engineering

Post by nocash » Tue Mar 10, 2020 11:30 am

And the CARD_CTL register on 3DS, including the bit-combinations for the three SPI-bus modes...

Code: Select all

1000000Ch - CFG9_CARD_CTL (R/W)
  0-1   Gamecard ROM controller  (0=NTRCARD, 1=?, 2=CTRCARD0, 3=CTRCARD1)
  2-3   Unused (0)
  4     Gamecard SPI_CARD mode   (0=Manual, 1=FIFO)
  5-7   Unused (0)
  8     Gamecard SPI controller  (0=NTRCARD, 1=SPI_CARD)
  9-11  Unused (0)
  12    Unknown...? (R/W)
  13-15 Unused (0)
There are three controllers for ROM cartridge commands:
  xxx0h/xxx1h? NTRCARD  (8-byte commands)  (bit1=0)         (Port 10164000h)
  xxx2h        CTRCARD0 (16-byte commands) (bit1=1, bit0=0) (Port 10004000h)
  xxx3h        CTRCARD1 (16-byte commands) (bit1=1, bit0=1) (Port 10005000h)
And three controllers for SPI-bus cartridge savedata:
  x0x0h/x0x1h  NTRCARD Manual NDS-style    (bit8=0, bit1=0) (Port 10164000h)
  x0x2h/x0x3h  None                        (bit8=0, bit1=1) (N/A)
  x10xh        SPI_CARD in Manual Mode     (bit8=1, bit4=0) (Port 1000D000h)
  x11xh        SPI_CARD in FIFO Mode       (bit8=1, bit4=1) (Port 1000D800h)
The deselected controllers are disconnected from the cartridge bus (and tend
to return data=FFh when trying to read from the cartridge).
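
The bit combinations above could be composed roughly like this. It is only a sketch: the enum and function names are made up, and bit12 is left at zero since its function is unknown.

```c
#include <stdint.h>

/* Bits 0-1: ROM controller select, values from the table above. */
enum {
    CARD_ROM_NTRCARD  = 0,
    CARD_ROM_CTRCARD0 = 2,
    CARD_ROM_CTRCARD1 = 3
};

/* Hypothetical names for the SPI routing combinations in the table above. */
typedef enum {
    CARD_SPI_NTRCARD, /* bit8=0: NDS-style SPI (only connected when bit1=0) */
    CARD_SPI_MANUAL,  /* bit8=1, bit4=0: SPI_CARD at 1000D000h */
    CARD_SPI_FIFO     /* bit8=1, bit4=1: SPI_CARD at 1000D800h */
} card_spi_mode;

/* Build a CFG9_CARD_CTL value from a ROM controller and an SPI mode. */
static inline uint16_t card_ctl_value(unsigned rom, card_spi_mode spi)
{
    uint16_t v = (uint16_t)(rom & 3u);
    if (spi != CARD_SPI_NTRCARD) v = (uint16_t)(v | 0x0100u); /* bit8 */
    if (spi == CARD_SPI_FIFO)    v = (uint16_t)(v | 0x0010u); /* bit4 */
    return v;
}
```

For example, CTRCARD0 for ROM access combined with SPI_CARD in FIFO mode gives 0112h.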
That leaves only bit12 unknown, I haven't noticed any different behaviour when setting or clearing that bit.

profi200
Posts: 46
Joined: Fri May 10, 2019 4:48 am

Re: 3DS reverse engineering

Post by profi200 » Thu Mar 12, 2020 9:53 am

nocash wrote:
Sun Mar 08, 2020 9:48 am
Any idea what bit15 is good for?

I am not sure what the delay in bit24-26 is doing. There seem to be only 3 different settings encoded in that 3bits. The slow delay adds about 150h cycles (at 67MHz) to the transfer time, and the medium delay adds only 30h cycles. That is for the total transfer (regardless of the number of bytes), so the delay appears to occur somewhere before/after chip select, or before data block(s). At the moment I don't have an oscilloscope at hand for checking what is going on there.

There seems to be no setting to change the transfer clock (at least not in the CTRCARD_CNT register). That's a bit weird because even old NDS consoles did support two different clocks, I would have thought that 3DS would support that, too (and perhaps more/faster clocks).
Bit 15 might be for DMA requests (startup) similar to SHA where you can disable them. Not sure.

I have a $20 oscilloscope (lol) but it's just useless for this, since it is getting wonky with 200 kHz signals (probably around 2 MHz sampling rate). The logic analyzer has 24 MHz sampling rate but it's not as accurate as an oscilloscope and 24 MHz is too slow for 16 MHz signals.

Clock is probably fixed for the CTRCARD controller. It doesn't need to support anything else anyway.


Good to finally know what these "CARDCYCLES" regs do. I have just copied the values from P9 for them. I would recommend you not change these delays because we had lots of troubles with genuine cards and flashcards. The GodMode9 authors know what i mean. They had to tweak code so many times because some cards just would not init properly. For completeness sake the last (software) delay on slot poweron is 270 ms in P9. That's one of the few i changed. As i mentioned, these gamecards will return garbage if you look at them wrong. Even directly after slot poweron.

CFG9_CARD_CTL bit 12 could be IRQ related, maybe to route either the SPI or the gamecard controller IRQ to the same IRQ line?

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm
Contact:

Re: 3DS reverse engineering

Post by nocash » Fri Mar 13, 2020 2:04 pm

Yeah, CARD_CTL bit12 might be IRQ related, I haven't tested IRQs yet. On the other hand, with the same pins being shared for SPI and ROM access, only either one can be used at once (so it wouldn't make too much sense to suppress SPI IRQs during ROM reads).

I added NDMA support today.
Yes, CTRCARD_CNT bit15 is really DMA enable.
The FIFO block size is 8 words (unlike NTRCARD which doesn't have a FIFO, so only 1 word at a time there).
Using NDMA logical block size of 8 words does also work when the NDMA total length is only 1..7 words (it will automatically transfer only that much data).
For length of 0 words, one should avoid starting NDMA (since it would treat that as very large value).
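
As a sketch of those rules (hypothetical struct and function names, and no actual NDMA register access), the transfer parameters could be derived like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one NDMA transfer from the CTRCARD FIFO. */
typedef struct {
    bool     start;       /* false: do not start NDMA at all */
    uint32_t total_words; /* NDMA total length in 32bit words */
    uint32_t block_words; /* NDMA logical block size in words */
} ndma_plan;

static inline ndma_plan ctrcard_ndma_plan(uint32_t bytes)
{
    ndma_plan p;
    p.total_words = bytes / 4u;
    p.block_words = 8u;                /* FIFO block size; also fine for 1..7 words */
    p.start = (p.total_words != 0u);   /* length 0 would be treated as a huge value */
    return p;
}
```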

With the NDMA, I am now also getting much faster transfers, and CTRCARD_CNT bit24-26 allow to select different transfer clocks. I have measured timings for 200h, 2000h and 4 byte transfers. The timings include the command bytes, status bytes, and data bytes. The difference between 4 byte and 2000h byte transfers allows to get an idea of transfer clock per byte:

Code: Select all

     ;orr r4,0000000h   ;time 19F8h per 200h, and 8D9Ah per 2000h, D81h per 4      ;8000h/2000h  = 4   ;67.027964MHz/4 = 16.7MHz
     ;orr r4,1000000h   ;time 1E73h per 200h, and B0E6h per 2000h, 10D8h per 4     ;A000h/2000h  = 5   ;67.027964MHz/5 = 13.4MHz
     ;orr r4,2000000h   ;time 22E4h per 200h, and D409h per 2000h, 13F6h per 4     ;C000h/2000h  = 6   ;67.027964MHz/6 = 11.2MHz
     ;orr r4,3000000h   ;time 2B96h per 200h, and 11A5Bh per 2000h, 1A6Eh per 4    ;10000h/2000h = 8   ;67.027964MHz/8 = 8.4MHz
     ;orr r4,4000000h   ;time 3484h per 200h, and 160D5h per 2000h, 211Fh per 4    ;14000h/2000h = 10  ;67.027964MHz/10 = 6.7MHz
     ;orr r4,5000000h   ;time 4F12h per 200h, and 23451h per 2000h, 3489h per 4    ;20000h/2000h = 16  ;67.027964MHz/16 = 4.2MHz
     ;orr r4,6000000h   ;time 4F0Eh per 200h, and 23451h per 2000h, 3489h per 4    ;same as above       ;EDIT: fixed above "/N" dividers
     ;orr r4,7000000h   ;time 4F0Eh per 200h, and 23452h per 2000h, 3489h per 4    ;same as above       ;ie. "/4..16" instead of "/4"
     ;non-dma           ;time 4xxxh per 200h, and 2F5FBh per 2000h... hmmm, there must be something wrong in my non-dma reading function
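
The divider column above can be reproduced from the measured values with a few lines of C. This is just the arithmetic from the table: the fixed overhead cancels out in the difference between the 2000h-byte and the 4-byte timing, and the result is divided by 2000h. The function names are made up.

```c
#include <stdint.h>

#define TIMING_CLOCK_HZ 67027964u /* clock assumed in the table above */

/* Estimate clocks per byte from two measured transfer times (in cycles):
   one for a 2000h-byte read and one for a 4-byte read. The fixed overhead
   (command and status bytes) cancels out in the difference. */
static inline uint32_t clks_per_byte(uint32_t t_2000h, uint32_t t_4)
{
    uint32_t diff = t_2000h - t_4;
    return (diff + 0x1000u) / 0x2000u; /* divide by 2000h, round to nearest */
}

/* Byte clock in Hz for a given number of clocks per byte. */
static inline uint32_t byte_clock_hz(uint32_t clks)
{
    return TIMING_CLOCK_HZ / clks;
}
```

Feeding in the first row (8D9Ah and D81h) gives 4 clocks per byte, and the last distinct row (23451h and 3489h) gives 16.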
So, all bits in CTRCARD_CNT are now known:

Code: Select all

10004000h/10005000h - CTRCARD_CNT
  0-4   Timeout (0-16=1ms,2ms,4ms,8ms,..,64s; 17-31=64s, too; def=12=4s)  (R/W)
  5     Timeout Error      (0=Okay, 1=Error) (write 0 to ack)           (R/ack)
  6     Timeout Enable     (0=Disable, 1=Enable)                          (R/W)
  7     Unused (0)
  8     CRC Error          (0=Okay, 1=Error) (write 0 to ack)           (R/ack)
  9     CRC Enable         (0=Disable, 1=Enable) (works for cmd 82h/BFh)  (R/W)
  10-14 Unused (0)
  15    DMA Enable         (0=Disable, 1=Enable DMA DRQs, each 8 words)   (R/W)
  16-19 Data Block size    (0-8=0,4,16,64,512,1K,2K,4K,8K; 9-15=8K, too)  (R/W)
  20-23 Unused (0)
  24-26 Byte Transfer Time (0-5=4,5,6,8,10,16 clks at 67MHz; 6-7=16, too) (R/W)
  27    Data-Word status   (0=Busy, 1=Ready/DRQ)                            (R)
  28    Reset Pin          (0=Low/Reset, 1=High/Release) (SET-ONCE)       (R/W)
  29    Transfer Direction (0=Read, 1=Write)                              (R/W)
  30    Interrupt Enable   (0=Disable, 1=Enable) (ARM9 IF.bit23/24)       (R/W)
  31    Start              (0=Idle, 1=Start/Busy)                         (R/W)
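
A small decoding sketch for the fields listed above, useful for reading the magic CNT values in the github code. The macro and function names are made up; the bit positions and value tables are taken from the listing.

```c
#include <stdint.h>

/* Single-bit flags, positions as in the CTRCARD_CNT listing above. */
#define CTRCARD_CNT_TIMEOUT_ERR (1u << 5)
#define CTRCARD_CNT_TIMEOUT_EN  (1u << 6)
#define CTRCARD_CNT_CRC_ERR     (1u << 8)
#define CTRCARD_CNT_CRC_EN      (1u << 9)
#define CTRCARD_CNT_DMA_EN      (1u << 15)

/* Bits 16-19: data block size in bytes (values 9-15 behave like 8). */
static inline uint32_t ctrcard_block_bytes(uint32_t cnt)
{
    static const uint16_t sizes[9] = {0, 4, 16, 64, 512, 1024, 2048, 4096, 8192};
    uint32_t n = (cnt >> 16) & 0xFu;
    return sizes[n > 8u ? 8u : n];
}

/* Bits 0-4: timeout in ms, 1ms << n (values 17-31 behave like 16). */
static inline uint32_t ctrcard_timeout_ms(uint32_t cnt)
{
    uint32_t n = cnt & 0x1Fu;
    return 1u << (n > 16u ? 16u : n);
}

/* Bits 24-26: clocks per byte at 67MHz (values 6-7 behave like 5). */
static inline uint32_t ctrcard_clks_per_byte(uint32_t cnt)
{
    static const uint8_t clks[8] = {4, 5, 6, 8, 10, 16, 16, 16};
    return clks[(cnt >> 24) & 7u];
}
```

For example, the value 0x701002C from the chip ID command decodes as a 4-byte block, timeout setting 12 (about 4s), and 16 clocks per byte, with bit5 set and bit6 clear.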
Looking at your code, you have three different transfer clocks. Using a slow clock for reading the cart header makes sense, I guess there might be a header entry that indicates the supported speed, similar as in NDS carts(?), and then everything else should probably use that clock value.
At the moment, you have a slow clock for the 200h-byte data reads, and faster (and much faster) clocks for some of the 4-byte/0-byte commands. But well, maybe that's right... the ROM lookup time is indicated by the status bytes, but the following data stream might also have a restriction if the ROM is kinda slow.

Without NDMA, I always get less than 4.2MHz per byte, ie. less than 1MHz per word... I don't understand how my software data polling function could be that slow... there seems to be something wrong with my ARM9 CPU clock in general...
I have ARM9 code in AXI RAM, and ARM9 code cache enabled (although without PU), in theory, ARM9 should then run at 67MHz, shouldn't it?
But it looks more as if it is only running at about 4MHz or the like. I am having that problem elsewhere, too: I have a hardcoded keypad repeat delay, that works fine on ARM11 at 268MHz, of course it would be 4x slower on ARM9... but it looks more like being 40x slower : /

Is there a control register that could make the ARM9 run that slow?
Or do I need the PU for code cache on ARM9?
On ARM11 the code cache works without MMU.
Last edited by nocash on Tue Mar 17, 2020 6:09 am, edited 1 time in total.

nocash
Posts: 1229
Joined: Fri Feb 24, 2012 12:09 pm

Re: 3DS reverse engineering

Post by nocash » Fri Mar 13, 2020 2:31 pm

profi200 wrote:
Thu Mar 12, 2020 9:53 am
Good to finally know what these "CARDCYCLES" regs do. I have just copied the values from P9 for them. I would recommend you not change these delays because we had lots of troubles with genuine cards and flashcards.
No risk, no fun ; )
In this case, the delays seem to occur before/after powering the card VCC pin, so the delays could hardly affect the card operation. Or only indirectly:
PowerOff delay might be needed for write-completion, IF there is any such feature, and IF anything was written.
A (very) short Insert delay could trigger switch bounce on the insert/eject signal.
If the user does insert the card slowly, then the insert flag may trigger before the card is fully inserted. But that does also happen with the official delay setting. The best workaround would be to do some Reset+Retries if the card replies with different chip IDs or other corrupt data during card initialization.
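The Reset+Retries workaround could look roughly like the C sketch below. It is not taken from any existing driver: the callback type, the function names and the "two identical reads in a row" rule are all assumptions for illustration, and `fake_read_id` is only a test double that replays a list of IDs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: resets the card and returns the chip ID it reports. */
typedef uint32_t (*read_chip_id_fn)(void *ctx);

/* Accept a chip ID only after two consecutive reads agree.
   Returns 1 and stores the ID on success, 0 if no stable ID was seen. */
int card_init_with_retries(read_chip_id_fn read_id, void *ctx,
                           int max_tries, uint32_t *id_out)
{
    uint32_t prev = 0;
    int have_prev = 0;

    assert(read_id != NULL && id_out != NULL);
    for (int i = 0; i < max_tries; i++) {
        uint32_t id = read_id(ctx);
        /* Assumption: all-zero or all-one means no card or a floating bus. */
        if (id == 0x00000000u || id == 0xFFFFFFFFu) {
            have_prev = 0;
            continue;
        }
        if (have_prev && id == prev) {
            *id_out = id;
            return 1;
        }
        prev = id;
        have_prev = 1;
    }
    return 0;
}

/* Test double: replays a fixed list of IDs, then keeps returning FFFFFFFFh. */
struct fake_card {
    const uint32_t *ids;
    size_t count;
    size_t pos;
};

uint32_t fake_read_id(void *ctx)
{
    struct fake_card *fc = ctx;
    return (fc->pos < fc->count) ? fc->ids[fc->pos++] : 0xFFFFFFFFu;
}
```

With a half-inserted card the first few reads would differ or return FFh's, and the loop keeps resetting until the ID settles or the retry budget runs out.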

I have also had a short look at the encryption. Sending a nonsense command (which should return FFh's) and setting the seed to all 00h's or all FFh's did still return heavily encrypted data (instead of plain FFh's). Changing only a single seed bit does seem to affect most or all data bytes. So it's apparently really something like AES, the key-init-busy-flag does also indicate that there is some AES-style key schedule going on.
One small detail is that CTRCARD_SECCNT bit0-1 set to 2 does cause the hardware to ignore the written seed value. It does completely ignore the whole seed, including 1st and 4th word (ie. those bits don't seem to be for AES key/iv size selection).

profi200
Posts: 46
Joined: Fri May 10, 2019 4:48 am

Re: 3DS reverse engineering

Post by profi200 » Sat Mar 14, 2020 5:31 am

nocash wrote:
Fri Mar 13, 2020 2:04 pm
Yeah, CARD_CTL bit12 might be IRQ related, I haven't tested IRQs yet. On the other hand, with the same pins being shared for SPI and ROM access, only either one can be used at once (so it wouldn't make too much sense to suppress SPI IRQs during ROM reads).

I added NDMA support today.
Yes, CTRCARD_CNT bit15 is really DMA enable.
The FIFO block size is 8 words (unlike NTRCARD which doesn't have a FIFO, so only 1 word at a time there).
Using NDMA logical block size of 8 words does also work when the NDMA total length is only 1..7 words (it will automatically transfer only that much data).
For length of 0 words, one should avoid starting NDMA (since it would treat that as very large value).

With NDMA, I am now also getting much faster transfers, and CTRCARD_CNT bit24-26 allow selecting different transfer clocks. I have measured timings for 200h, 2000h and 4 byte transfers. The timings include the command bytes, status bytes, and data bytes. The difference between the 4 byte and 2000h byte transfers gives an idea of the transfer clock per byte:

Code:

     ;orr r4,0000000h   ;time 19F8h per 200h, and 8D9Ah per 2000h, D81h per 4      ;8000h/2000h  = 4   ;67.027964MHz/4 = 16.7MHz
     ;orr r4,1000000h   ;time 1E73h per 200h, and B0E6h per 2000h, 10D8h per 4     ;A000h/2000h  = 5   ;67.027964MHz/5 = 13.4MHz
     ;orr r4,2000000h   ;time 22E4h per 200h, and D409h per 2000h, 13F6h per 4     ;C000h/2000h  = 6   ;67.027964MHz/6 = 11.2MHz
     ;orr r4,3000000h   ;time 2B96h per 200h, and 11A5Bh per 2000h, 1A6Eh per 4    ;10000h/2000h = 8   ;67.027964MHz/8 = 8.4MHz
     ;orr r4,4000000h   ;time 3484h per 200h, and 160D5h per 2000h, 211Fh per 4    ;14000h/2000h = 10  ;67.027964MHz/10 = 6.7MHz
     ;orr r4,5000000h   ;time 4F12h per 200h, and 23451h per 2000h, 3489h per 4    ;20000h/2000h = 16  ;67.027964MHz/16 = 4.2MHz
     ;orr r4,6000000h   ;time 4F0Eh per 200h, and 23451h per 2000h, 3489h per 4    ;same as above
     ;orr r4,7000000h   ;time 4F0Eh per 200h, and 23452h per 2000h, 3489h per 4    ;same as above
     ;non-dma           ;time 4xxxh per 200h, and 2F5FBh per 2000h... hmmm, there must be something wrong in my non-dma reading function
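The per-byte estimate in the table can be reproduced with a few lines of C. This is only a sketch: it assumes the measured "time" values are counted in 67.027964MHz clock ticks, and it treats the 4-byte timing as the fixed command/status overhead. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ARM9_BUS_HZ 67027964.0 /* 67.027964MHz, as used in the table */

/* Estimated clocks per byte: subtract the 4-byte timing (mostly command and
   status overhead) from the 2000h-byte timing and divide by the extra bytes. */
double clks_per_byte(uint32_t ticks_2000h, uint32_t ticks_4)
{
    assert(ticks_2000h > ticks_4);
    return (double)(ticks_2000h - ticks_4) / (double)(0x2000 - 4);
}

/* Byte rate in MHz for a given number of bus clocks per byte. */
double byte_rate_mhz(double clks)
{
    assert(clks > 0.0);
    return ARM9_BUS_HZ / clks / 1e6;
}
```

For example, the first row (8D9Ah and D81h) comes out at roughly 4 clocks per byte, and the 5000000h row (23451h and 3489h) at roughly 16.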
So, all bits in CTRCARD_CNT are now known:

Code:

10004000h/10005000h - CTRCARD_CNT
  0-4   Timeout (0-16=1ms,2ms,4ms,8ms,..,64s; 17-31=64s, too; def=12=4s)  (R/W)
  5     Timeout Error      (0=Okay, 1=Error) (write 0 to ack)           (R/ack)
  6     Timeout Enable     (0=Disable, 1=Enable)                          (R/W)
  7     Unused (0)
  8     CRC Error          (0=Okay, 1=Error) (write 0 to ack)           (R/ack)
  9     CRC Enable         (0=Disable, 1=Enable) (works for cmd 82h/BFh)  (R/W)
  10-14 Unused (0)
  15    DMA Enable         (0=Disable, 1=Enable DMA DRQs, each 8 words)   (R/W)
  16-19 Data Block size    (0-8=0,4,16,64,512,1K,2K,4K,8K; 9-15=8K, too)  (R/W)
  20-23 Unused (0)
  24-26 Byte Transfer Time (0-5=4,5,6,8,10,16 clks at 67MHz; 6-7=16, too) (R/W)
  27    Data-Word status   (0=Busy, 1=Ready/DRQ)                            (R)
  28    Reset Pin          (0=Low/Reset, 1=High/Release) (SET-ONCE)       (R/W)
  29    Transfer Direction (0=Read, 1=Write)                              (R/W)
  30    Interrupt Enable   (0=Disable, 1=Enable) (ARM9 IF.bit23/24)       (R/W)
  31    Start              (0=Idle, 1=Start/Busy)                         (R/W)
Looking at your code, you have three different transfer clocks. Using a slow clock for reading the cart header makes sense, I guess there might be a header entry that indicates the supported speed, similar to NDS carts(?), and then everything else should probably use that clock value.
At the moment, you have a slow clock for the 200h-byte data reads, and faster (and much faster) clocks for some of the 4-byte/0-byte commands. But well, maybe that's right... the ROM lookup time is indicated by the status bytes, but the following data stream might also have a restriction if the ROM is kinda slow.

Without NDMA, I always get less than 4.2MHz per byte, ie. less than 1MHz per word... I don't understand how my software data polling function could be that slow... there seems to be something wrong with my ARM9 CPU clock in general...
I have ARM9 code in AXI RAM, and ARM9 code cache enabled (although without PU), in theory, ARM9 should then run at 67MHz, shouldn't it?
But it looks more as if it is only running at about 4MHz or the like. I am having that problem elsewhere, too: I have a hardcoded keypad repeat delay, that works fine on ARM11 at 268MHz, of course it would be 4x slower on ARM9... but it looks more like being 40x slower : /

Is there a control register that could make the ARM9 run that slow?
Or do I need the PU for code cache on ARM9?
On ARM11, the code cache works without MMU.
I looked at P9 (and I really don't like looking at P9 code since it's just disgusting). For ROM reads it looks like it's using a faster clock than what I use.

Code:

int __fastcall CTRCARD_cmdBFReadRom(int a1, int a2, __int64 a3, unsigned __int16 a4)
{
  int v4; // r5
  signed int v5; // r0
  __int64 v6; // r6
  int result; // r0
  int v8; // r3
  int v9; // r0
  int v10; // r2
  signed int v11; // r5
  signed int v12; // r6
  signed int v13; // r3
  signed int v14; // r2
  signed int v15; // r2
  signed int v16; // r3
  signed int v17; // r2
  int v18; // r2
  signed int v19; // r3
  int v20; // r2

  v4 = a1;
  v5 = *(_DWORD *)(a1 + 16);
  v6 = a3;
  *(_DWORD *)(v4 + 16) = v5 + 1;
  if ( v5 > 10000 )
  {
    result = CTRCARD_cmdC5CheckStatus2(v4);
    if ( !result )
      return result;
    *(_DWORD *)(v4 + 16) = 0;
  }
  result = sub_8050334(v4, 0);
  if ( result )
  {
    v8 = (unsigned __int64)(v6 >> 23) | 0xBF000000;
    v9 = (unsigned __int8)byte_8094821;
    v10 = (_DWORD)v6 << 9;
    v11 = 1024;
    if ( byte_8094821 == 2 )
      v12 = 1024;
    else
      v12 = 0;
    REG_CTRCARD_CNT[v12 + 11] = v8;
    if ( v9 == 2 )
      v13 = 1024;
    else
      v13 = 0;
    REG_CTRCARD_CNT[v13 + 10] = v10;
    if ( v9 == 2 )
      v14 = 1024;
    else
      v14 = 0;
    REG_CTRCARD_CNT[v14 + 9] = 0;
    if ( v9 == 2 )
      v15 = 1024;
    else
      v15 = 0;
    REG_CTRCARD_CNT[v15 + 8] = 0;
    if ( v9 == 2 )
      v16 = 1024;
    else
      v16 = 0;
    REG_CTRCARD_CNT[v16 + 1] = a4 - 1;
    if ( v9 == 2 )
      v17 = 1024;
    else
      v17 = 0;
    v18 = (REG_CTRCARD_CNT[v17] & 0x10000000) - 0x40000000;
    if ( v9 == 2 )
      v19 = 1024;
    else
      v19 = 0;
    v20 = ((v18 | REG_CTRCARD_CNT[v19] & 0x8000000) + 0x1008000) | 0x4822C;
    if ( v9 != 2 )
      v11 = 0;
    REG_CTRCARD_CNT[v11] = v20;
    result = 1;
  }
  return result;
}
The final CNT value seems to be 0xD104822C. I'm not sure if this was chosen because of data corruption (another long standing issue) or just copy & paste. And I'm not sure if this is actually a clock or just a delay. I need to hook up the logic analyzer to get an idea if these bits control the clock.
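To see what that value means, here is a small C sketch that repeats the arithmetic from the end of the decompiled function and splits the result into fields. The field layout is the reverse-engineered CTRCARD_CNT table quoted above, so it is only as reliable as that table; the struct and function names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Fields per the CTRCARD_CNT table quoted in this thread (not official docs). */
struct ctrcard_cnt_fields {
    unsigned timeout_sel;    /* bits 0-4 */
    unsigned timeout_error;  /* bit 5 */
    unsigned timeout_enable; /* bit 6 */
    unsigned crc_enable;     /* bit 9 */
    unsigned dma_enable;     /* bit 15 */
    unsigned block_bytes;    /* bits 16-19, translated to bytes */
    unsigned clks_per_byte;  /* bits 24-26, translated to 67MHz clocks */
    unsigned reset_released; /* bit 28 */
    unsigned write_dir;      /* bit 29 */
    unsigned irq_enable;     /* bit 30 */
    unsigned start;          /* bit 31 */
};

struct ctrcard_cnt_fields ctrcard_cnt_decode(uint32_t v)
{
    static const unsigned block_tab[9] = {0, 4, 16, 64, 512, 1024, 2048, 4096, 8192};
    static const unsigned clk_tab[6] = {4, 5, 6, 8, 10, 16};
    unsigned blk = (v >> 16) & 0xFu;
    unsigned clk = (v >> 24) & 0x7u;
    struct ctrcard_cnt_fields f;

    f.timeout_sel    = v & 0x1Fu;
    f.timeout_error  = (v >> 5) & 1u;
    f.timeout_enable = (v >> 6) & 1u;
    f.crc_enable     = (v >> 9) & 1u;
    f.dma_enable     = (v >> 15) & 1u;
    f.block_bytes    = block_tab[blk > 8u ? 8u : blk];   /* 9-15 act like 8 */
    f.clks_per_byte  = clk_tab[clk > 5u ? 5u : clk];     /* 6-7 act like 5 */
    f.reset_released = (v >> 28) & 1u;
    f.write_dir      = (v >> 29) & 1u;
    f.irq_enable     = (v >> 30) & 1u;
    f.start          = (v >> 31) & 1u;
    return f;
}

/* Same arithmetic as the tail of the decompiled function (v18/v20). */
uint32_t p9_rom_read_cnt(uint32_t old_cnt)
{
    uint32_t v = (old_cnt & 0x10000000u) - 0x40000000u; /* wraps modulo 2^32 */
    v |= old_cnt & 0x08000000u;
    v += 0x01008000u;
    return v | 0x0004822Cu;
}

/* Usage: with the reset pin already released, P9 ends up with D104822Ch. */
void ctrcard_cnt_example(void)
{
    struct ctrcard_cnt_fields f = ctrcard_cnt_decode(p9_rom_read_cnt(0x10000000u));
    assert(f.clks_per_byte == 5 && f.block_bytes == 512);
}
```

If the table is right, 0xD104822C decodes to setting 1 in bits 24-26 (5 clocks per byte, about 13.4MHz), 512-byte blocks, DMA and CRC enabled, and timeout selector 12.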

You are not turning on the data cache. It makes a huge difference if I recall correctly. And this is directly from the ARM TRM:
2.3.5. Register 1, Control Register
Bit 12, Instruction cache enable
...
Controls the behavior of the instruction cache. To use the instruction cache, both the protection unit enable bit (bit 0) and the instruction cache enable bit must be set. This can be done with a single write to register 1.
It says the same about the data cache bit. You should read the documentation carefully.
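As a rough C sketch of what the TRM text implies: the PU enable bit and both cache enable bits go into CP15 register 1 in a single write. The helper names here are made up, and the protection regions (base/size, cacheable bits, access permissions) still have to be set up before this write; that part is left out.

```c
#include <assert.h>
#include <stdint.h>

/* CP15 register 1 (control register) bits, per the TRM section quoted above. */
#define CP15_CR_PU_ENABLE     (1u << 0)   /* protection unit enable */
#define CP15_CR_DCACHE_ENABLE (1u << 2)   /* data cache enable */
#define CP15_CR_ICACHE_ENABLE (1u << 12)  /* instruction cache enable */

/* Returns the control value with PU, data cache and instruction cache on. */
uint32_t cr_with_caches(uint32_t cr)
{
    return cr | CP15_CR_PU_ENABLE | CP15_CR_DCACHE_ENABLE | CP15_CR_ICACHE_ENABLE;
}

/* The instruction cache only takes effect when the PU bit is set as well. */
int icache_effective(uint32_t cr)
{
    return (cr & CP15_CR_PU_ENABLE) != 0 && (cr & CP15_CR_ICACHE_ENABLE) != 0;
}

#if defined(__arm__) && !defined(__thumb__)
/* Only built for ARM targets: read-modify-write of the control register.
   Assumes the PU regions were configured beforehand. */
static inline void arm9_enable_pu_and_caches(void)
{
    uint32_t cr;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(cr));
    cr = cr_with_caches(cr);
    __asm__ volatile("mcr p15, 0, %0, c1, c0, 0" : : "r"(cr) : "memory");
}
#endif

/* Usage: the cache enable bit alone is not enough. */
void cr_example(void)
{
    assert(!icache_effective(CP15_CR_ICACHE_ENABLE));
    assert(icache_effective(cr_with_caches(0)));
}
```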

nocash wrote:
Fri Mar 13, 2020 2:31 pm
profi200 wrote:
Thu Mar 12, 2020 9:53 am
Good to finally know what these "CARDCYCLES" regs do. I have just copied the values from P9 for them. I would recommend you not change these delays because we had lots of troubles with genuine cards and flashcards.
No risk, no fun ; )
In this case, the delays seem to occur before/after powering the card VCC pin, so the delays could hardly affect the card operation. Or only indirectly:
PowerOff delay might be needed for write-completion, IF there is any such feature, and IF anything was written.
A (very) short Insert delay could trigger switch bounce on the insert/eject signal.
If the user does insert the card slowly, then the insert flag may trigger before the card is fully inserted. But that does also happen with the official delay setting. The best workaround would be to do some Reset+Retries if the card replies with different chip IDs or other corrupt data during card initialization.

I have also had a short look at the encryption. Sending a nonsense command (which should return FFh's) and setting the seed to all 00h's or all FFh's did still return heavily encrypted data (instead of plain FFh's). Changing only a single seed bit does seem to affect most or all data bytes. So it's apparently really something like AES, the key-init-busy-flag does also indicate that there is some AES-style key schedule going on.
One small detail is that CTRCARD_SECCNT bit0-1 set to 2 does cause the hardware to ignore the written seed value. It does completely ignore the whole seed, including 1st and 4th word (ie. those bits don't seem to be for AES key/iv size selection).
What makes you think they are before VCC goes high? And certain delays are needed. I would compare this with SD cards. The spec tells you exactly the time until the voltage needs to be stable and the max ramp-up time as well. It also tells you to send at least 74 clock pulses after the voltage has stabilized and before you send any cmd. These ROM chips probably have requirements too, and they will behave unpredictably if these are not met.

As for protocol encryption:
The current guess is it's AES CTR with some unknown blackbox key scrambler. And the seed stuff is only one part of the scrambler.
