All times are UTC - 7 hours





 Post subject: Finding JSRs in PRG ROMs
PostPosted: Fri Aug 17, 2018 12:34 pm 

Joined: Sat May 04, 2013 6:44 am
Posts: 33
I'm trying to do some static analysis of FDS titles, identifying and counting which FDS BIOS APIs they call. Following the classic strategy of "do the simplest thing that could possibly work," my first attempt just scanned the ROM for byte sequences of $20 $xx [$Ex|$Fx], in other words, "JSR $Exxx" or "JSR $Fxxx."

That resulted in lots of false positives, as you might imagine. I have reduced the false positives by

1. Only scanning the PRG files in the disks (block type 4, file type 0)
2. Excluding results that are immediately followed by an illegal opcode, on the unproven assumption that no FDS title would use "unofficial" opcodes,
3. Rejecting matches that point before the first "public" API that starts at $E149, because that's known to be character data.
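A minimal sketch of the byte-pattern scan with filter 3 applied, for illustration only. The function name `find_bios_jsr_candidates` and the `load_addr` parameter are hypothetical; the `$E149` cutoff is the one given above. The scan knows nothing about instruction alignment, so it still reports operand or data bytes that happen to look like a JSR.

```python
FIRST_PUBLIC_API = 0xE149  # per the post: everything below this is character data


def find_bios_jsr_candidates(prg: bytes, load_addr: int):
    """Yield (cpu_address, target) for each $20 lo hi pattern aimed at the BIOS.

    This is a raw pattern match, not a disassembly: a $20 byte that is really
    an operand or data will be reported as well.
    """
    for i in range(len(prg) - 2):
        if prg[i] != 0x20:  # $20 is the JSR absolute opcode
            continue
        target = prg[i + 1] | (prg[i + 2] << 8)  # operand is little-endian
        if target >= FIRST_PUBLIC_API:
            yield load_addr + i, target


if __name__ == "__main__":
    sample = bytes([0xA9, 0x00, 0x20, 0xF8, 0xE1, 0x20, 0x00, 0xE0])
    for addr, target in find_bios_jsr_candidates(sample, 0x6000):
        print(f"${addr:04X}: JSR ${target:04X}")
```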

I have considered rejecting matches that fall between entrypoints in the BIOS, or in other words, JSRs into the middle of BIOS functions. That said, I'm not convinced that all FDS titles are this well-behaved.

I still get a ton of false positives, likely because of instruction alignment. What other strategies might I use to reliably identify JSRs into the FDS BIOS?

I imagine that the "right" way to do this is to disassemble the PRG files starting from some known good address, but that's harder than it sounds. I can place each file in RAM correctly using the load address field in the preceding block 3.
- If I disassemble from the beginning of the loaded file, that assumes that byte 0 of each file is program code, which it might not be.
- If there are gaps between functions, they may contain garbage data that looks like code and throws off subsequent disassembly.
- If I try to do some kind of static execution tree analysis, following JSRs and branches, I would be stymied by jumps from one file to another, as I won't know which files are intended to be loaded simultaneously.
- I could reject apparent jumps that land between instructions of the FDS BIOS, which I could get by looking at the FDS BIOS disassembly. That doesn't solve the false positives that land on opcodes, however.
- Finally, I'm pretty much SOL if there's any self-modifying code that writes or modifies JSR instructions. I don't know if any FDS software does that.
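For reference, the "static execution tree" idea can be sketched as a worklist trace within one file. This is only an illustration under several assumptions: `trace_bios_calls` is a hypothetical name, the opcode table is deliberately partial (a real tool needs every official opcode), the entry address must be supplied by hand, and every JSR is assumed to return to the next instruction, which is not true for calls that take inline arguments.

```python
# Deliberately partial length table, for illustration only.
LENGTHS = {0x20: 3, 0x4C: 3, 0x6C: 3, 0x60: 1, 0x40: 1, 0xEA: 1,
           0xA9: 2, 0xA2: 2, 0xA0: 2, 0x8D: 3, 0xAD: 3}
BRANCHES = {0x10, 0x30, 0x50, 0x70, 0x90, 0xB0, 0xD0, 0xF0}  # all relative branches


def trace_bios_calls(prg: bytes, load_addr: int, entry: int):
    """Follow control flow from `entry`; return sorted (address, target) BIOS JSRs."""
    end = load_addr + len(prg)
    seen, work, calls = set(), [entry], []
    while work:
        pc = work.pop()
        while load_addr <= pc < end and pc not in seen:
            off = pc - load_addr
            op = prg[off]
            size = 2 if op in BRANCHES else LENGTHS.get(op)
            if size is None or pc + size > end:
                break  # unknown opcode or truncated instruction: give up on this path
            seen.add(pc)
            operand = prg[off + 1:off + size]
            if op in BRANCHES:
                rel = operand[0] - 256 if operand[0] >= 0x80 else operand[0]
                work.append(pc + 2 + rel)  # taken path; fall-through continues below
            elif op in (0x20, 0x4C):  # JSR abs / JMP abs
                target = operand[0] | (operand[1] << 8)
                if load_addr <= target < end:
                    work.append(target)  # target inside this file: trace it too
                elif op == 0x20 and target >= 0xE000:
                    calls.append((pc, target))  # call into the BIOS address range
                if op == 0x4C:
                    break  # JMP never falls through
            elif op in (0x60, 0x40, 0x6C):
                break  # RTS, RTI, JMP (indirect): path ends here
            pc += size
    return sorted(calls)
```

Bytes that are never reached from the entry point are never decoded, which is what removes the alignment false positives. Jumps into other files are silently dropped, so the result is a lower bound.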

FWIW, yes I know the public FDS documentation on the Wiki says that there are no public APIs in the $Fxxx range, but I've already identified some errors in the Wiki, so I need to prove this for myself.


PostPosted: Fri Aug 17, 2018 12:49 pm 

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6953
Location: Canada
cc65's disassembler, da65, accepts an accompanying "info file" in which you can describe the various regions: which are code and which are data, where to start disassembling, where each region should appear in the CPU address space, and so on.
http://cc65.github.io/doc/da65.html

Create the info file, and disassemble. Anything that looks wrong, write a new entry in the info file to mark it as data or code, or whatever needs to be done to make it right. Disassemble again... repeat until the whole disassembly looks good. (Since the info file can tell the disassembler where the data belongs in CPU space, you can disassemble the FDS file directly once you put in a few lines to tell it where the blocks in the file are loaded to.)
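As a rough sketch of what "a few lines" could look like, the helper below generates an info file for one file inside a disk image. The helper name `make_da65_info` and its parameters are hypothetical, and the attribute names (`INPUTOFFS`, `INPUTSIZE`, `STARTADDR`, `RANGE`, `ByteTable`) are written from memory of the da65 documentation linked above, so check them against that page before relying on them.

```python
def make_da65_info(input_name, file_offset, size, load_addr, data_ranges=()):
    """Build da65 info-file text for one file inside a disk image.

    file_offset: byte offset of the file's data within the image
    size:        length of the file's data in bytes
    load_addr:   CPU address the file is loaded to (from its file header)
    data_ranges: iterable of (start, end, name) CPU ranges to mark as data
    """
    lines = [
        "GLOBAL {",
        f'    INPUTNAME "{input_name}";',
        f"    INPUTOFFS {file_offset};",
        f"    INPUTSIZE {size};",
        f"    STARTADDR ${load_addr:04X};",
        '    CPU "6502";',
        "};",
    ]
    for start, end, name in data_ranges:
        lines.append(
            f'RANGE {{ START ${start:04X}; END ${end:04X}; '
            f'TYPE ByteTable; NAME "{name}"; }};'
        )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All values here are made up for the example.
    print(make_da65_info("game.fds", 0x4A, 0x100, 0x6000, [(0x6080, 0x60FF, "tiles")]))
```

Each time the disassembly shows a region that is obviously data, add another tuple to `data_ranges` and regenerate.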


Code/data logs are the usual place to start when making a disassembly, but I'm not sure whether any emulator does code/data logging for FDS. Both FCEUX and Mesen have code/data log options you can turn on; you then play the game for a while to log which parts of a ROM are code or data. These tend to give a very good starting point for disassembly, but they always miss things that you have to resolve by hand. (If you're lucky, one of these already supports FDS. I haven't tried, but I'm guessing they don't.)

There also exist interactive 6502 disassemblers, which follow code through branches for you, etc. but they tend to be expensive commercial products.


PostPosted: Fri Aug 17, 2018 12:53 pm 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3682
Location: Mountain View, CA
LightStruk wrote:
I still get a ton of false positives, likely because of instruction alignment. What other strategies might I use to reliably identify JSRs into the FDS BIOS?

Use of an emulator with a good debugger or "logger", thus seeing real-time instructions executed, is really the only way you're going to address "edge cases" where disassembly reverse-engineering falls short.

Speaking purely for myself, combining the two tools (a disassembler and an emulator's debugger/logger) tends to give the best results, and in less time than using either one alone.

IDA Pro (commercial -- and expensive) might help you with general analysis, as it's pretty good with 6502, but it cannot handle things at run-time because it's a "deep" disassembler/analyser and not an emulator.

LightStruk wrote:
FWIW, yes I know the public FDS documentation on the Wiki says that there are no public APIs in the $Fxxx range, but I've already identified some errors in the Wiki, so I need to prove this for myself.

No surprise, because the FDS is one of the most neglected peripherals/systems there is, documentation-wise. It's always been this way. The fact that it's Famicom-centric is partially responsible; the other factor is that it's "floppy-based", which made reverse-engineering during the 90s very painful (I only know of *one single person* during the height of emulation who focused on trying to understand and document FDS bits... and it was a guy in Taiwan). In other words: any FDS documentation errors you find *anywhere* do not surprise me, and your efforts to fix them are highly appreciated!


PostPosted: Fri Aug 17, 2018 1:03 pm 

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6953
Location: Canada
LightStruk wrote:
I know the public FDS documentation on the Wiki says that there are no public APIs in the $Fxxx range, but I've already identified some errors in the Wiki, so I need to prove this for myself.

If you find errors in the wiki please point them out and we'll fix them.

Also, you've probably seen it, but just to make sure, there's an old FDS doc by Brad Taylor that has an extensive disassembly of the BIOS with a lot of commentary:
http://nesdev.com/FDS%20technical%20reference.txt


PostPosted: Fri Aug 17, 2018 1:13 pm 

Joined: Sun Mar 19, 2006 3:06 am
Posts: 585
Location: Gothenburg/Sweden
Perhaps my old FDS-Explorer might come in handy? :) It has a built-in disassembler (perhaps not state of the art but still..)
https://www.romhacking.net/utilities/662/
or
https://nes.goondocks.se/software.php

_________________
http://nes.goondocks.se/


PostPosted: Fri Aug 17, 2018 1:29 pm 

Joined: Sat May 04, 2013 6:44 am
Posts: 33
Thanks everyone for replying so quickly!
rainwarrior wrote:
If you find errors in the wiki please point them out and we'll fix them.
Of course. I plan to submit fixes once I have completed my investigation.
oRBIT2002 wrote:
Perhaps my old FDS-Explorer might come in handy?
Thank you for calling it to my attention. I will definitely check it out.


PostPosted: Fri Aug 17, 2018 3:30 pm 

Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7572
Location: Chexbres, VD, Switzerland
Quote:
2. Excluding results that are immediately followed by an illegal opcode, on the unproven assumption that no FDS title would use "unofficial" opcodes,

FDS BIOS calls commonly use a special technique where the arguments to a subroutine are placed in the code stream immediately after the "jsr" instruction (and the subroutine increases the return address accordingly). So no, this is not a good idea: you're going to exclude actual JSR opcodes if you do that.
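To illustrate the consequence for a disassembler: after such a call, decoding has to resume past the inline arguments rather than at the byte right after the JSR. The sketch below uses a hypothetical table, `INLINE_ARG_BYTES`; the single entry in it (`$E1F8` followed by two 16-bit pointers) is an assumption for the example and should be checked against a BIOS disassembly.

```python
# Hypothetical table: BIOS entry point -> number of inline argument bytes
# that follow the JSR. The entry below is an assumption, not a verified fact.
INLINE_ARG_BYTES = {0xE1F8: 4}


def resume_address(jsr_addr: int, target: int, table=INLINE_ARG_BYTES) -> int:
    """Return the address where code continues after the JSR at `jsr_addr`.

    A JSR is 3 bytes long; routines listed in `table` also skip their inline
    arguments by adjusting the return address on the stack.
    """
    return jsr_addr + 3 + table.get(target, 0)


if __name__ == "__main__":
    print(f"${resume_address(0x6004, 0xE1F8):04X}")  # skips 4 argument bytes
    print(f"${resume_address(0x6004, 0xE149):04X}")  # ordinary call
```

The bytes between the JSR and the resume address are pointer data, so testing them for "illegal opcodes" says nothing about whether the JSR is real.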

To be honest, using FCEU's debugger sounds like a good idea in your case, at least to start with.

The FDS wiki page was written mostly by me, based on existing documentation such as Brad Taylor's, and is definitely not complete. I hope you'll be able to help complete it (that's the whole point of a wiki).


PostPosted: Fri Aug 17, 2018 5:27 pm 

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 4105
Consider matching only the well-established entry points.
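A minimal sketch of that idea: count only JSRs whose target is an exact match in a list of entry points. The name `count_known_calls` is hypothetical and the table is a placeholder. `$E149` comes from earlier in this thread; the `$E1F8` entry is an assumption added for the example. Fill the table from a BIOS disassembly you trust.

```python
from collections import Counter

# Placeholder table of entry points; replace with a verified list.
KNOWN_ENTRY_POINTS = {
    0xE149: "first public API (address from this thread)",
    0xE1F8: "assumed entry point, for illustration",
}


def count_known_calls(prg: bytes, known=KNOWN_ENTRY_POINTS) -> Counter:
    """Count $20 lo hi patterns whose target is exactly a known entry point."""
    counts = Counter()
    for i in range(len(prg) - 2):
        if prg[i] == 0x20:  # JSR absolute
            target = prg[i + 1] | (prg[i + 2] << 8)
            if target in known:
                counts[target] += 1
    return counts


if __name__ == "__main__":
    sample = bytes([0x20, 0xF8, 0xE1, 0x20, 0x00, 0xE2, 0x20, 0x49, 0xE1])
    for target, n in sorted(count_known_calls(sample).items()):
        print(f"JSR ${target:04X}: {n}")
```

An exact 16-bit match is far less likely to occur by accident than "any address in $Exxx or $Fxxx", but it cannot discover calls to entry points that are missing from the table.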

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


PostPosted: Sat Aug 18, 2018 4:04 am 

Joined: Tue Feb 07, 2017 2:03 am
Posts: 624
I've recently uploaded this: https://github.com/oziphantom/CodeTree. It's C64-centric at the moment, but it shouldn't be too hard to add a "load from XXX" function to it. It makes a graph, which lets you quickly see anything that calls a region and then trace it back to the calling code. It also does some analysis.

