DISASM6 v1.5 - Nes oriented disassembler producing asm6 code

Posted: Wed Feb 09, 2011 10:26 am
by frantik
DISASM6 is a multi-pass NES-oriented disassembler which produces ASM6 code.


Features:
* produces instantly re-assemblable code (without any human modification)
* iNES header support
* Can export CHR-ROM
* can use optional NES registers
* can use custom defined labels
* can use FCEUDX code/data logs


The output can be reassembled using ASM6. I've tested it with a handful of ROMs, and so far every mapper 0 ROM has assembled into a 1:1 copy of the original. For 16k games that have 2 copies in the .nes file, it will tweak the iNES header from 2 PRG banks to 1 unless you disable 16k checking.
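
As a rough illustration of the 16k check described above (a sketch only, not DISASM6's actual PHP code), the idea can be written in Python as follows. The function name `collapse_mirrored_prg` is made up for this example; the header layout assumed is the standard iNES one (PRG bank count in byte 4, trainer flag in bit 2 of byte 6).

```python
INES_MAGIC = b"NES\x1a"
HEADER_SIZE = 16
TRAINER_SIZE = 512
PRG_BANK_SIZE = 16 * 1024


def collapse_mirrored_prg(rom: bytes) -> bytes:
    """Return the ROM with a duplicated 16K PRG bank dropped, if there is one.

    Hypothetical helper: if the header claims 2 PRG banks and both banks are
    byte-for-byte identical, rewrite the header to 1 bank and keep one copy.
    """
    if rom[:4] != INES_MAGIC:
        raise ValueError("not an iNES file")
    if rom[4] != 2:
        return rom  # only the 2-bank case is of interest here

    # An optional 512-byte trainer sits between the header and PRG data.
    prg_start = HEADER_SIZE + (TRAINER_SIZE if rom[6] & 0x04 else 0)
    first = rom[prg_start:prg_start + PRG_BANK_SIZE]
    second = rom[prg_start + PRG_BANK_SIZE:prg_start + 2 * PRG_BANK_SIZE]
    if first != second:
        return rom  # a genuine 32K game, leave it alone

    header = bytearray(rom[:HEADER_SIZE])
    header[4] = 1  # 2 PRG banks -> 1
    rest = rom[prg_start + 2 * PRG_BANK_SIZE:]  # CHR data and anything after
    return bytes(header) + rom[HEADER_SIZE:prg_start] + first + rest
```

A ROM whose two banks differ is returned unchanged, which corresponds to the case where nothing needs tweaking.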

Download Dasm6 v1.5

Please let me know about any bugs and any suggestions you have.

Some future ideas:
* may add support for mappers

Windows EXE and PHP source included

example output

Re: DASM6 - Nes oriented disassembler producing asm6 code

Posted: Wed Feb 09, 2011 12:37 pm
by koitsu
frantik wrote:... it's a 2 pass 6502 disassembler written in PHP
Image

Posted: Wed Feb 09, 2011 1:49 pm
by clueless
While a disassembler is cool, I too am bemused by the choice of PHP. I suppose if that is all that you know, then fine. I am reminded of a quote that I'm totally going to screw up: "If you only have a hammer, then every problem is a nail" (or something).

And I say the above half jokingly. You've produced working code, and decided to share it with everyone. That is far more than most people have done. I congratulate you on your accomplishment, even if I snicker at PHP itself. Just don't port it to COBOL, FORTRAN or JCL, ok? ;)

Side-note: The core engine for "cacti" is written in PHP (web page + batch processing scripts).

What would be really awesome is if you could identify, via static code analysis, code from data. I can think of several heuristics for this.

I once worked on a static code analyzer for intelligently disassembling NES roms. It would start with each of the three interrupt vectors and trace all possible code execution paths. I then realized that I had bitten off more than I wanted to when I needed to handle constructs like this:

Code:

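tay                     ; Y = A, used as the table index
lda some_tbl_msb, y     ; high byte of the target address
pha
lda some_tbl_lsb, y     ; low byte of the target address
pha
php                     ; status byte that RTI will pull first
rti                     ; pulls P, then PC: jumps to the table entry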
But if you could nail stuff like that, it would be TOTALLY AWESOME.
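
To make the tracing idea concrete, here is a minimal Python sketch, assuming an NROM image (16K or 32K of PRG mapped at $8000-$FFFF) and a deliberately tiny opcode table. The names `OPS`, `read_vectors` and `trace` are invented for this example; a real tracer would need the full instruction set. Note how the path simply ends at RTI: the address pushed from the table is never followed, which is exactly the hard case shown above.

```python
PRG_BASE = 0x8000

# Toy subset only: opcode -> (instruction length, control-flow kind).
OPS = {
    0x4C: (3, "jump"),    # JMP abs
    0x20: (3, "call"),    # JSR abs
    0x60: (1, "stop"),    # RTS
    0x40: (1, "stop"),    # RTI
    0x6C: (3, "stop"),    # JMP (ind): target unknown statically
    0xD0: (2, "branch"),  # BNE rel
    0xF0: (2, "branch"),  # BEQ rel
    0xA9: (2, "next"),    # LDA #imm
    0xB9: (3, "next"),    # LDA abs,Y
    0x8D: (3, "next"),    # STA abs
    0xA8: (1, "next"),    # TAY
    0x48: (1, "next"),    # PHA
    0x08: (1, "next"),    # PHP
    0xEA: (1, "next"),    # NOP
}


def prg_offset(addr, prg_len):
    """Map a CPU address to a PRG offset (16K images mirror at $C000)."""
    return (addr - PRG_BASE) % prg_len


def read_word(prg, addr):
    off = prg_offset(addr, len(prg))
    return prg[off] | (prg[off + 1] << 8)


def read_vectors(prg):
    """The three 6502 interrupt vectors at the top of the address space."""
    return {
        "nmi": read_word(prg, 0xFFFA),
        "reset": read_word(prg, 0xFFFC),
        "irq": read_word(prg, 0xFFFE),
    }


def trace(prg, entry_points):
    """Return the set of CPU addresses reached as instruction starts."""
    code = set()
    work = list(entry_points)
    while work:
        pc = work.pop()
        while PRG_BASE <= pc <= 0xFFFF and pc not in code:
            off = prg_offset(pc, len(prg))
            entry = OPS.get(prg[off])
            if entry is None or off + entry[0] > len(prg):
                break  # opcode outside the toy table: give up on this path
            length, kind = entry
            code.add(pc)
            if kind == "stop":
                break
            if kind == "branch":
                rel = prg[off + 1]
                rel = rel - 256 if rel >= 128 else rel  # signed 8-bit offset
                work.append(pc + 2 + rel)  # taken path; fall through below
            elif kind in ("jump", "call"):
                target = prg[off + 1] | (prg[off + 2] << 8)
                if kind == "jump":
                    pc = target
                    continue
                work.append(target)  # subroutine body; fall through below
            pc += length
    return code
```

Usage would look like `trace(prg, read_vectors(prg).values())`, where `prg` is the PRG data with the iNES header stripped.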

Posted: Wed Feb 09, 2011 3:14 pm
by frantik
wow, who would expect people who program for an ancient and obsolete processor would make fun of someone else for their choice of programming languages. ... snip ...

Posted: Wed Feb 09, 2011 3:16 pm
by 3gengames
frantik wrote:wow, who would expect people who program for an ancient and obsolete processor would make fun of someone else for their choice of programming languages. fucking idiots

Lol, we might program for an ancient processor, but can YOU handle the challenge? NES compiler version, please!

Posted: Wed Feb 09, 2011 3:24 pm
by frantik
3gengames wrote:Lol, we might program for an ancient processor, but can YOU handle the challenge? NES compiler version, please!
yeah, i'll leave that up to you (though i do also write 6502 asm)

Posted: Wed Feb 09, 2011 3:42 pm
by 3gengames
frantik wrote:
3gengames wrote:Lol, we might program for an ancient processor, but can YOU handle the challenge? NES compiler version, please!
yeah, i'll leave that up to you (though i do also write 6502 asm)
As if PHP were more powerful than 6502? Hahaha!

Posted: Wed Feb 09, 2011 3:44 pm
by frantik
anyways, on topic
clueless wrote:What would be really awesome is if you could identify, via static code analysis, code from data. I can think of several heuristics for this.
yes that would be cool but i think beyond the scope of this disassembler

besides, i don't think it would be very effective at finding small bits of data in big blocks of program code. finding big blocks of data isn't hard because there are lots of invalid opcodes
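
One crude way to put a number on the "lots of invalid opcodes" observation is sketched below in Python. It is an assumption-heavy illustration, not how DISASM6 works: it treats every byte as if it were an opcode (ignoring operands) and reports what fraction falls outside the 151 documented 6502 opcodes. The names `unofficial_density` and `window_densities`, and the 256-byte window, are made up for this example.

```python
# The 151 documented 6502 opcodes, one row per high nibble.
OFFICIAL_OPCODES = frozenset(bytes.fromhex(
    "00 01 05 06 08 09 0A 0D 0E "
    "10 11 15 16 18 19 1D 1E "
    "20 21 24 25 26 28 29 2A 2C 2D 2E "
    "30 31 35 36 38 39 3D 3E "
    "40 41 45 46 48 49 4A 4C 4D 4E "
    "50 51 55 56 58 59 5D 5E "
    "60 61 65 66 68 69 6A 6C 6D 6E "
    "70 71 75 76 78 79 7D 7E "
    "81 84 85 86 88 8A 8C 8D 8E "
    "90 91 94 95 96 98 99 9A 9D "
    "A0 A1 A2 A4 A5 A6 A8 A9 AA AC AD AE "
    "B0 B1 B4 B5 B6 B8 B9 BA BC BD BE "
    "C0 C1 C4 C5 C6 C8 C9 CA CC CD CE "
    "D0 D1 D5 D6 D8 D9 DD DE "
    "E0 E1 E4 E5 E6 E8 E9 EA EC ED EE "
    "F0 F1 F5 F6 F8 F9 FD FE"
))


def unofficial_density(window: bytes) -> float:
    """Fraction of bytes in the window that are not documented opcodes."""
    if not window:
        return 0.0
    bad = sum(1 for b in window if b not in OFFICIAL_OPCODES)
    return bad / len(window)


def window_densities(prg: bytes, window: int = 256):
    """Return (offset, density) for each consecutive window of the PRG data."""
    return [
        (off, unofficial_density(prg[off:off + window]))
        for off in range(0, len(prg), window)
    ]
```

Windows with a high density are candidates for being data. As the post says, this kind of measure says little about a few data bytes sitting inside a large block of code.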

it might not hurt to make 2 kinds of labels though.. ones used by LDA/X/Y and ones used by JMP/JSR. then it would be even more obvious where data is and where program code is
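
A minimal Python sketch of the two-kinds-of-labels idea, assuming only absolute-addressed forms of the instructions mentioned are considered. The `code_`/`data_` prefixes and the function name `label_for` are hypothetical and are not DISASM6's actual naming.

```python
# Absolute-addressed opcodes whose operand is a jump or call target.
CODE_REFS = {
    0x4C,  # JMP abs
    0x20,  # JSR abs
}

# Absolute-addressed loads whose operand points at data.
DATA_REFS = {
    0xAD, 0xBD, 0xB9,  # LDA abs / abs,X / abs,Y
    0xAE, 0xBE,        # LDX abs / abs,Y
    0xAC, 0xBC,        # LDY abs / abs,X
}


def label_for(opcode: int, target: int):
    """Pick a label name based on how the address is referenced.

    Returns None when the opcode is not one of the forms listed above.
    """
    if opcode in CODE_REFS:
        return f"code_{target:04X}"
    if opcode in DATA_REFS:
        return f"data_{target:04X}"
    return None
```

An address referenced both ways (say, loaded from and jumped to) would need a tie-break rule, which this sketch leaves open.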

the main thing i want to do right now is make sure all labels which are referenced actually exist in the program code so that the output can immediately be reassembled without any modification

Posted: Wed Feb 09, 2011 3:57 pm
by clueless
I take it that one of your goals is that a re-assembled NES file should be identical to the one disassembled? If so, you could stress test this by writing a small program to generate totally random NES ROMs (with a valid iNES header). Then automate the disassembly and reassembly of the image, and compare the MD5 checksums. If a test fails, set the original ROM aside for later analysis.

Would be even cooler if your randomly generated ROMs used a byte distribution that is typical of most NES games.
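
The stress test could be sketched in Python along these lines. Everything here is an assumption for illustration: `make_random_rom` and `find_failures` are invented names, and `round_trip` is a caller-supplied function that takes ROM bytes, runs the disassembler and assembler however you invoke them, and returns the reassembled bytes. The optional `weights` argument is one possible way to bias the byte distribution toward something more game-like.

```python
import hashlib
import random


def make_random_rom(seed, prg_banks=1, chr_banks=1, weights=None):
    """Build a mapper 0 iNES image filled with (optionally weighted) random bytes."""
    rng = random.Random(seed)  # seeded so a failing ROM can be regenerated
    header = b"NES\x1a" + bytes([prg_banks, chr_banks, 0, 0]) + bytes(8)
    size = prg_banks * 16384 + chr_banks * 8192
    if weights is None:
        body = bytes(rng.getrandbits(8) for _ in range(size))
    else:
        # weights: 256 relative frequencies, one per byte value
        body = bytes(rng.choices(range(256), weights=weights, k=size))
    return header + body


def md5(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def find_failures(seeds, round_trip):
    """Return the seeds whose ROM does not survive disassembly + reassembly."""
    failures = []
    for seed in seeds:
        rom = make_random_rom(seed)
        if md5(round_trip(rom)) != md5(rom):
            failures.append(seed)  # set aside for later analysis
    return failures
```

Because the generator is seeded, only the failing seed needs to be kept in order to reproduce the ROM later.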

Do you have a fairly large corpus of ROM images to test with already?

And the FORTRAN thing.. I'm sorry if I touched a nerve. I was just teasing (just a little). I've had far worse. Sivak told me that my game sucked. Period. :( The guys at work make fun of me for using C, C++ and Perl. Yet my code can run circles around their fancy java frameworks. Doesn't stop them from being dicks about it though. I'm sorry if I was a dick to you. I didn't intend to be.

Posted: Wed Feb 09, 2011 4:58 pm
by frantik
clueless wrote:I'm sorry if I was a dick to you. I didn't intend to be.
it's all good.. i had literally just woken up lol so i was a lil bitchy.. it was more the combination of the facepalm and your post.. but it's all good :)


and yeah the object is to create an assembly file that generates the same output as the original file. i haven't gotten to that point yet though.. but asm6 doesn't generate any errors besides missing labels so that's good

Posted: Wed Feb 09, 2011 5:41 pm
by cartlemmy
I'm sometimes astounded by the negativity that is thrown about on these forums.

But anyhow your disassembler looks cool. I program in PHP quite often, and though it is a bit of a retarded beast, some cool stuff can be done with it. If you're a good programmer, that's all that matters.

Posted: Wed Feb 09, 2011 6:21 pm
by thefox
clueless wrote:What would be really awesome is if you could identify, via static code analysis, code from data. I can think of several heuristics for this.
Also maybe think about emulator assisted disassembly. FCEUX can make code/data logs already.

E: Wohoo 400th post...

E2: Also forgot to say: nice work OP.

Posted: Wed Feb 09, 2011 6:45 pm
by clueless
thefox wrote:
clueless wrote:What would be really awesome is if you could identify, via static code analysis, code from data. I can think of several heuristics for this.
Also maybe think about emulator assisted disassembly. FCEUX can make code/data logs already.
Yup, and that was instrumental in my reverse engineering of Crystalis back in 2001 or so. However, a complete disassembly requires playing the game such that you access every memory location, or create a TAS that does the same.
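
To illustrate the coverage problem, here is a small Python sketch that reads code/data log bytes and reports how much of the PRG was ever touched. It assumes the commonly described FCEUX .cdl layout, with one log byte per PRG byte, bit 0 set for bytes executed as code and bit 1 set for bytes read as data; treat that layout as an assumption and check it against the emulator's documentation. The function names are invented for this example.

```python
CODE_FLAG = 0x01  # assumed: byte was executed as code
DATA_FLAG = 0x02  # assumed: byte was read as data


def classify(flag: int) -> str:
    if flag & CODE_FLAG:
        return "code"
    if flag & DATA_FLAG:
        return "data"
    return "unknown"  # never touched during the logged play session


def cdl_runs(cdl: bytes, prg_size: int):
    """Group the PRG part of the log into (start, end, kind) runs, end exclusive."""
    runs = []
    for offset, flag in enumerate(cdl[:prg_size]):
        kind = classify(flag)
        if runs and runs[-1][2] == kind:
            runs[-1] = (runs[-1][0], offset + 1, kind)  # extend current run
        else:
            runs.append((offset, offset + 1, kind))
    return runs


def coverage(cdl: bytes, prg_size: int) -> float:
    """Fraction of PRG bytes that were logged as code or data."""
    seen = sum(1 for f in cdl[:prg_size] if f & (CODE_FLAG | DATA_FLAG))
    return seen / prg_size
```

Any "unknown" run is a region the play session never reached, which is why a complete disassembly needs play (or a TAS) that touches everything.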

Posted: Wed Feb 09, 2011 7:25 pm
by frantik
thefox wrote:
clueless wrote:What would be really awesome is if you could identify, via static code analysis, code from data. I can think of several heuristics for this.
Also maybe think about emulator assisted disassembly. FCEUX can make code/data logs already.
hmm good point.. i will see if i can incorporate those somehow :)

Posted: Wed Feb 09, 2011 8:50 pm
by cpow
frantik wrote:
thefox wrote:
clueless wrote:What would be really awesome is if you could identify, via static code analysis, code from data. I can think of several heuristics for this.
Also maybe think about emulator assisted disassembly. FCEUX can make code/data logs already.
hmm good point.. i will see if i can incorporate those somehow :)
Yes...but...what about bankswitching? JSR to $c039 might mean "jump into this pile of data" if the disassembler doesn't know that the code it just disassembled prior to this caused the MMC1 to swap the high bank.
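
The ambiguity can be shown with a few lines of Python. This sketch assumes 16K bank granularity and that any bank could be mapped at the address in question; `candidate_offsets` is a made-up name. Without knowing the mapper state, a CPU address like $C039 corresponds to one possible PRG offset per bank.

```python
BANK_SIZE = 0x4000  # assumed 16K bank granularity


def candidate_offsets(cpu_addr: int, prg_size: int):
    """All PRG offsets a CPU address could refer to, one per 16K bank."""
    if not 0x8000 <= cpu_addr <= 0xFFFF:
        raise ValueError("address is outside the PRG window")
    within = cpu_addr & (BANK_SIZE - 1)  # position inside the bank
    return [bank * BANK_SIZE + within for bank in range(prg_size // BANK_SIZE)]
```

For a 128K PRG this yields eight candidates for $C039, and a static disassembler has no way to pick between them unless it also tracks the bank register writes.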

Now...disassembly aided by a code/data log file generated by a run of the ROM in an emulator like FCEUX or NESICIDE...that is not static, but accurate. In fact, isn't that what people most often use to rip NSFs?