Recompiling assembly language to C

tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Recompiling assembly language to C

Post by tepples »

There's a principle in software engineering: Don't Repeat Yourself. By this, Hunt and Thomas mean "Every piece of knowledge must have a single, unambiguous, authoritative representation within a system." But unfortunately, technical and political obstacles inherent in certain platforms don't always make this easy.

Some time ago, I wrote a proof-of-concept implementation of 6502 assembly language as ca65 macros that output .byte statements. I intended for this to lead to an assembler that emits SPC700 opcodes, so that I can share music sequence interpretation code between NES and Super NES versions of a music engine. And blargg delivered, first by making a Sony-syntax assembler as a macro pack, then by adding a layer of 65C02-to-Sony preprocessor macros on top of that.

Today I noticed that Leushenko of Programming Puzzles & Code Golf Stack Exchange implemented something along the same lines: an x86-to-C assembler as a set of C preprocessor macros. In theory, this would help with porting old PC games that use assembly language subroutines to modern ARM-based platforms. And doing something similar for 6502 would allow sharing game logic code between NES and PC versions of a game.
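As a rough illustration of the macro approach, here is a minimal sketch of a few 6502 instructions written as C preprocessor macros. The register variables, the macro names, and the addresses TABLE and RESULT are all hypothetical and are not taken from Leushenko's or blargg's work. ADC here ignores decimal mode and the overflow flag.

```c
#include <stdint.h>

/* Hypothetical CPU state; names are illustrative only. */
static uint8_t A, X;               /* 8-bit registers */
static uint8_t flag_z, flag_n, flag_c;
static uint8_t mem[0x10000];       /* flat 64 KiB address space */

enum { TABLE = 0x0300, RESULT = 0x0010 };  /* made-up addresses */

#define SET_NZ(v)   do { flag_z = ((v) == 0); \
                         flag_n = (((v) & 0x80) != 0); } while (0)
#define LDA_IMM(n)  do { A = (uint8_t)(n); SET_NZ(A); } while (0)
#define LDX_IMM(n)  do { X = (uint8_t)(n); SET_NZ(X); } while (0)
#define STA_ABS(a)  do { mem[(a)] = A; } while (0)
#define CLC()       do { flag_c = 0; } while (0)
/* Binary-mode add with carry, absolute,X addressing. */
#define ADC_ABSX(a) do { unsigned t_ = (unsigned)A \
                             + mem[(uint16_t)((a) + X)] + flag_c; \
                         flag_c = (t_ > 0xFF); \
                         A = (uint8_t)t_; SET_NZ(A); } while (0)
#define DEX()       do { X--; SET_NZ(X); } while (0)
#define BNE(label)  do { if (!flag_z) goto label; } while (0)

/* Mirrors this 6502 routine, which sums `count` bytes starting at TABLE:
 *       lda #0
 *       ldx #count
 * loop: clc
 *       adc TABLE-1,x
 *       dex
 *       bne loop
 *       sta RESULT
 */
uint8_t sum_table(uint8_t count)
{
    LDA_IMM(0);
    LDX_IMM(count);
loop:
    CLC();
    ADC_ABSX(TABLE - 1);
    DEX();
    BNE(loop);
    STA_ABS(RESULT);
    return A;
}
```

The body of sum_table reads almost line for line like the assembly source, which is the point: the same text could be fed to an assembler for one target and, through a different set of macro definitions, to a C compiler for another. As on real hardware, a count of 0 makes the loop run 256 times.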

Very polyglot. Such portability. Wow.


keywords: assembly transpiler
Dwedit
Posts: 4924
Joined: Fri Nov 19, 2004 7:35 pm

Re: Recompiling assembly language to C

Post by Dwedit »

Sounds like static recompilation all over again, but with C code instead of ASM output.
I was reading about static recompilation once, and the thing that stood out the most was the optimization of deferring flag calculation until you actually need its value.
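A minimal sketch of what deferring flag calculation can look like in C, assuming only the Z and N flags matter: each instruction just records the last result byte, and the flags are derived when a branch actually asks for them. The LazyCpu type and all function names are made up for this example.

```c
#include <stdint.h>

/* Hypothetical lazy-flag state: rather than updating Z and N after every
 * instruction, remember the byte that last affected them. */
typedef struct {
    uint8_t a, x;
    uint8_t last_result;   /* value that most recently affected N and Z */
} LazyCpu;

static inline void lda_imm(LazyCpu *c, uint8_t v) { c->a = v; c->last_result = v; }
static inline void ldx_imm(LazyCpu *c, uint8_t v) { c->x = v; c->last_result = v; }
static inline void dex(LazyCpu *c) { c->x--; c->last_result = c->x; }

/* Flags are computed only when something reads them. */
static inline int flag_z(const LazyCpu *c) { return c->last_result == 0; }
static inline int flag_n(const LazyCpu *c) { return (c->last_result & 0x80) != 0; }

/* Translation of "ldx #n / loop: dex / bne loop".
 * Returns how many times the loop body ran. */
unsigned count_down(LazyCpu *c, uint8_t n)
{
    unsigned iterations = 0;
    ldx_imm(c, n);
    do {
        dex(c);
        iterations++;
    } while (!flag_z(c));   /* BNE: the only place Z is evaluated */
    return iterations;
}
```

A real recompiler would need more than this, since instructions such as PLP can set Z and N independently of any single result byte, and carry and overflow need their own bookkeeping.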
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
cyc
Posts: 20
Joined: Tue May 26, 2009 5:39 am

Re: Recompiling assembly language to C

Post by cyc »

tepples wrote:And doing something similar for 6502 would allow sharing game logic code between NES and PC versions of a game.
This reminds me of the Microchess C Emulation:
Download the Microchess C Emulation, which includes Microchess for the KIM-1 with 6502-to-C macros and an exe file for playing Microchess on a PC, as created by Bill Forster.
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Recompiling assembly language to C

Post by Bregalad »

While it's almost always possible to translate source code from one language to another automatically, the result will usually suck and be unusable, just like machine translation of a human language.

I don't really get the point. If you want high-quality source code that is understandable and maintainable to some extent, you'll have to translate manually anyway. And if you don't want this, why translate in the first place? It's probably much easier, and makes more sense, to interpret the first language with an interpreter written in the second language.

PS: What I said doesn't apply to high-level language -> assembly/binary translation, which is compilation, but it applies to all other kinds of translation.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Recompiling assembly language to C

Post by tepples »

Manual translation violates Don't Repeat Yourself. It allows bugs to exist in the logic of one version but not in another. And there's no way to efficiently propagate changes from the original to a manual translation.

Are you claiming that you would prefer to embed a 6502 interpreter in a PC version, or an interpreter for some other language in both the PC version and the NES version? The overhead of an interpreter has a runtime speed or electric power cost (which may or may not be negligible).
Movax12
Posts: 541
Joined: Sun Jan 02, 2011 11:50 am

Re: Recompiling assembly language to C

Post by Movax12 »

Translation of assembly to C could be done manually without potentially changing any of the logic or introducing bugs, provided the C compiler for the target CPU can be controlled in such a way that it produces an identical binary file.
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Re: Recompiling assembly language to C

Post by lidnariq »

I'm kind of amused at the idea of throwing a C compiler—which theoretically has a code optimizer that's all about fixing code to be fast instead of understandable—and repurposing it to do static recompilation.

I'm a little concerned that older ISAs with their paucity of first-class registers, and with having to emulate smaller register sizes, won't translate well; theoretically this is something currently fixed with hardware-level register renaming instead of a higher-level software solution.
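To make the cost of emulating narrow registers concrete, here is a minimal sketch in C of a binary-mode ADC on an 8-bit accumulator, plus a 16-bit addition built from two of them the way 6502 code has to do it. The Cpu8 type and the function names are assumptions made for this example, and decimal mode is ignored.

```c
#include <stdint.h>

/* Hypothetical 8-bit CPU state; each flag is stored as 0 or 1. */
typedef struct {
    uint8_t a;
    uint8_t c, v, z, n;
} Cpu8;

/* Binary-mode add with carry. The sum is formed in a wider type so that
 * bit 8 can be captured as the carry flag. */
void adc(Cpu8 *cpu, uint8_t operand)
{
    unsigned sum = (unsigned)cpu->a + operand + cpu->c;
    uint8_t result = (uint8_t)sum;

    cpu->c = sum > 0xFF;
    /* Signed overflow: both inputs differ in sign from the result. */
    cpu->v = ((cpu->a ^ result) & (operand ^ result) & 0x80) != 0;
    cpu->z = result == 0;
    cpu->n = (result & 0x80) != 0;
    cpu->a = result;
}

/* 16-bit add done the 6502 way: clc, then add low bytes, then high bytes. */
uint16_t add16(uint16_t lhs, uint16_t rhs)
{
    Cpu8 cpu = {0};
    uint8_t lo;

    cpu.c = 0;                        /* clc */
    cpu.a = (uint8_t)lhs;             /* lda low byte */
    adc(&cpu, (uint8_t)rhs);
    lo = cpu.a;
    cpu.a = (uint8_t)(lhs >> 8);      /* lda high byte */
    adc(&cpu, (uint8_t)(rhs >> 8));   /* carry propagates from the low add */
    return (uint16_t)((cpu.a << 8) | lo);
}
```

A single native 16-bit add turns into two calls that each update four flags, which gives some feel for why translated code grows unless the compiler can prove most of those flag writes are dead.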

The Microchess example is a good data point for this purpose, though: the STRATGY function, which was 55 lines of 6502 and approximately 110 bytes, becomes 185 lines of x86_64 and approximately 680 bytes.
zzo38
Posts: 1096
Joined: Mon Feb 07, 2011 12:46 pm

Re: Recompiling assembly language to C

Post by zzo38 »

tepples wrote:Are you claiming that you would prefer to embed a 6502 interpreter in a PC version, or an interpreter for some other language in both the PC version and the NES version?
I would prefer to just make the PC version include an emulator (and possibly enhancements too, dealing with graphics, input, music, and save files), and then the ROM image can be transferred to run on any other computers with an emulator, and on EEPROM cartridges.
(Free Hero Mesh - FOSS puzzle game engine)
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Recompiling assembly language to C

Post by Bregalad »

tepples wrote:Manual translation violates Don't Repeat Yourself. It allows bugs to exist in the logic of one version but not in another.
What about tossing the version in the older language?

When I translated GBAMusRiper from Java to C++, having realized a year later that I had chosen the wrong language, I tossed the old Java version. I was able to make the code more elegant and add some features while translating, too.
tepples wrote:Are you claiming that you would prefer to embed a 6502 interpreter in a PC version, or an interpreter for some other language in both the PC version and the NES version? The overhead of an interpreter has a runtime speed or electric power cost (which may or may not be negligible).
Wow, it depends so much on the application that you can't have general rules about this kind of stuff.

I'd say the only general rule when it comes to computer science is that there is no general rule ;)
I hate people saying stuff like "you should NEVER use break or continue" or "you should never program in assembly" or whatever. It just depends on what you want to do and what your goals are. Just know what you're doing and think like a grown-up with common sense, instead of blindly following rules other random people made (if those people are famous, that's one more reason NOT to follow them blindly).

If you're targeting two platforms with different CPUs, and want to maintain the code on both at the same time, assembly is probably a terrible choice. But if you already coded in 6502 assembly and refuse to translate to something else, then an interpreter is definitely the best solution.
The resulting code probably won't be significantly less performant than horribly decompiled and recompiled zombie C code, and writing an interpreter is much more elegant.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Recompiling assembly language to C

Post by rainwarrior »

For porting an old assembly program to a new architecture that outclasses it, this is a fine solution. It's probably about the same complexity as an emulator to write, but by doing the transcode statically you get to trade code size for a more efficient emulation, especially if you've got a good optimizing compiler to feed it into.
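The trade can be sketched in C with a tiny program run two ways. The opcode values below are the real 6502 encodings for LDA immediate, CLC, ADC immediate, and RTS, but everything else (the function names, the four-opcode subset, stopping on an unknown opcode) is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* The program to run: lda #5 / clc / adc #3 / rts */
const uint8_t program[] = {
    0xA9, 0x05,   /* lda #5 */
    0x18,         /* clc    */
    0x69, 0x03,   /* adc #3 */
    0x60          /* rts    */
};

/* Emulator style: fetch, decode, and dispatch every opcode at run time. */
uint8_t run_interpreted(const uint8_t *code)
{
    uint8_t a = 0, carry = 0;
    size_t pc = 0;

    for (;;) {
        uint8_t opcode = code[pc++];
        switch (opcode) {
        case 0xA9:                       /* lda #imm */
            a = code[pc++];
            break;
        case 0x18:                       /* clc */
            carry = 0;
            break;
        case 0x69: {                     /* adc #imm, binary mode only */
            unsigned sum = (unsigned)a + code[pc++] + carry;
            carry = sum > 0xFF;
            a = (uint8_t)sum;
            break;
        }
        case 0x60:                       /* rts: return the accumulator */
            return a;
        default:                         /* unknown opcode: stop (sketch only) */
            return a;
        }
    }
}

/* Static transcode of the same bytes: decoding was done ahead of time,
 * so only the effects of each instruction remain. */
uint8_t run_translated(void)
{
    uint8_t a, carry;
    unsigned sum;

    a = 5;                               /* lda #5 */
    carry = 0;                           /* clc    */
    sum = (unsigned)a + 3u + carry;      /* adc #3 */
    carry = sum > 0xFF;
    a = (uint8_t)sum;
    (void)carry;                         /* carry is dead after this point */
    return a;                            /* rts    */
}
```

Both functions return the same value, but the translated one has no dispatch loop and no program counter, and an optimizing compiler is likely to reduce it to a constant. The cost is that every translated routine becomes its own block of C rather than sharing one interpreter loop.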

I don't think I'd want to use this to cross develop NES + PC though. I am currently working on an NES game by writing it in C++ for PC with the NES limitations in mind, then manually porting the code to NES assembly once I have finished iterating on the C++ version. I find it insanely easier to make iterations and changes to C++ code vs assembly, so I think there would be a lot lost by doing primary development in assembly.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Recompiling assembly language to C

Post by tepples »

rainwarrior wrote:I am currently working on an NES game by writing it in C++ for PC with the NES limitations in mind
And I've done so in Python, both for subroutines and for main. Mr. Podunkian did it for STREEMERZ, with thefox completing the port.
rainwarrior wrote:then manually porting the code to NES assembly once I have finished iterating on the C++ version.
The problem with this sort of waterfall model comes when you think you've "finished iterating" but you think of changes to make after you've made progress on the port. Unless you abandon the original C++ version entirely, you have to maintain them in parallel.
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Recompiling assembly language to C

Post by Bregalad »

rainwarrior wrote:For porting an old assembly program to a new architecture that outclasses it, this is a fine solution. It's probably about the same complexity as an emulator to write, but by doing the transcode statically you get to trade code size for a more efficient emulation, especially if you've got a good optimizing compiler to feed it into.
Even the best optimizing compilers are designed to work with "normal" code. Decompiled assembly code is going to look like a huge mess (with statements like a = , x =, y = and so on for emulating the corresponding 6502 instructions), and the optimizing compiler isn't likely to find the corresponding assembly code again.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: Recompiling assembly language to C

Post by rainwarrior »

I mean that an optimizer run on what is essentially an "unrolled" emulation of a program should perform significantly better than the equivalent emulator. Whether or not the compiler can do optimizations has nothing to do with "normal" code; machine-generated code can optimize well or poorly, and the generator can be tuned to create well-optimizing code if you know enough about the compiler it's going to feed.

I actually maintained a code generator in this way on my last job; it was an interesting experience. Occasionally you do get unlucky and your first approach generates worst-case code for the compiler, but you can fix this if you're paying attention. A lot of the time you get to watch large chunks of generated code just evaporate in the optimization step.

There is not really any hope of recovering the original assembly via a normal compiler, but why would you be compiling back to 6502? My statement about efficiency was not a comparison to well crafted assembly.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Recompiling assembly language to C

Post by tepples »

Just speculating, but another advantage of transpiling assembly language to HLL is that it might help get a proposal for a port of a retro game to a modern console past the console maker's concept approval department. I haven't seen a modern console's devkit or NDA myself, but I imagine some console makers have become more skittish about using an emulator so as not to leave open the possibility of anything like ROM injection on Nintendo's Virtual Console and Popstation on PSP.
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Re: Recompiling assembly language to C

Post by Oziphantom »

In case you don't know it: https://github.com/MitchellSternke/SuperMarioBros-C

However, there is also http://www.decompiler.org/

I would think doing an ASM -> ASM cross compile would be easier. ASM doesn't have to be structured, which can get you into trouble fast, but most ASMs work mostly the same way. So translating 6502->ARM or ARM->x86 or x86->ARM would be easier.
That being said, for any "port" you are thinking about, a "simulator" of the host CPU would probably get the job done.