Writing my own assembler

Nioreh
Posts: 116
Joined: Sun Jan 22, 2012 11:46 am
Location: Stockholm, Sweden

Re: Writing my own assembler

Post by Nioreh » Wed Oct 10, 2018 6:38 am

pubby wrote:You're never going to finish a NES game if you spend all your time making tools :P

With that said, I'd like it if all labels used the anonymous label +/- syntax. For example, if you have multiple labels with the same name, you can use +/- to distinguish them:

Code:

jmp foo:+
foo:
jmp foo:++
foo:
foo:
jmp foo:---
In other words, have labels and anonymous labels behave the same. Don't special case either.
That seems like it makes the code very hard to follow. I don't even fully understand which jmp goes to which foo in your example :)
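
For readers puzzling over the same example, here is a rough sketch of how such references could be resolved. It assumes one possible reading of the proposal that the thread does not confirm: each `+` skips forward to the next definition of that name after the referencing line, and each `-` skips backward to the previous one. The function name `resolveRelativeLabels` and the line-based input are made up for illustration.

```javascript
// Hypothetical resolver for references such as "foo:+" or "foo:---".
// Assumed semantics: N plus signs select the Nth definition of the name after
// the referencing line; N minus signs select the Nth definition before it.
function resolveRelativeLabels(lines) {
  const definitions = {}; // label name -> ascending list of line indices
  lines.forEach((line, index) => {
    const match = /^(\w+):$/.exec(line.trim());
    if (match) {
      if (!definitions[match[1]]) definitions[match[1]] = [];
      definitions[match[1]].push(index);
    }
  });

  const results = [];
  lines.forEach((line, index) => {
    const match = /^jmp\s+(\w+):(\++|-+)$/.exec(line.trim());
    if (!match) return;
    const defs = definitions[match[1]] || [];
    const count = match[2].length;
    let target;
    if (match[2][0] === "+") {
      target = defs.filter((d) => d > index)[count - 1];
    } else {
      const before = defs.filter((d) => d < index);
      target = before[before.length - count];
    }
    if (target === undefined) {
      throw new Error(`Unresolved reference on line ${index + 1}`);
    }
    results.push({ from: index + 1, to: target + 1 }); // 1-based line numbers
  });
  return results;
}

const example = ["jmp foo:+", "foo:", "jmp foo:++", "foo:", "foo:", "jmp foo:---"];
console.log(resolveRelativeLabels(example));
```

Under that reading, the three jumps in the quoted example land on lines 2, 5, and 2 of the snippet respectively.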

never-obsolete
Posts: 381
Joined: Wed Sep 07, 2005 9:55 am
Location: Phoenix, AZ

Re: Writing my own assembler

Post by never-obsolete » Wed Oct 10, 2018 7:05 am

The problem with that is that all these extra labels will get exported to label files along with the ones that are actually relevant for debugging. Maybe the solution is not to change ENUM, but to take a cue from ca65 and implement two forms of assignment for symbols, one that marks the symbol as a label (:=) and one that doesn't (=).
I had a similar problem and ended up forking asm6 and adding RAM/ENDRAM (as well as WRAM/ENDWRAM and SRAM/ENDSRAM, since Mesen seems to differentiate), which behave just like ENUM/ENDE. That way, anything defined with EQU, =, or in an ENUM is not exported to the label file.

Years ago I attempted my own assembler, and I took hints on the macro system from (I think) nesasm, where you could do something like this:

Code:

MACRO addvtop
	lda pos@0, X
	clc
	adc vel@0, X
	sta @1
ENDM

...and then in code...

	addvtop _x, t0
	addvtop _y, t1

...which would resolve to...

	lda pos_x, X
	clc
	adc vel_x, X
	sta t0

	lda pos_y, X
	clc
	adc vel_y, X
	sta t1
It did limit your arguments to 10 (@0 ... @9), but as far as I've tried, asm6 cannot handle something like this.
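
The substitution described above is plain text replacement, so it can be sketched in a few lines. This is only an illustration of the idea, not the code of either assembler mentioned; `expandMacro` is a made-up name, and the macro body is passed in as an array of lines.

```javascript
// Hypothetical expander for positional macro arguments written as @0 ... @9.
// Each @N in the body is replaced textually, so "pos@0" with "_x" becomes "pos_x".
function expandMacro(bodyLines, args) {
  if (args.length > 10) {
    throw new Error("At most 10 arguments (@0 ... @9) are supported");
  }
  return bodyLines.map((line) =>
    line.replace(/@(\d)/g, (whole, digit) => {
      const value = args[Number(digit)];
      if (value === undefined) {
        throw new Error(`Missing macro argument @${digit}`);
      }
      return value;
    })
  );
}

const addvtop = ["lda pos@0, X", "clc", "adc vel@0, X", "sta @1"];
console.log(expandMacro(addvtop, ["_x", "t0"]).join("\n"));
```

Because the replacement happens before the line is parsed, an argument can form part of a symbol name, which is what makes `pos@0` resolve to `pos_x`.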

edit:

The other thing you might consider supporting is reading in a rom file first, overwriting it with the code your assembler produces, and then writing that to the outfile. This is useful for rom hacking, and the only reason I still use my assembler from time to time.
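
A minimal sketch of that overlay step, assuming the assembler already produces a list of chunks with file offsets (the `{ offset, bytes }` shape and the name `patchRom` are invented for this example):

```javascript
// Hypothetical helper: overlay assembled chunks onto a copy of an existing ROM.
// Each chunk is { offset, bytes }, where offset is a file offset, not a CPU address.
function patchRom(originalRom, chunks) {
  const output = Buffer.from(originalRom); // copy, so the input stays untouched
  for (const { offset, bytes } of chunks) {
    if (offset < 0 || offset + bytes.length > output.length) {
      throw new RangeError(`Chunk at offset ${offset} does not fit in the ROM`);
    }
    Buffer.from(bytes).copy(output, offset);
  }
  return output;
}

const rom = Buffer.from([0x00, 0x01, 0x02, 0x03, 0x04, 0x05]);
const patched = patchRom(rom, [{ offset: 2, bytes: [0xea, 0xea] }]);
console.log(patched); // bytes 2 and 3 are now $EA (NOP)
```

In a command-line tool, the original image would typically come from `fs.readFileSync` and the result would go to `fs.writeFileSync`; mapping CPU addresses to file offsets (header, banks) is left out here.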

gauauu
Posts: 703
Joined: Sat Jan 09, 2016 9:21 pm
Location: Central Illinois, USA

Re: Writing my own assembler

Post by gauauu » Wed Oct 10, 2018 7:26 am

tokumaru wrote: - CHARACTER MAPPING: Mapping characters to specific indices is important because we usually have few tiles to dedicate to text so we can't afford to be slaves of the ASCII encoding. The idea here is to use a directive to define an index, and then the character to put at that index. The reason to supply the parameters in this order is that after the index, you can supply multiple characters and strings, and the index will auto-increment to accommodate as many characters as necessary (e.g. CHARMAP $00, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", " .!?", $0D).
I _love_ this. I always waste time on a project building a python script to handle converting text to whatever character mapping my game is using. Having that built-in to the assembler sounds great.
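
To make the quoted proposal concrete, here is a small model of how the auto-incrementing index could work. The function names are invented, and treating a bare number such as $0D as a character code is an assumption; the quoted description does not spell that out.

```javascript
// Hypothetical model of the proposed CHARMAP directive: the first parameter is
// the starting index; every following string or number takes consecutive indices.
function defineCharmap(table, startIndex, ...items) {
  let index = startIndex;
  for (const item of items) {
    // Assumption: a bare number stands for the character with that code.
    const chars = typeof item === "number" ? [String.fromCharCode(item)] : Array.from(item);
    for (const ch of chars) table.set(ch, index++);
  }
  return table;
}

// Convert a string to tile indices, failing loudly on unmapped characters.
function encodeText(table, text) {
  return Array.from(text, (ch) => {
    if (!table.has(ch)) throw new Error(`No mapping for character ${JSON.stringify(ch)}`);
    return table.get(ch);
  });
}

const table = defineCharmap(new Map(), 0x00, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", " .!?", 0x0d);
console.log(encodeText(table, "HI!")); // [ 7, 8, 28 ]
```

Putting the index first is what lets a single call cover the whole alphabet plus punctuation, as in the quoted example.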

tepples
Posts: 22054
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Writing my own assembler

Post by tepples » Wed Oct 10, 2018 8:44 am

gauauu wrote:I always waste time on a project building a python script to handle converting text to whatever character mapping my game is using.
For simpler projects, I understand how preprocessing text with a script in a scripting language might feel wasteful. But for bigger projects, it's anything but.

In theory, an assembler could use this sort of mapping from multiple characters to one character to support UTF-8 input, where the multiple code units (that is, bytes) that represent a character get translated to a single code unit. It could also apply a dictionary, where commonly encountered groups of letters get translated into shorter groups. But it can't very easily calculate an appropriate dictionary given only a (suitably long) text. I have a preprocessor written in Python to do that; my NES and GB ports of 240p Test Suite and the next versions of Thwaite and the Action 53 menu all use a byte pair encoding (BPE)/digram tree encoding (DTE) engine that I originally wrote for my port of robotfindskitten.
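
As a rough illustration of what such a preprocessor has to compute, here is a simplified byte pair encoding sketch in JavaScript. It is not the Python tool mentioned above and makes its own assumptions: text codes stay below `firstFreeCode`, and codes from there up to 255 are handed out to pairs.

```javascript
// Simplified byte pair encoding sketch. The most frequent adjacent pair is
// replaced by a fresh code, repeatedly, until codes or repeated pairs run out.
function bpeCompress(codes, firstFreeCode = 0x80) {
  let data = codes.slice();
  const dictionary = []; // dictionary[i] = [left, right] for code firstFreeCode + i
  for (let code = firstFreeCode; code <= 0xff; code++) {
    const counts = new Map();
    for (let i = 0; i + 1 < data.length; i++) {
      const key = data[i] * 256 + data[i + 1];
      counts.set(key, (counts.get(key) || 0) + 1); // overlapping pairs are overcounted
    }
    let bestKey = -1;
    let bestCount = 1;
    for (const [key, count] of counts) {
      if (count > bestCount) { bestKey = key; bestCount = count; }
    }
    if (bestKey < 0) break; // no pair occurs at least twice
    const left = Math.floor(bestKey / 256);
    const right = bestKey % 256;
    dictionary.push([left, right]);
    const next = [];
    for (let i = 0; i < data.length; i++) {
      if (i + 1 < data.length && data[i] === left && data[i + 1] === right) {
        next.push(code);
        i++; // skip the second half of the replaced pair
      } else {
        next.push(data[i]);
      }
    }
    data = next;
  }
  return { data, dictionary };
}

// Decoder: expand pair codes with an explicit stack, left half first.
function bpeExpand(data, dictionary, firstFreeCode = 0x80) {
  const out = [];
  const stack = data.slice().reverse();
  while (stack.length) {
    const code = stack.pop();
    if (code >= firstFreeCode) {
      const [left, right] = dictionary[code - firstFreeCode];
      stack.push(right, left);
    } else {
      out.push(code);
    }
  }
  return out;
}
```

The search for good pairs needs the whole text at once, which is the reason given above for doing it in a preprocessor. Only the small stack-based decoder would have to exist on the console side.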

A preprocessor also allows non-programmers, such as the translator you hired to prepare versions in other languages, to edit the text without breaking invariants that your program expects. And you'd need a pretty rich macro system to handle line breaking, pagination, stage directions for NPCs, hyperlinks for your dialogue tree, and other things that tend to get interleaved into your text. One "meant for consolidating repetitive assembly code, not for extending the functionality of the assembler" can't handle it alone.

samophlange
Posts: 48
Joined: Sun Apr 08, 2018 11:45 pm
Location: Southern California

Re: Writing my own assembler

Post by samophlange » Wed Oct 10, 2018 8:52 am

FWIW - The people in my office that work in JS have switched our projects to using TypeScript. From what I understand it allows you to be more explicit about your intent when writing code, which enables much better compile time type checking. I have never done any real projects in JS, but every time I use it I think that it must be a nightmare to keep things clean on a large project, and I think TypeScript helps with that.

If you're taking feature requests.... :D
Something I've been struggling with as I learn assembly is that I haven't found a "nice" way to do if/else; it seems like it just turns into a mess of label soup. Maybe this could be smoothed over with some syntactic sugar?

tokumaru
Posts: 11863
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Writing my own assembler

Post by tokumaru » Wed Oct 10, 2018 11:11 am

pubby wrote:You're never going to finish a NES game if you spend all your time making tools :P
True.
In other words, have labels and anonymous labels behave the same. Don't special case either.
I kinda like this idea! For anyone who thinks this is confusing, just don't use the feature.
never-obsolete wrote:I had a similar problem and ended up forking asm6 and adding RAM/ENDRAM (as well as WRAM/ENDWRAM/SRAM/ENDSRAM
That feels way too platform-specific to me, I'm trying to keep things as generic as possible.
Years ago I attempted my own assembler, and I took hints on the macro system from (I think) nesasm
You know, NESASM gets a bad rap around these parts, but it actually has some very interesting features that are often overlooked. Too bad it has some quirks that make it unusable to me.
The other thing you might consider supporting is reading in a rom file first, overwriting it with the code your assembler produces, and then writing that to the outfile. This is useful for rom hacking, and the only reason I still use my assembler from time to time.
This is an interesting feature, and not hard to implement at all. Maybe an INCBIN that doesn't update the PC is all it takes.
samophlange wrote:If you're taking feature requests.... :D
Unless they're really simple to implement or seem really useful to me, no, I'm not! :lol:
Something I've been struggling with as I learn assembly is that I haven't found a "nice" way to do if/else; it seems like it just turns into a mess of label soup. Maybe this could be smoothed over with some syntactic sugar?
One of the reasons I like assembly so much is that it isn't bound to the constructs of high-level languages. You can jump anywhere you want, take shortcuts, bypass instructions by making them look like operands, all sorts of convenient little things to get that extra power from these limited pieces of hardware we write code for. So for me particularly, simulating high-level constructs isn't appealing at all. I'm pretty sure this can be done with macros in most assemblers, though.

Oziphantom
Posts: 914
Joined: Tue Feb 07, 2017 2:03 am

Re: Writing my own assembler

Post by Oziphantom » Wed Oct 10, 2018 10:25 pm

tepples wrote:
Oziphantom wrote:
tokumaru wrote:I'm only using Node to run the .js file locally as a command line application
cscript.exe will run JS as well without most people having to install something is all.
Last I checked, cscript.exe was exclusive to Microsoft Windows. I don't run Windows on my primary dev machine; nor does calima. Are there tips for writing a script to make it work on both cscript.exe (for users of Windows) and Node.js (for users of GNU/Linux and macOS)?
Seeing as people use JS for so much, I had assumed it had evolved enough to be practical, but I see that it is still mostly incompetent and you need CS extensions, or Node, or any of a bunch of others to get stuff done. I was thinking one could just write plain JS, use cscript to run it on Windows, and have Linux/Mac use Node or some other JS engine if need be. Seems you can't.

never-obsolete wrote:The other thing you might consider supporting is reading in a rom file first, overwriting it with the code your assembler produces, and then writing that to the outfile. This is useful for rom hacking, and the only reason I still use my assembler from time to time.
Basically any assembler can do this; TASS64, for example, does it.

Code:

*=$0801 
.binary "original.bin"

*=$1000
<patch code goes here>

*=$1800
.byte $2c ; skip command here
samophlange wrote:Something I've been struggling with as I learn assembly is that I haven't found a "nice" way to do if/else; it seems like it just turns into a mess of label soup. Maybe this could be smoothed over with some syntactic sugar?
You might want to try the https://github.com/Museum-of-Art-and-Di ... nt/macross assembler; it's the assembler for people who don't like assembly. You can't really do IF/ELSE in an assembler, as that is a structure system: at some point it has to know where to jump to in order to skip the "else", which is not something assemblers really do; it needs a label. However, I would think that if one tried, one could get it to work with macros in TASS64, but maybe not with nesting...
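
The label bookkeeping mentioned above can be done mechanically with a stack, which also covers nesting. The following is a sketch of a hypothetical preprocessing pass, not a feature of TASS64 or any assembler named in this thread; the `IF <cond>` / `ELSE` / `ENDIF` syntax and the `__else_N` / `__endif_N` label names are invented.

```javascript
// Hypothetical pass that rewrites IF/ELSE/ENDIF lines into 6502 branches and
// generated labels. "IF eq" runs its body when Z is set, so it branches away
// with the opposite condition (bne).
const INVERSE_BRANCH = {
  eq: "bne", ne: "beq", cs: "bcc", cc: "bcs",
  mi: "bpl", pl: "bmi", vs: "bvc", vc: "bvs",
};

function expandIfElse(lines) {
  const out = [];
  const stack = []; // open blocks, innermost last
  let counter = 0;
  for (const line of lines) {
    const text = line.trim();
    const ifMatch = /^IF\s+(\w+)$/i.exec(text);
    if (ifMatch) {
      const branch = INVERSE_BRANCH[ifMatch[1].toLowerCase()];
      if (!branch) throw new Error(`Unknown condition: ${ifMatch[1]}`);
      const block = { id: counter++, hasElse: false };
      stack.push(block);
      out.push(`\t${branch} __else_${block.id}`);
    } else if (/^ELSE$/i.test(text)) {
      const block = stack[stack.length - 1];
      if (!block || block.hasElse) throw new Error("ELSE without matching IF");
      block.hasElse = true;
      out.push(`\tjmp __endif_${block.id}`, `__else_${block.id}:`);
    } else if (/^ENDIF$/i.test(text)) {
      const block = stack.pop();
      if (!block) throw new Error("ENDIF without matching IF");
      out.push(block.hasElse ? `__endif_${block.id}:` : `__else_${block.id}:`);
    } else {
      out.push(line); // ordinary assembly passes through unchanged
    }
  }
  if (stack.length) throw new Error("IF without matching ENDIF");
  return out;
}

console.log(expandIfElse(["\tcmp #5", "IF eq", "\tlda #1", "ELSE", "\tlda #2", "ENDIF"]).join("\n"));
```

One limitation of this sketch: the generated conditional branch is a short relative branch, so a long body would push the target out of range and need a branch-over-`jmp` sequence instead.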

Bregalad
Posts: 7951
Joined: Fri Nov 12, 2004 2:49 pm
Location: Chexbres, VD, Switzerland

Re: Writing my own assembler

Post by Bregalad » Wed Oct 10, 2018 11:21 pm

gauauu wrote:
tokumaru wrote: - CHARACTER MAPPING: Mapping characters to specific indices is important because we usually have few tiles to dedicate to text so we can't afford to be slaves of the ASCII encoding. The idea here is to use a directive to define an index, and then the character to put at that index. The reason to supply the parameters in this order is that after the index, you can supply multiple characters and strings, and the index will auto-increment to accommodate as many characters as necessary (e.g. CHARMAP $00, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", " .!?", $0D).
I _love_ this. I always waste time on a project building a python script to handle converting text to whatever character mapping my game is using. Having that built-in to the assembler sounds great.
This is hardly anything new, I use this feature in WLA-DX and other assemblers probably already have it.

lidnariq
Posts: 9689
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: Writing my own assembler

Post by lidnariq » Wed Oct 10, 2018 11:34 pm

ca65 and cc65 support remapping also, although not in the condensed format above. See the .charmap pseudo-op and #pragma charmap

Oziphantom
Posts: 914
Joined: Tue Feb 07, 2017 2:03 am

Re: Writing my own assembler

Post by Oziphantom » Wed Oct 10, 2018 11:35 pm

Bregalad wrote:
gauauu wrote:
tokumaru wrote: - CHARACTER MAPPING: Mapping characters to specific indices is important because we usually have few tiles to dedicate to text so we can't afford to be slaves of the ASCII encoding. The idea here is to use a directive to define an index, and then the character to put at that index. The reason to supply the parameters in this order is that after the index, you can supply multiple characters and strings, and the index will auto-increment to accommodate as many characters as necessary (e.g. CHARMAP $00, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", " .!?", $0D).
I _love_ this. I always waste time on a project building a python script to handle converting text to whatever character mapping my game is using. Having that built-in to the assembler sounds great.
This is hardly anything new, I use this feature in WLA-DX and other assemblers probably already have it.
The problem with WLA-DX is that it only lets you have one definition... subpar.

cpow
NESICIDE developer
Posts: 1097
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Re: Writing my own assembler

Post by cpow » Thu Oct 11, 2018 5:10 am

pubby wrote:You're never going to finish a NES game if you spend all your time making tools :P
I'll drink to that. :beer:
Now back to my tools...

qalle
Posts: 50
Joined: Wed Aug 16, 2017 12:15 am

Re: Writing my own assembler

Post by qalle » Thu Oct 11, 2018 5:30 am

The Wikipedia article on lexical analysis seemed helpful when I started writing an assembler.

Also, it would be nice if the assembler supported the bit shift operators in expressions. (Ophis seems to lack them, perhaps because the characters "<" and ">" have other uses.)
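
The clash between "<" as a prefix and "<<" as a shift can be settled at the lexical level by trying the longer operator first. Here is a small tokenizer sketch showing that; the token set and the name `tokenizeExpression` are assumptions for illustration, not taken from Ophis or any other assembler.

```javascript
// Hypothetical expression tokenizer. "<<" and ">>" are listed before the
// single-character class in the pattern, so "<<" is never read as two "<".
function tokenizeExpression(source) {
  const pattern = /\s*(\$[0-9A-Fa-f]+|%[01]+|\d+|[A-Za-z_]\w*|<<|>>|[-+*\/&|^<>()])/y;
  const tokens = [];
  let position = 0;
  while (position < source.length) {
    pattern.lastIndex = position; // sticky flag: match exactly at this position
    const match = pattern.exec(source);
    if (!match) {
      if (/^\s*$/.test(source.slice(position))) break; // only trailing whitespace left
      throw new SyntaxError(`Unexpected input at column ${position + 1}`);
    }
    tokens.push(match[1]);
    position = pattern.lastIndex;
  }
  return tokens;
}

console.log(tokenizeExpression("<label << 2")); // [ '<', 'label', '<<', '2' ]
```

Whether a lone "<" means "low byte of" or "less than" is then a question for the parser, which can decide by position: at the start of an operand it is a prefix, after an operand it is a comparison.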

Edit: removed a feature request

cpow
NESICIDE developer
Posts: 1097
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Re: Writing my own assembler

Post by cpow » Thu Oct 11, 2018 6:43 am

qalle wrote:The Wikipedia article on lexical analysis seemed helpful when I started writing an assembler.
Back in the days when I rolled my own assembler for NESICIDE I used Lex/Yacc. Looking back on it, it was so much fun I might try to replicate the experience with ANTLR. I see ANTLR already has a contributed 6502 grammar file.

cpow
NESICIDE developer
Posts: 1097
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Re: Writing my own assembler

Post by cpow » Thu Oct 11, 2018 10:39 am

yaros wrote:I still think it's worth hacking around and writing an assembler without a parser.
I'll never understand the logic of "I can write a better parser from scratch than any of the parser generators available to me." Like any tool, a parser generator allows you to focus on the meat and potatoes, not the plate.

tokumaru
Posts: 11863
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Writing my own assembler

Post by tokumaru » Thu Oct 11, 2018 11:33 am

I don't plan on using any libraries right now besides the ones I need for basic tasks like interacting with the file system. I am taking a lot of shortcuts at this early stage, though... For example, I'm not parsing expressions manually; I'm using JavaScript's eval() for this (don't judge me, this is for my personal use!), after using regular expressions to convert expressions like $8004 + <MyLabel into 0x8004 + executeFunction("<", getSymbol("MyLabel")). The functions "executeFunction" and "getSymbol" will be responsible for returning the correct values or throwing errors when appropriate.
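
A sketch of how that conversion might look. Only the names executeFunction and getSymbol come from the post; the regular expression, the helper bodies, and the reading of "<" and ">" as low byte and high byte (the usual 6502 assembler convention) are assumptions.

```javascript
// Hypothetical reconstruction of the regex-plus-eval approach described above.
// Limitation: "<" and ">" are only handled as prefixes, not as comparisons.
function translateExpression(source) {
  return source.replace(
    /\$([0-9A-Fa-f]+)|([<>])?\s*\b([A-Za-z_]\w*)/g,
    (whole, hex, prefix, name) => {
      if (hex !== undefined) return "0x" + hex; // $8004 -> 0x8004
      const symbol = `getSymbol(${JSON.stringify(name)})`;
      return prefix ? `executeFunction(${JSON.stringify(prefix)}, ${symbol})` : symbol;
    }
  );
}

function evaluateExpression(source, symbols) {
  const getSymbol = (name) => {
    if (!Object.prototype.hasOwnProperty.call(symbols, name)) {
      throw new Error(`Undefined symbol: ${name}`);
    }
    return symbols[name];
  };
  const executeFunction = (operator, value) => {
    if (operator === "<") return value & 0xff;        // assumed: low byte
    if (operator === ">") return (value >> 8) & 0xff; // assumed: high byte
    throw new Error(`Unknown operator: ${operator}`);
  };
  // Direct eval can see the two local helpers above. Trusted input only.
  return eval(translateExpression(source));
}

console.log(translateExpression("$8004 + <MyLabel"));
console.log(evaluateExpression("$8004 + <MyLabel", { MyLabel: 0x1234 })); // 32824
```

The first call prints the same string given in the post, 0x8004 + executeFunction("<", getSymbol("MyLabel")). Since eval() runs arbitrary code, this shortcut only makes sense for source files the author controls, which matches the personal-use caveat above.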
