Writing my own assembler
Re: Writing my own assembler
yeah, I was confusing his assembler feature with the talk of the global jmp/label ret.
Re: Writing my own assembler
I use Node.js for command-line programs too (it doesn't need to be used for servers or GUI; as mentioned before, it is just another programming language and you could also use Python, Perl, or PHP). I think that Windows Script Host does not implement many ES6 features though? If you are writing an assembler in JavaScript you will likely want byte arrays.
I like the relative labels in Knuth's MIXAL and MMIXAL. If a label name is a digit and then H then you can access the nearest such label backward or forward by the number and then B or F respectively. (MMIXAL is strange and uses only a single pass though; forward references are resolved at load time instead. However, the same relative label format could be used in multi-pass assemblers too.)
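For what it's worth, here is a rough JavaScript sketch of how B/F references like these could be resolved in a multi-pass assembler. It assumes an earlier pass has already collected every "dH" definition in source order; the function name resolveRelative and the { digit, line, address } layout are made up for illustration and are not taken from MIXAL, MMIXAL, or any existing assembler.

```javascript
// Hypothetical sketch: resolve MIXAL-style relative labels such as "2B" / "2F".
// `defs` lists every "dH" definition in source order: { digit, line, address }.
function resolveRelative(ref, refLine, defs) {
  const m = /^(\d)([BF])$/.exec(ref);
  if (!m) throw new Error(`not a relative label reference: ${ref}`);
  const digit = m[1];
  const dir = m[2];
  const same = defs.filter((d) => d.digit === digit);
  // "B" picks the closest definition before the referencing line,
  // "F" picks the closest definition after it.
  const hit = dir === "B"
    ? same.filter((d) => d.line < refLine).pop()
    : same.find((d) => d.line > refLine);
  if (!hit) {
    throw new Error(`no ${digit}H label ${dir === "B" ? "before" : "after"} line ${refLine}`);
  }
  return hit.address;
}

const defs = [
  { digit: "1", line: 3, address: 0x8000 },
  { digit: "2", line: 5, address: 0x8004 },
  { digit: "1", line: 9, address: 0x8010 },
];
console.log(resolveRelative("1B", 6, defs).toString(16)); // 8000
console.log(resolveRelative("1F", 6, defs).toString(16)); // 8010
```

Matching by source line rather than by address keeps the lookup independent of how the PC moves around, which is probably what you want if .org can appear between the definition and the reference.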
I also like the nonstandard syntax used in NESASM/MagicKit (indirect addressing uses square brackets, and zero-page addressing is explicit), but maybe you prefer the standard syntax.
I also tend to use macros to define jump tables and so on, rather than doing them manually, meaning a simple macro system might not do (although if the assembler is written in JavaScript, it would be possible to support extensions also written in JavaScript without too much difficulty).
(Free Hero Mesh - FOSS puzzle game engine)
Re: Writing my own assembler
zzo38 wrote: it is just another programming language and you could also use Python, Perl, or PHP
Exactly. You download the interpreter and run your script through it; it's the same thing.
zzo38 wrote: I think that Windows Script Host does not implement many ES6 features though?
That's what I meant by "outdated" a few posts ago. I think it's an old version of JavaScript, with little support for binary data and file manipulation.
zzo38 wrote: If you are writing an assembler in JavaScript you will likely want byte arrays.
Not only that, but since Node.js is a popular tool that's actively maintained, there are modules for all kinds of things you might need.
zzo38 wrote: I also like the nonstandard syntax used in NESASM/MagicKit (indirect addressing uses square brackets, and zero-page addressing is explicit), but maybe you prefer the standard syntax.
While square brackets for indirection make a lot of sense in assembly (more than parentheses, I agree), there's just too much 6502 code out there using a standard that's probably as old as the CPU itself, and a change like that causes unnecessary confusion IMO.
zzo38 wrote: I also tend to use macros to define jump tables and so on, rather than doing them manually, meaning a simple macro system might not do
I often use macros to help with this kind of thing too.
zzo38 wrote: (although if the assembler is written in JavaScript, it would be possible to support extensions also written in JavaScript without too much difficulty)
Yeah, I'm considering allowing user-defined JavaScript functions. Nothing is more versatile than a full programming language at your disposal.
Re: Writing my own assembler
tokumaru wrote: Yeah, I'm considering allowing user-defined JavaScript functions. Nothing is more versatile than a full programming language at your disposal.
I like that idea a lot. So many times I'm torn between trying to wrestle macros into doing something that would be easier with a full programming language, and saying "forget it" and just running my own custom pre-processor (written in Perl or Python or something) over my code. This could be the best of both worlds.
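To make the idea a bit more concrete, here is a minimal sketch of what a hook for user-defined functions might look like, using the jump table case mentioned earlier in the thread. Everything named here (defineFunction, callFunction, the "jumptable" function, and the ".call" directive in the comment) is hypothetical; nothing in the thread says the assembler will work this way.

```javascript
// Hypothetical sketch: let source files call user-supplied JavaScript functions.
const userFunctions = new Map();

// A user extension registers a function that returns the bytes to emit.
function defineFunction(name, fn) {
  userFunctions.set(name.toLowerCase(), fn);
}

// The assembler core would call this when it meets something like
// ".call jumptable Start, Update" (an invented directive).
function callFunction(name, args, symbols) {
  const fn = userFunctions.get(name.toLowerCase());
  if (!fn) throw new Error(`unknown function: ${name}`);
  // Clamp every returned value to a byte and store it in a byte array.
  return Uint8Array.from(fn(args, symbols), (b) => b & 0xff);
}

// Example extension: a jump table of little-endian 16-bit addresses.
defineFunction("jumptable", (labels, symbols) =>
  labels.flatMap((label) => {
    if (!symbols.has(label)) throw new Error(`undefined label: ${label}`);
    const address = symbols.get(label);
    return [address & 0xff, address >> 8];
  })
);

const symbols = new Map([["Start", 0x8000], ["Update", 0x8123]]);
console.log(callFunction("jumptable", ["Start", "Update"], symbols));
// bytes emitted: 00 80 23 81
```

Returning a Uint8Array keeps the extension's output in the same form the assembler would presumably use for its own output buffer.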
My games: http://www.bitethechili.com
Re: Writing my own assembler
Drakim wrote: Darn, this thread is just making me itch to take a stab at making my assembler as well. I just might give it a go.
I've had some ideas floating around for a long time, so I figure I might share them here if you are interested. I'm not sure if all of these ideas are realistic; I haven't tried to implement them myself anywhere yet.
1. By far my most common label is a @Return: label in front of a nearby RTS statement. Maybe it's my style of coding, but I find that subroutines always have some branching conditions that exit early. So I realized it would be pretty nifty if I could just write a branch like this:
Code: Select all
LDA MyVar
BEQ RTS
STA MyVar2
And the assembler would just treat a nearby RTS statement as an on-the-fly label destination (or throw an error if there are none within range), so you wouldn't have to put a @Return: label there. It's not really any new kind of functionality, just a sort of "auto-label" thing to make the code less verbose.
2. Sometimes as a programmer you can do things that the assembler can actually see are stupid. Like putting code in a bank that uses a label from another bank (which is on the same page). Or if you have an absolute instruction with an Int instead of a label, which I'm guessing in 99.99% of cases is just the programmer forgetting a # symbol before the Int. It would be neat if the assembler could tell you about such mistakes.
3. Labels do a LOT of different jobs in asm code. They act as entry points for subroutines. They act as starting offsets for data tables. And they act as holders of constant gameplay values. Sometimes I wish there was a way to mark a label as to what kind of job it does, and have the assembler throw an error at me if I'm trying to use it in a different way. So you can't JSR GHOST_ID since the label holds a constant value and not an address to a subroutine, and you can't LDA GHOST_INIT since the label holds an address to a subroutine. (Obviously sometimes you need to do tricky things like an RTS trampoline, so there needs to be a way to tell the assembler to not go bananas over it on specific lines.)
I've been asking Soci for 1 for a year and a half, but we didn't really come up with a nice way to do it. That looks perfect.
Tass64 does 2 and 3 already; see the -Wimmediate and -Wshadow warning options.
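Idea 1 seems simple enough to sketch. The JavaScript below assumes the instructions already have addresses assigned and just picks the closest RTS that a 6502 branch can reach; the function name branchToNearestRts and the { address, mnemonic } layout are invented for illustration, and this is not how any existing assembler does it.

```javascript
// Hypothetical sketch: pick the RTS closest to a branch and compute its offset.
// `program` is a list of { address, mnemonic } entries with addresses assigned.
function branchToNearestRts(program, branchAddress) {
  let best = null;
  for (const ins of program) {
    if (ins.mnemonic !== "RTS") continue;
    // 6502 branch offsets are relative to the instruction that follows the
    // 2-byte branch, and must fit in a signed byte.
    const offset = ins.address - (branchAddress + 2);
    if (offset < -128 || offset > 127) continue;
    if (best === null || Math.abs(offset) < Math.abs(best.offset)) {
      best = { target: ins.address, offset };
    }
  }
  if (best === null) throw new Error("no RTS within branch range");
  return best;
}

const program = [
  { address: 0x8000, mnemonic: "LDA" }, // LDA MyVar (absolute, 3 bytes)
  { address: 0x8003, mnemonic: "BEQ" }, // BEQ RTS
  { address: 0x8005, mnemonic: "STA" }, // STA MyVar2 (absolute, 3 bytes)
  { address: 0x8008, mnemonic: "RTS" },
];
const hit = branchToNearestRts(program, 0x8003);
console.log(hit.target.toString(16), hit.offset); // 8008 3
```

Since a branch is always 2 bytes, choosing the target can't change any addresses, so this could run after the pass that assigns them.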
Re: Writing my own assembler
zzo38 wrote: I also like the nonstandard syntax used in NESASM/MagicKit (indirect addressing uses square brackets, and zero-page addressing is explicit), but maybe you prefer the standard syntax.
tokumaru wrote: While square brackets for indirection make a lot of sense in assembly (more than parentheses, I agree), there's just too much 6502 code out there using a standard that's probably as old as the CPU itself, and a change like that causes unnecessary confusion IMO.
Yeah, the 65816 uses [] as well, so you have
lda (zp),y
lda [zp],y
and those mean different things (the first goes through a 16-bit pointer, the second through a 24-bit one), so it's not wise to change the brackets, as it may cause issues and confuse people coming from other 65(X)XX lines.
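As a small illustration of why the bracket style matters, here is a JavaScript sketch that picks the 65816 LDA opcode from the brackets alone. The opcode values ($B1 for (dp),y and $B7 for [dp],y) are quoted from the 65816 opcode table as I remember it, so check them against a datasheet; the function name encodeLdaIndirectY is made up.

```javascript
// Hypothetical sketch: choose the 65816 LDA opcode from the bracket style.
const LDA_INDIRECT_Y = {
  "(": 0xb1, // lda (zp),y : 16-bit pointer in the direct page, indexed by Y
  "[": 0xb7, // lda [zp],y : 24-bit (long) pointer in the direct page, indexed by Y
};
const CLOSING = { "(": ")", "[": "]" };

function encodeLdaIndirectY(operand) {
  // Accepts operands such as "($10),y" or "[$10],y" with a 1- or 2-digit hex address.
  const m = /^([(\[])\s*\$([0-9a-f]{1,2})\s*([)\]])\s*,\s*y$/i.exec(operand.trim());
  if (!m) throw new Error(`unsupported operand: ${operand}`);
  if (CLOSING[m[1]] !== m[3]) throw new Error(`mismatched brackets: ${operand}`);
  return [LDA_INDIRECT_Y[m[1]], parseInt(m[2], 16)];
}

console.log(encodeLdaIndirectY("($10),y").map((b) => b.toString(16))); // [ 'b1', '10' ]
console.log(encodeLdaIndirectY("[$10],y").map((b) => b.toString(16))); // [ 'b7', '10' ]
```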
zzo38 wrote: (although if the assembler is written in JavaScript, it would be possible to support extensions also written in JavaScript without too much difficulty)
tokumaru wrote: Yeah, I'm considering allowing user-defined JavaScript functions. Nothing is more versatile than a full programming language at your disposal.
See KickAss Assembler; it is kind of a scripting language/assembler hybrid. Nobody really knows what it is, and it kind of became a mess, but there are people who swear by it.
Re: Writing my own assembler
Are you planning on making this open source?
Re: Writing my own assembler
I don't know, I first have to finish writing the thing, and I still have a long way to go. But being JavaScript, it'd probably be simpler to share the source than to package the native code generated by the V8 engine. I heard that there are tools to do that, but the performance is worse than simply using Node.js. Anyway, if there's any demand for it, I'll definitely consider it.
Re: Writing my own assembler
One thing that's been keeping me from moving forward with this is that I can't think of a good syntax to right-align code. In ASM6 you can do it with .ORG and label math, which is a bit cumbersome and pollutes the label table with stuff you don't need, so I really wanted to come up with a dedicated solution. One of the things that came to mind was changing the way .ORG works, so that it sets the PC not only for what comes after it, but also for what comes before it, if the PC is undefined at that point. If that were the case, you could write the following at the beginning of your source file:
Code: Select all
Label:
jmp Label
.org $10000
And you'd get this:
Code: Select all
$fffd: jmp $fffd
Then, to be able to right-align code anywhere, all you'd need is a directive to "forget" the PC, so it can be defined by a future .ORG statement:
Code: Select all
.org $8000
;code starting at $8000 goes here
.forgetpc
;code to right-align to $10000 goes here
.org $10000
To me that's as clean as it gets, but I don't know what would happen if .BASE, another directive that changes the PC, is used while the PC is undefined. I guess .BASE can also set the PC for the preceding code, but without padding. But what would .PAD do when the PC is undefined? Maybe I should get rid of .PAD and only use .ORG for padding if I really need to.
Anyway, can anyone think of better syntax for right-aligning code?
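The arithmetic behind the proposal is small enough to sketch in JavaScript. This assumes the byte size of each instruction assembled while the PC was undefined is already known (which may not hold if operand sizes depend on label values); the function name rightAlign and its parameters are invented for illustration.

```javascript
// Hypothetical sketch: place a block so that it ends exactly at `endAddress`.
// `sizes` are the byte sizes of the instructions assembled while the PC was
// undefined; `lowerBound` is where the preceding left-aligned code ended.
function rightAlign(sizes, endAddress, lowerBound = 0) {
  const total = sizes.reduce((sum, n) => sum + n, 0);
  const start = endAddress - total;
  if (start < lowerBound) {
    throw new Error(
      `block of ${total} bytes overlaps code ending at $${lowerBound.toString(16)}`
    );
  }
  // Assign each instruction its final address, walking forward from `start`.
  let pc = start;
  return sizes.map((n) => {
    const address = pc;
    pc += n;
    return address;
  });
}

// "Label: jmp Label" followed by ".org $10000": one 3-byte instruction.
console.log(rightAlign([3], 0x10000).map((a) => a.toString(16))); // [ 'fffd' ]
```

The gap between lowerBound and the computed start is what the assembler would have to pad when .forgetpc is used after an earlier .org.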
Re: Writing my own assembler
I did think of simple solutions for other problems though:
Repeated labels: I will simply allow labels to repeat if they're defined with two colons rather than one (i.e. SomeLabel:: instead of SomeLabel:). This seemed like a good solution because regular labels will keep working the same way, and users can choose which labels can be reassigned. This is also really easy to implement. You just have to be careful when using these labels, because the assembler will not check if the multiple addresses are the same.
Local label scope: The only real problem I have with local scopes being delimited by global labels is that sometimes you need part of a subroutine to be above the global label that defines the entry point. To solve this in a non-intrusive way, I decided to create a directive that explicitly starts a new scope, but the name of that scope is defined by the next global label that's found. It works like this:
Code: Select all
.scope ;starts a new scope, but we don't know what the parent label is yet
.return:
rts
Ignore45: ;oh, so this is the parent label in this scope
cmp #45
beq .return
;rest of subroutine
If you don't use the new directive, global labels will continue to start new scopes, as usual.
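A rough JavaScript sketch of how the scope naming could be tracked: local labels seen after .scope are held back until the next global label names the scope. The function name qualifyLabels and the "Parent.local" naming of qualified labels are assumptions for illustration, not the assembler's actual design.

```javascript
// Hypothetical sketch: qualify local labels, honouring a ".scope" directive.
function qualifyLabels(lines) {
  const out = [];
  let scope = null;   // global label that names the current scope
  let pending = null; // locals seen after ".scope", before the scope has a name
  for (const raw of lines) {
    const line = raw.replace(/;.*/, "").trim(); // drop comments
    if (line === ".scope") {
      scope = null;
      pending = [];
      continue;
    }
    const m = /^(\.?\w+):$/.exec(line);
    if (!m) continue; // not a label definition
    const name = m[1];
    if (!name.startsWith(".")) {
      if (pending) {
        // The first global label after ".scope" names the scope retroactively.
        for (const local of pending) out.push(name + local);
        pending = null;
      }
      scope = name; // without ".scope", a global label simply starts a new scope
      out.push(name);
    } else if (pending) {
      pending.push(name);
    } else if (scope) {
      out.push(scope + name);
    } else {
      throw new Error(`local label ${name} has no enclosing scope`);
    }
  }
  if (pending && pending.length > 0) {
    throw new Error(".scope was never named by a global label");
  }
  return out;
}

console.log(qualifyLabels([".scope", ".return:", "rts", "Ignore45:", "cmp #45", "beq .return"]));
// [ 'Ignore45.return', 'Ignore45' ]
```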
Re: Writing my own assembler
I don't know what "right-aligning code" means in this context. To me that just looks like padding used for some form of alignment.
Have you looked at x816's documentation? The implementation/model there should alleviate some of your concerns/issues here, and relieve you of your blocker regarding what to do if someone specifies code before the very first .org directive:
Code: Select all
.ORG
Define origin address.
Sets the starting address of the source file. X816 will
not assemble any code until this directive is found.
This is really the best choice. Honest. The proposal you have (to allow code specified before the first .org, but based on what that .org line says) makes no sense and will confuse everyone who uses this tool. Likewise, .forgetpc makes absolutely no sense -- there's no need for it, just let .org dictate things, and don't allow people to write actual code before the first .org statement. Problem solved.
A copy of x816's manual is here, along with several other manuals from assemblers. Just remember that x816 was intended for 65816 (which supports 24-bit addressing and native banks), but it should give you some good ideas on how to do things (like .base and how to handle some scope-related bits) -- see x816-v122f-norman-yen.txt: https://drive.google.com/open?id=1kcEKU ... Xcj9vKDRIR
Edit 2020/02/11: update link from Dropbox to pCloud
Edit 2020/06/28: update link from pCloud to Google Drive
Last edited by koitsu on Sun Jun 28, 2020 10:23 pm, edited 2 times in total.
Re: Writing my own assembler
Right aligning means aligning code to an upper address, useful when you use a mapper that swaps the entire 32KB and you need to simulate a fixed bank near the CPU vectors, containing a reset stub, trampoline routines and other things.
No assembler I know of is equipped to do this easily, so people either use cumbersome hacks or define a constant size for their "fixed" banks, solutions that are far from optimal.
Also, I disagree that my proposed solutions are confusing, because I'm intentionally trying to think of solutions that don't affect the common ways of doing things. Don't like the new directives? Don't use them, and things will behave as they always did (as much as there is a standard for these things, anyway). But even if I were changing things radically, I made it very clear from the beginning that this isn't meant to please anyone; this is mostly for my own use.
Re: Writing my own assembler
tokumaru wrote: Right aligning means aligning code to an upper address, useful when you use a mapper that swaps the entire 32KB and you need to simulate a fixed bank near the CPU vectors, containing a reset stub, trampoline routines and other things.
This sounds pretty nice, and I could definitely see myself using it. That said, why does the simulated fixed bank need to be at the upper end near the vectors? I just always put mine first thing (i.e. left-aligned). Is there some disadvantage to how I'm doing it? (Asking in good faith, not trying to pick nits and argue.)
Re: Writing my own assembler
gauauu wrote: That said, why does the simulated fixed bank need to be at the upper-end near the vectors? I just always put mine first-thing (ie left-aligned). Is there some disadvantage of how I'm doing it?
To me personally, it makes sense to put the fixed stuff up there because of the CPU vectors, which are in the same category (i.e. things that must be present in all banks), but what seals the deal for me is that I use the beginning of the bank for subroutines with timed code, or data that has to be aligned to memory pages for timing reasons, because it's easier to align code/data to page boundaries there.
Re: Writing my own assembler
tokumaru wrote: Repeated labels: I will simply allow labels to repeat if they're defined with two colons rather than one (i.e. SomeLabel:: instead of SomeLabel:).
That might be confused with RGBDS's double-colon export syntax. man 5 rgbasm says these are equivalent:
Code: Select all
SomeLabel::
;is the same thing as this
SomeLabel:
export SomeLabel