Writing my own assembler


pubby
Posts: 583
Joined: Thu Mar 31, 2016 11:15 am

Re: Writing my own assembler

Post by pubby »

cpow wrote: I'll never understand the logic of "I can write a better parser from scratch than any of the parser generators available to me." Like any tool, a parser generator allows you to focus on the meat and potatoes, not the plate.
Generators are cool and all, but I don't think many language implementations use them nowadays. The big compilers (GCC and Clang, for example) use hand-written recursive descent, as it's so much more flexible and provides better error handling. Plus such parsers are easy to write.
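
To make the "easy to write" point concrete, here is a minimal sketch of a hand-written recursive-descent evaluator for operand expressions such as Table + 2 * (3 - 1). The grammar, the $ hex prefix, and every name in it (tokenize, Parser, evaluate, ParseError) are assumptions made up for illustration in Python; none of it is taken from an assembler mentioned in this thread.

```python
import re

# One token per match: a number ($hex or decimal), a symbol name,
# or any other single character treated as an operator.
TOKEN = re.compile(r"\s*(?:(\$[0-9A-Fa-f]+|\d+)|([A-Za-z_@][A-Za-z0-9_]*)|(.))")


class ParseError(Exception):
    """Raised for malformed expressions or undefined symbols."""


def tokenize(text):
    tokens, pos = [], 0
    text = text.strip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        number, name, op = match.groups()
        if number:
            value = int(number[1:], 16) if number[0] == "$" else int(number)
            tokens.append(("num", value))
        elif name:
            tokens.append(("sym", name))
        else:
            tokens.append(("op", op))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens, symbols):
        self.tokens, self.pos, self.symbols = tokens, 0, symbols

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("end", None)

    def next(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):  # expr := term (('+' | '-') term)*
        value = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.next()[1]
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):  # term := factor ('*' factor)*
        value = self.factor()
        while self.peek() == ("op", "*"):
            self.next()
            value *= self.factor()
        return value

    def factor(self):  # factor := number | symbol | '(' expr ')' | '-' factor
        kind, val = self.next()
        if kind == "num":
            return val
        if kind == "sym":
            if val not in self.symbols:
                raise ParseError(f"undefined symbol {val!r}")
            return self.symbols[val]
        if (kind, val) == ("op", "("):
            value = self.expr()
            if self.next() != ("op", ")"):
                raise ParseError("expected ')'")
            return value
        if (kind, val) == ("op", "-"):
            return -self.factor()
        raise ParseError(f"unexpected token {val!r}")


def evaluate(text, symbols):
    parser = Parser(tokenize(text), symbols)
    value = parser.expr()
    if parser.peek()[0] != "end":
        raise ParseError(f"trailing input at token {parser.peek()[1]!r}")
    return value
```

Each grammar rule is one method, so an error can be raised at the exact spot where the input stops making sense, which is where the "better error handling" comes from.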
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: Writing my own assembler

Post by Bregalad »

When I wrote my CompressTool utility which is basically an assembler without opcodes (only .db statements are supported) (*), I programmed everything from scratch. It would have been more complex to learn to use a parser correctly than to write my own, for the simple things I needed to do.

(*) That assembler also has the advantage that it can compress data on the fly, which is quite useful, especially for pointers/references within data. It also supports changing the character mapping, which is a rather easy feature to support, although supporting UTF-8 -> 8-bit character mapping is a bit more tricky.
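
As a rough illustration of the character-mapping step, the sketch below decodes UTF-8 source bytes first and only then looks each character up in a table, so a multi-byte character still becomes a single output byte. The function name encode_text and the example table are hypothetical; this is not how CompressTool is actually implemented.

```python
def encode_text(raw_utf8, charmap):
    """Translate UTF-8 source bytes into one output byte per character.

    charmap is a hypothetical dict from a single character to a value 0..255.
    """
    text = raw_utf8.decode("utf-8")  # raises UnicodeDecodeError on malformed input
    out = bytearray()
    for ch in text:
        if ch not in charmap:
            raise KeyError(f"no mapping for character {ch!r}")
        out.append(charmap[ch])  # ValueError if the mapped value is not 0..255
    return bytes(out)


# Example table: 'é' takes two bytes in UTF-8 but maps to one tile index.
EXAMPLE_MAP = {" ": 0x00, "A": 0x0A, "é": 0x80}
```

The tricky part mentioned above is exactly the decode step: mapping byte by byte would split 'é' into two unrelated lookups.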
cpow
NESICIDE developer
Posts: 1097
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Re: Writing my own assembler

Post by cpow »

Bregalad wrote:It would have been more complex to learn to use a parser correctly than to write my own, for the simple things I needed to do.
I'll stop dropping my opinions on random threads. :beer:
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Writing my own assembler

Post by tokumaru »

Opinions are valuable, even if everyone disagrees with them. :wink:

I agree that the amount of parsing in an assembler is fairly minor, and for those of us who don't already know how to integrate an existing parser, it may actually be faster to code our own. It doesn't have to be "better" than what's available, it just has to do the work you need done.
Drakim
Posts: 97
Joined: Mon Apr 04, 2016 3:19 am

Re: Writing my own assembler

Post by Drakim »

Darn, this thread is just making me itch to take a stab at making my own assembler as well. I just might give it a go. :D

I've had some ideas floating around for a long time, so I figure I might share them here if you are interested. I'm not sure if all of these ideas are realistic; I haven't tried to implement any of them yet.

1. By far my most common label is a @Return: label in front of a nearby RTS instruction. Maybe it's my style of coding, but I find that subroutines always have some branch conditions that exit early. So I realized it would be pretty nifty if I could just write a branch like this:

Code:

LDA MyVar
BEQ RTS
STA MyVar2
And the assembler would just treat a nearby RTS instruction as an on-the-fly label destination (or throw an error if there is none within range), so you wouldn't have to put a @Return: label there. It's not really any new kind of functionality, just a sort of "auto-label" thing to make the code less verbose.

2. Sometimes as a programmer you can do things that the assembler can actually see are stupid. Like putting code in one bank that uses a label from another bank (one that is mapped to the same address range). Or using an absolute-mode instruction with an integer literal instead of a label, which I'm guessing in 99.99% of cases is just the programmer forgetting a # symbol before the literal. It would be neat if the assembler could tell you about such mistakes.

3. Labels do a LOT of different jobs in asm code. They act as entry points for subroutines. They act as starting offsets for data tables. And they act as holders of constant gameplay values. Sometimes I wish there was a way to mark a label with the kind of job it does, and have the assembler throw an error at me if I try to use it in a different way. So you can't JSR GHOST_ID since the label holds a constant value and not the address of a subroutine, and you can't LDA GHOST_INIT since the label holds the address of a subroutine. (Obviously sometimes you need to do tricky things like an RTS trampoline, so there needs to be a way to tell the assembler not to go bananas over it on specific lines.)
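
Ideas 2 and 3 could be sketched roughly like this in Python. Everything here is hypothetical: the Kind enum, the lint function, the rule that constants may only appear as immediates, and the unchecked flag standing in for a per-line "don't go bananas" escape hatch. A real assembler would also have to deal with expressions such as low/high byte operators, which this ignores.

```python
from enum import Enum


class Kind(Enum):
    CODE = "subroutine entry point"
    DATA = "data table"
    CONST = "constant value"


JUMPS = {"JSR", "JMP"}


def lint(mnemonic, operand, kinds, unchecked=False):
    """Return a list of warning strings for one instruction.

    kinds maps label names to the Kind the programmer declared for them.
    """
    if unchecked:  # programmer explicitly asked for no checks on this line
        return []
    immediate = operand.startswith("#")
    name = operand.lstrip("#").split(",")[0]  # drop '#' and any ',X' / ',Y'

    # Idea 2: a bare literal used as an address usually means a missing '#'.
    if name.isdigit() or name.startswith("$"):
        if not immediate and mnemonic not in JUMPS:
            return [f"{mnemonic} {operand}: literal used as an address, missing '#'?"]
        return []

    # Idea 3: check the label's declared kind against how it is used.
    kind = kinds.get(name)
    if kind is None:  # undeclared labels are not checked in this sketch
        return []
    if mnemonic in JUMPS:
        ok = kind is Kind.CODE and not immediate
    elif immediate:
        ok = kind is Kind.CONST
    else:
        ok = kind is Kind.DATA
    return [] if ok else [f"{mnemonic} {operand}: {name} is declared as a {kind.value}"]
```

With GHOST_ID declared as Kind.CONST and GHOST_INIT as Kind.CODE, both JSR GHOST_ID and LDA GHOST_INIT produce a warning, while JSR GHOST_INIT and LDA #GHOST_ID pass silently.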
cpow
NESICIDE developer
Posts: 1097
Joined: Mon Oct 13, 2008 7:55 pm
Location: Minneapolis, MN

Re: Writing my own assembler

Post by cpow »

Drakim wrote:

Code:

LDA MyVar
BEQ RTS
STA MyVar2
Wouldn't one label anywhere in your code suffice?

Code:

AlwaysJustReturnFromThisLabel: RTS
...
LDA MyVar
BEQ AlwaysJustReturnFromThisLabel
STA MyVar2
What am I missing? [Must be something obvious.] Of course, your label could be _RTS: or something. But that depends on [dare I go back to it] whether your parser is able to ignore keywords as part of symbols.
Drakim
Posts: 97
Joined: Mon Apr 04, 2016 3:19 am

Re: Writing my own assembler

Post by Drakim »

cpow wrote:Wouldn't one label anywhere in your code suffice?
Relative branches can only jump a certain distance in your code. The offset is a signed 8-bit value counted from the end of the 2-byte branch instruction (-128 to +127), so measured from the branch opcode itself the reach is -126 to +129 bytes. You definitely won't be able to have one "global" RTS that you reuse all the time. That's why different branches all need to find their own nearby RTS instruction, which is something the assembler could do for you.

You can go anywhere with a vanilla JMP though, so it could use a global RTS.
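
For reference, the range limit falls out of how the operand byte is computed. The helper below is a small Python sketch (branch_offset is a made-up name, not part of any existing assembler) of what an assembler does when it encodes a relative branch:

```python
def branch_offset(branch_addr, target_addr):
    """Return the operand byte for a 6502 relative branch.

    The displacement is relative to the address of the instruction that
    follows the 2-byte branch, and must fit in a signed 8-bit value.
    """
    offset = target_addr - (branch_addr + 2)
    if not -128 <= offset <= 127:
        raise ValueError(
            f"branch at ${branch_addr:04X} cannot reach ${target_addr:04X} "
            f"(offset {offset})"
        )
    return offset & 0xFF  # two's-complement encoding of the signed offset
```

A branch to itself encodes as $FE, and the farthest reachable targets sit 126 bytes before and 129 bytes after the branch opcode.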
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Writing my own assembler

Post by tokumaru »

Yeah, branches have extremely limited reach.
Drakim wrote:You can go anywhere with a vanilla JMP though, so it could use a global RTS.
Why would you JMP to an RTS and waste 3 bytes and 3 cycles if you can RTS on the spot with 1 byte?
Drakim
Posts: 97
Joined: Mon Apr 04, 2016 3:19 am

Re: Writing my own assembler

Post by Drakim »

tokumaru wrote:Why would you JMP to an RTS and waste 3 bytes and 3 cycles if you can RTS on the spot with 1 byte?
Haha, good point! I was only thinking in terms of cpow's idea for a global RTS, which is possible for a JMP, but as you say, utterly pointless.
Hangin10
Posts: 37
Joined: Thu Jun 04, 2009 9:07 am

Re: Writing my own assembler

Post by Hangin10 »

Drakim wrote:1. By far my most common label is a @Return: label in front of a nearby RTS instruction. Maybe it's my style of coding, but I find that subroutines always have some branch conditions that exit early. So I realized it would be pretty nifty if I could just write a branch like this:

Code:

LDA MyVar
BEQ RTS
STA MyVar2
And the assembler would just treat a nearby RTS instruction as an on-the-fly label destination (or throw an error if there is none within range), so you wouldn't have to put a @Return: label there. It's not really any new kind of functionality, just a sort of "auto-label" thing to make the code less verbose.
What would be less ideal about an opposite branch over RTS right there (maybe hide it in a macro)?
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Writing my own assembler

Post by tokumaru »

I have tons of "return" labels myself, but I don't think I'd create this kind of exception just to save a little bit of typing.

I'm not opposed to verbosity in general, I'm opposed to error-prone verbosity and redundant verbosity.
Drakim
Posts: 97
Joined: Mon Apr 04, 2016 3:19 am

Re: Writing my own assembler

Post by Drakim »

Hangin10 wrote:What would be less ideal about an opposite branch over RTS right there (maybe hide it in a macro)?
I'll write an example to demonstrate how things become less verbose:

Code:

DoTheMario:
  LDA DancerId
  CMP #MarioId
  BNE @Return
  ; Lots of dancing code here
  @Return:
  RTS
Turns into...

Code:

DoTheMario:
  LDA DancerId
  CMP #MarioId
  BNE RTS
  ; Lots of dancing code here
  RTS
Or if the dancing code is so big that the branch can't reach the RTS at the end, we might have to put an RTS above our label, which is super annoying depending on your assembler. Some assemblers can make it easier, but this is the best you can get:

Code:

  -Return:
  RTS
DoTheMario:
  LDA DancerId
  CMP #MarioId
  BNE -Return
  ; Too much dancing code here for a branch jump
  RTS
Turns into...

Code:

  RTS
DoTheMario:
  LDA DancerId
  CMP #MarioId
  BNE RTS
  ; Too much dancing code here for a branch jump
  RTS
It's no big revolution, but it eliminates some "braindead" labels. I'm not sure if you could implement the same with a macro?
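
Inside the assembler itself, resolving BNE RTS could look something like the Python sketch below. It assumes the assembler already knows the address of every RTS instruction it has emitted (6502 instruction sizes are fixed, so a first pass can lay everything out); the function name resolve_branch_to_rts is made up for this example.

```python
def resolve_branch_to_rts(branch_addr, rts_addrs):
    """Pick the RTS closest to a branch at branch_addr, or raise if none is reachable.

    rts_addrs holds the addresses of actual RTS instructions, not of stray
    $60 bytes that might just be operands or data.
    """
    lowest, highest = branch_addr - 126, branch_addr + 129  # reach of a relative branch
    candidates = [addr for addr in rts_addrs if lowest <= addr <= highest]
    if not candidates:
        raise ValueError(f"no RTS within branch range of ${branch_addr:04X}")
    # Distance is measured from the instruction after the branch, which is
    # what the encoded offset is relative to; ties go to the lower address.
    return min(candidates, key=lambda addr: (abs(addr - (branch_addr + 2)), addr))
```

Both examples above are covered: the RTS at the end of DoTheMario is chosen when it is in range, and otherwise the one placed just above the label is picked up.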
Hangin10
Posts: 37
Joined: Thu Jun 04, 2009 9:07 am

Re: Writing my own assembler

Post by Hangin10 »

Never mind me. I get it now and could see it going either way.
Rahsennor
Posts: 479
Joined: Thu Aug 20, 2015 3:09 am

Re: Writing my own assembler

Post by Rahsennor »

Drakim wrote:I'm not sure if you could implement the same with a macro?
Fasm can use its load operator to hunt for a suitable byte within range.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Writing my own assembler

Post by tepples »

Hangin10 wrote:What would be less ideal about an opposite branch over RTS right there (maybe hide it in a macro)?
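Loss of 1 byte, plus loss of 1 cycle whenever the no-return path runs, because the inverted branch is then taken (3 cycles) rather than not taken (2 cycles). The return path gains that cycle back, so it is a net loss only if the no-return case is more likely than the return case. An assembler-level "find the nearest RTS instruction" feature would make it almost as convenient and efficient as conditional return on an 8080, LR35902, Z80, or ARM.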