DISASM6 v1.5 - NES-oriented disassembler producing asm6 code

loopy
Posts: 405
Joined: Sun Sep 19, 2004 10:52 pm
Location: UT

Post by loopy »

Tepples is right. Whether you enter 6, $6, $06, $0006, or $000000006, asm6 sees it as just a number. If it's <256, a ZP instruction is emitted. I'm open to the idea of having a forced absolute address. Any suggestions on syntax?

Until something is implemented, making a macro is an easy workaround (as Dwedit already suggested)
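
A minimal Python sketch of the behavior described above, not asm6's actual source: every spelling of the operand collapses to one integer before the addressing mode is picked. The helper names `parse_number` and `encode_lda` are made up for illustration; $A5 (LDA zero page) and $AD (LDA absolute) are the standard 6502 opcodes.

```python
def parse_number(text):
    """Parse decimal or $-prefixed hex; leading zeros carry no meaning."""
    text = text.strip()
    if text.startswith("$"):
        return int(text[1:], 16)
    return int(text, 10)


def encode_lda(value):
    """Pick the addressing mode from the numeric value alone."""
    if value < 0x100:
        return bytes([0xA5, value])  # LDA zero page
    # LDA absolute, operand stored little-endian
    return bytes([0xAD, value & 0xFF, (value >> 8) & 0xFF])


# All five spellings from the post reach the encoder as the same integer.
for spelling in ("6", "$6", "$06", "$0006", "$000000006"):
    print(spelling, encode_lda(parse_number(spelling)).hex(" "))
```

Because the digit count is gone by the time `encode_lda` runs, a forced absolute mode has to be signalled some other way than by padding the literal.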
frantik
Posts: 377
Joined: Tue Mar 03, 2009 3:56 pm

Post by frantik »

the most intuitive way would be if it's something like LDA $00FF it uses absolute but if it's LDA $FF then use ZP.

if it has to be some kind of special syntax, maybe quote marks? to say, "yes i really mean this!!"

LDA "$00FF"

but i think being able to leave them out would make more sense.. perhaps some kind of directive to enable/disable the mode?

it's not something the average programmer needs to worry about but for disassembling it would be nice to represent all valid lines of code without resorting to hex/db/etc

or maybe something like

LDA word($00FF)
LDA byte($00FF)

to force ZP or ABS
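
To make the word()/byte() idea concrete, here is a rough Python sketch of how an operand parser could honor the wrappers. This is hypothetical syntax from the proposal above, not something asm6 is known to implement, and the names `OPERAND`, `parse_value`, and `encode_lda` are invented for the example.

```python
import re

# Either word(...) / byte(...) around an expression, or a bare expression.
OPERAND = re.compile(r"^(?:(word|byte)\((.+)\)|(.+))$")


def parse_value(text, labels):
    """Resolve a $hex literal, a decimal literal, or a label name."""
    text = text.strip()
    if text.startswith("$"):
        return int(text[1:], 16)
    if text.isdigit():
        return int(text)
    return labels[text]


def encode_lda(operand, labels=None):
    labels = labels or {}
    match = OPERAND.match(operand.strip())
    wrapper, inner, bare = match.group(1), match.group(2), match.group(3)
    value = parse_value(inner if wrapper else bare, labels)
    if wrapper == "byte" and value > 0xFF:
        raise ValueError("byte() operand does not fit in zero page")
    # Without a wrapper, fall back to the size-based choice.
    use_zp = wrapper == "byte" or (wrapper is None and value < 0x100)
    if use_zp:
        return bytes([0xA5, value])  # LDA zero page
    return bytes([0xAD, value & 0xFF, (value >> 8) & 0xFF])  # LDA absolute


print(encode_lda("word($00FF)").hex(" "))
print(encode_lda("byte($00FF)").hex(" "))
```

The wrapper works the same for labels and literals, which is the main point in its favor compared with counting digits.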


(btw for those who missed it at the bottom of page 2 i updated the code :))
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

frantik wrote:the most intuitive way would be if it's something like LDA $00FF it uses absolute but if it's LDA $FF then use ZP.
Remember that unlike disassembled code, normal programs rarely (ideally, never) use explicit addresses, they use labels, so this is not an option.

I'm not sure what the best solution is... Some assemblers use something like "LDA.W Variable" for this, don't they? It looks weird though, since with the 6502 this is really the only case where you'd ever want to force an addressing mode or the other. I would like this feature, but I have no idea about the best syntax.

In my own programs I usually .db the opcode and then .dw the label (and put the actual instruction as a comment), or I add $800 to the address in order to use a mirror of ZP at $800-$8FF (this would definitely not be OK in a disassembly).
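
Both workarounds can be checked with a few lines of Python. This is only an illustration of the resulting bytes, assuming the standard NES layout where the 2 KB of internal RAM repeats through $1FFF; `db_dw`, `nes_ram_index`, and `zp_var` are hypothetical names.

```python
LDA_ABS = 0xAD  # opcode for LDA absolute


def db_dw(opcode, address):
    """Bytes produced by `.db opcode` followed by `.dw address`."""
    return bytes([opcode, address & 0xFF, (address >> 8) & 0xFF])


def nes_ram_index(cpu_address):
    """Map a CPU address in $0000-$1FFF to the internal RAM byte it reaches."""
    if not 0 <= cpu_address < 0x2000:
        raise ValueError("not in the internal RAM region")
    return cpu_address & 0x07FF


zp_var = 0x10  # some zero page variable

# Workaround 1: hand-assemble the absolute form of the instruction.
print(db_dw(LDA_ABS, zp_var).hex(" "))

# Workaround 2: point at the mirror so the address no longer fits in a byte.
print(db_dw(LDA_ABS, zp_var + 0x800).hex(" "))
print(nes_ram_index(zp_var + 0x800) == nes_ram_index(zp_var))
```

The second form reads the same RAM byte but encodes a different operand, which is why it cannot reproduce an original ROM byte for byte.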
frantik
Posts: 377
Joined: Tue Mar 03, 2009 3:56 pm

Post by frantik »

tokumaru wrote:
frantik wrote:the most intuitive way would be if it's something like LDA $00FF it uses absolute but if it's LDA $FF then use ZP.
Remember that unlike disassembled code, normal programs rarely (ideally, never) use explicit addresses, they use labels, so this is not an option.
yeah it would only work with explicit values not labels.. some kind of way to force word/byte seems like the next best option.. either

LDA.W label
LDA word(label)

seem to make the most sense. i kinda like the 2nd one better myself
tomaitheous
Posts: 592
Joined: Thu Aug 28, 2008 1:17 am

Post by tomaitheous »

If you change the behavior so that $00ff is read as ABS and $ff is read as ZP, wouldn't that break compatibility with existing assembler projects?

Why not a switch at the start of the source file to tell the assembler how to treat addressing modes and labels/values? And there's already a symbol mechanism for selecting the high/low byte of a word or label, right?
3gengames
Formerly 65024U
Posts: 2284
Joined: Sat Mar 27, 2010 12:57 pm

Post by 3gengames »

Maybe an -optimizezp switch would help? Or maybe something inside the assembler that assumes ZP optimization by default, and when it's not wanted/needed, something like a .nozpoptimize flag inside the assembler?


Just my ideas, nice to see this being added! :D
frantik
Posts: 377
Joined: Tue Mar 03, 2009 3:56 pm

Post by frantik »

added initial support for fceudx data/code maps.. actually pretty cool :D.. working on other changes too and will release the new version later
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Post by koitsu »

loopy wrote:Tepples is right. Whether you enter 6, $6, $06, $0006, or $000000006, asm6 sees it as just a number. If it's <256, a ZP instruction is emitted. I'm open to the idea of having a forced absolute address. Any suggestions on syntax?

Until something is implemented, making a macro is an easy workaround (as Dwedit already suggested)
asm6 is trying to do code optimisation without the programmer's consent. This is, for lack of a better word, inappropriate.

Let me make it crystal clear so there are no misunderstandings:

* If someone types "LDA $00FF", the assembled result should be AD FF 00.
* If someone types "LDA $FF", the assembled result should be A5 FF.
* If someone types "label EQU $00FF" and "LDA label", the assembled result should be AD FF 00.
* If someone types "label EQU $FF" and "LDA label", the assembled result should be A5 FF.

This is how all classic Apple II assemblers (Merlin 8, Merlin 16, and ORCA/M) did it. The reason it was done that way: assembly code, when written by a person, by nature should be KISS -- that is to say, the assembler should not try to do "smart things" without the programmer's prior consent. Nobody is going to get me to bend on this one, period. Assembly is a different beast than higher level languages.

If you truly disagree with this, then there's a logical alternative: add a flag (proposals: -O (dash capital oh, a la cc/gcc), -fXXX (a la gcc), or -optimizezp (but please support -optimisezp for those of us who spell normally ;) )) which enables such optimisations. Without the flag, it should behave like I described above. And please be sure to explain in the README what the flag does/affects.
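
For reference, the four cases listed above can be modeled in Python roughly like this. It is a sketch of the proposed rule only, with invented names (`hex_literal`, `encode_lda`, and the labels `wide` and `narrow`); decimal and computed values are deliberately left out because the rule as stated does not say how to size them.

```python
def hex_literal(text):
    """Return (value, is_word) for a $-prefixed hex literal.

    is_word is True when the literal was written with more than two digits.
    """
    digits = text.strip()[1:]
    return int(digits, 16), len(digits) > 2


def encode_lda(operand, labels):
    """Encode LDA; a label inherits the width of the literal in its EQU."""
    if operand in labels:
        value, is_word = labels[operand]
    else:
        value, is_word = hex_literal(operand)
    if is_word:
        return bytes([0xAD, value & 0xFF, (value >> 8) & 0xFF])  # absolute
    return bytes([0xA5, value])  # zero page


labels = {
    "wide": hex_literal("$00FF"),   # wide EQU $00FF
    "narrow": hex_literal("$FF"),   # narrow EQU $FF
}
for operand in ("$00FF", "$FF", "wide", "narrow"):
    print(operand, encode_lda(operand, labels).hex(" "))
```

Note that the symbol table has to carry a width next to each value, which is the extra state this rule asks of the assembler.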
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

loopy wrote:Tepples is right. Whether you enter 6, $6, $06, $0006, or $000000006, asm6 sees it as just a number. If it's <256, a ZP instruction is emitted. I'm open to the idea of having a forced absolute address. Any suggestions on syntax?
A 65816-inspired suggestion:

lda $00FF,d  ; generates zero page
lda $00FF,b  ; generates absolute
The reason for the names 'D' and 'B' becomes clear when you consider how these instructions operate on a 65816: address,d is an 8-bit offset from D (direct page base pointer, frozen at 0 on 6502; compare TAD instruction), and address,b is a 16-bit offset from B (data bank base pointer; compare PLB instruction). A directive would switch between ,d and ,b being the default for labels less than $100.
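
A quick Python sketch of how the ,d and ,b suffixes plus a default-selecting directive might be handled. The syntax is only the suggestion above, the `default_small` parameter stands in for the directive, and the sketch ignores the real ,x and ,y index suffixes.

```python
def encode_lda(operand, default_small="d"):
    """Encode LDA with an optional ,d (direct page) or ,b (absolute) suffix.

    default_small is what an unsuffixed address below $100 falls back to.
    """
    text = operand.strip().lower()
    suffix = None
    if text.endswith((",d", ",b")):
        text, suffix = text[:-2], text[-1]
    value = int(text.lstrip("$"), 16)
    if suffix is None:
        suffix = default_small if value < 0x100 else "b"
    if suffix == "d":
        if value > 0xFF:
            raise ValueError("address does not fit in the direct page")
        return bytes([0xA5, value])  # LDA zero page
    return bytes([0xAD, value & 0xFF, (value >> 8) & 0xFF])  # LDA absolute


print(encode_lda("$00FF,d").hex(" "))
print(encode_lda("$00FF,b").hex(" "))
print(encode_lda("$00FF", default_small="b").hex(" "))
```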

And if you extend this 65816 analogy, setting the CPU type to 65816 would unlock another syntax:

lda far $00FF  ; generates long absolute
Thus 'jmp far' and 'jsr far' would become synonyms for 'jml' and 'jsl'.

I don't like frantik's suggestion of .W, as it too closely resembles the syntax for data size, not address size, in 68K (MOVE.W) and ARM (LDRH, LDRB) assembly language.
koitsu wrote:* If someone types "LDA $00FF", the assembled result should be AD FF 00.
* If someone types "LDA $FF", the assembled result should be A5 FF.
[...]
This is how all classic Apple II assemblers (Merlin 8, Merlin 16, and ORCA/M) did it.
As I understand it, the mini-assemblers in Integer BASIC, the enhanced IIe monitor, and the IIGS monitor acted this way, but only because they didn't support labels or decimal numbers. How would the assembler treat decimal labels (e.g. label EQU 120)?
koitsu wrote:* If someone types "label EQU $00FF" and "LDA label", the assembled result should be AD FF 00.
* If someone types "label EQU $FF" and "LDA label", the assembled result should be A5 FF.
So you're choosing a data type for the label based on the length of the constant's literal value. That sort of resembles the distinction between near and far pointers on 8086. How would it treat computed labels (e.g. label EQU somebase + someoffset)?
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

koitsu wrote:* If someone types "label EQU $00FF" and "LDA label", the assembled result should be AD FF 00.
* If someone types "label EQU $FF" and "LDA label", the assembled result should be A5 FF.
There are much better ways to declare variables than using EQU. I always use ENUM and DSB for that, so that rearranging them in the future doesn't require changing a shitload of EQUs. Also, I might want to access a certain variable using ZP addressing most of the time, but in a timed section of my code I might need to access it a bit slower. Your solution doesn't work for these cases.

IMO, the addressing mode shouldn't be decided based on the declaration at all. Using EQU to declare variables is not considered a good practice anymore, and there might be the need to access the same variable using different addressing modes. This selection should be made on an instruction-per-instruction basis, and it should be possible to select a default behavior.
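
A small Python illustration of the timed-code case: the same variable encoded both ways, with the standard 6502 cycle counts for LDA (3 for zero page, 4 for absolute). The `force_abs` parameter and the `scroll_x` variable are hypothetical stand-ins for whatever per-instruction syntax ends up being chosen.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Encoding:
    data: bytes
    cycles: int


def lda(address, force_abs=False):
    """Encode LDA for one instruction, optionally forcing absolute mode."""
    if address < 0x100 and not force_abs:
        return Encoding(bytes([0xA5, address]), 3)  # zero page: 3 cycles
    operand = bytes([address & 0xFF, (address >> 8) & 0xFF])
    return Encoding(bytes([0xAD]) + operand, 4)     # absolute: 4 cycles


scroll_x = 0x20  # one variable, declared once

fast = lda(scroll_x)                  # normal access
slow = lda(scroll_x, force_abs=True)  # same variable inside a timed loop
print(fast.data.hex(" "), fast.cycles)
print(slow.data.hex(" "), slow.cycles)
```

The declaration of `scroll_x` never changes; only the individual instruction decides which form it gets.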
frantik
Posts: 377
Joined: Tue Mar 03, 2009 3:56 pm

Post by frantik »

tepples wrote: I don't like frantik's suggestion of .W, as it too closely resembles the syntax for data size, not address size, in 68K (MOVE.W) and ARM (LDRH, LDRB) assembly language.
tokumaru mentioned it :-p
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Post by koitsu »

Regarding how older assemblers like Merlin 8/16 and ORCA/M did it: they both differ in methodology, and it turns out my memory has failed me once again. People aren't going to like the Merlin 8/16 method, believe me (and I don't approve of it either, but it's how it worked). I'll document both for everyone's sake.

Merlin 8/16 (quoting the manual):
There is no difference in syntax for zero page and absolute modes. The assembler automatically uses zero page mode when appropriate. Merlin 8/16 provides the ability to force non-zero page addressing. The way to do this is to add anything except D in Merlin 8, or L in Merlin 16, to the end of the opcode. Example:

LDA $10 assembles as zero page (2 bytes: A5 10)

while,

LDA: $10 assembles as non-zero page (3 bytes: AD 10 00)


Also, in the indexed indirect modes, only a zero page expression is allowed, and the assembler will give an error message if the "expr" does not evaluate to a zero page address.

...

Merlin 8/16 will decide the legality of the addressing mode for any given opcode.
Be sure to note the colon in "LDA:" above. You could literally say "LDAX $10" and accomplish the same thing. When the manual says "add anything", they mean "add any character of your choice". Yup, you read that correctly. And please don't start aspie'ing over what the assembler would do with something like "LDA:" (is that a label or an opcode, etc...). The simple answer is: Don't Do That(tm).
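
The rule quoted from the manual can be sketched in Python as below. This models only what the passage says: any extra character after the three-letter opcode forces absolute, except D (Merlin 8) or L (Merlin 16). What the exempt character itself does is not covered by the quote, so the sketch simply leaves that case unforced. The function names are invented.

```python
def parse_opcode_field(field, exempt="D"):
    """Split an opcode field into (mnemonic, force_abs) per the quoted rule."""
    base = field[:3].upper()
    extra = field[3:4].upper()
    force_abs = extra != "" and extra != exempt
    return base, force_abs


def encode_lda(field, value, exempt="D"):
    base, force_abs = parse_opcode_field(field, exempt)
    if base != "LDA":
        raise ValueError("this sketch only knows LDA")
    if value < 0x100 and not force_abs:
        return bytes([0xA5, value])  # zero page, 2 bytes
    return bytes([0xAD, value & 0xFF, (value >> 8) & 0xFF])  # absolute, 3 bytes


for field in ("LDA", "LDA:", "LDAX"):
    print(field, encode_lda(field, 0x10).hex(" "))
```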

I'll provide ORCA/M details later, I'll need to go down to the garage and dig through boxes to find the manual. Expect me to edit this post once I find it.
frantik
Posts: 377
Joined: Tue Mar 03, 2009 3:56 pm

Post by frantik »

i generally agree with koitsu that if a label is defined as a word, it should be treated as a word...

Label_00FF = $00FF
enum $A0
Label_00A0 .word 0000
enum
Label_00C0 EQU $00C0

all of the above should be treated as non-ZP


if that's not possible, something like this makes sense to me, but it's not really asm friendly

LDA word(label)
LDA byte($00FF)

or maybe just

LDA .word label


btw it might make sense to split the topic since most of the discussion is about asm6, not disasm6 :D
tomaitheous
Posts: 592
Joined: Thu Aug 28, 2008 1:17 am

Post by tomaitheous »

frantik wrote:LDA word(label)
LDA byte($00FF)
Ouu :3 I like that

Reminds me of lda high(label or address or whatever) or lda low( same ) in PCEAS (NESASM too, right?). Looks more natural than appending .b or .w, which is more of a 68k style of assembler syntax.
frantik
Posts: 377
Joined: Tue Mar 03, 2009 3:56 pm

Post by frantik »

released v1.2 which adds support for FCEUDX code/data logs and user definable labels.

i guess now i have to start looking into how to handle mappers :\


to define labels use the format

label = $0000

comments starting with a semicolon and invalid lines are ignored. Decimal (no modifier), hex ($) and binary (%) supported
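
For anyone generating label files from another tool, here is a rough Python reader for the format described above. It is not the disassembler's own parser; the allowed label characters and the handling of a comment after a definition on the same line are assumptions, and `parse_labels` is a made-up name.

```python
import re

# name = value, where value is decimal, $hex, or %binary
LINE = re.compile(r"^\s*([A-Za-z_]\w*)\s*=\s*(\$[0-9A-Fa-f]+|%[01]+|\d+)\s*$")


def parse_labels(text):
    """Return {label: address}; comments and invalid lines are skipped."""
    labels = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0]  # assumption: ';' also ends a definition
        match = LINE.match(line)
        if not match:
            continue  # blank, comment-only, or invalid line
        name, number = match.groups()
        if number.startswith("$"):
            value = int(number[1:], 16)
        elif number.startswith("%"):
            value = int(number[1:], 2)
        else:
            value = int(number)
        labels[name] = value
    return labels


sample = """\
; player state
PlayerX = $0010
Flags = %1010
Lives = 3
this line is not valid
"""
print(parse_labels(sample))
```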