
All times are UTC - 7 hours





PostPosted: Wed Oct 17, 2018 11:00 am 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3676
Location: Mountain View, CA
rainwarrior wrote:
Anyhow, sorry this seems to be quite a digression. The discussion was talking about appropriate practice for using < or z: etc. because the question was how does ca65 know about sizes of values. What a theoretical assembler should do is kind of a completely different problem.

I can talk about non-theoretical assemblers if you want, and give actual screenshots of concrete documentation showing exactly what they do. But maybe not here?

This general subject comes up (re: operand lengths + addressing modes, and how the assemblers decide which to use, and how they "determine lengths") every few months. It will continue to come up as long as we have new people coming into the scene and learning how assemblers behave.


PostPosted: Wed Oct 17, 2018 11:29 am 

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6948
Location: Canada
Well, one way I could interpret what you said that would make sense to me is:
"I want all instruction mnemonics to have an explicit operand size (suffix), and I want the shortest mnemonic (no suffix) to be ZP size."

Is that what you meant by default? Like I can understand the desire for an assembly syntax with only explicit operand size and no assumptions. It definitely wouldn't be my preference, at least not for 6502, but I could understand someone wanting that.

(I guess NESASM is technically that but with ABS as the base mnemonic and < as an operand prefix instead of an instruction suffix... it has other problems surrounding the issue, but it is at least explicit in this respect.)

If you want to show examples, I wouldn't mind, but really I just wanted to know what you meant, the question of "what assemblers do X" was just a means to try and understand that via example, if one exists.


PostPosted: Wed Oct 17, 2018 1:55 pm 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3676
Location: Mountain View, CA
tokumaru generally summed up what I meant. I've done a large write-up about all of this verbosely, but decided to save it to a txt file for a rainy day, or more likely a wiki page.


PostPosted: Wed Oct 17, 2018 2:13 pm 

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6948
Location: Canada
Forgive me, but the statement that confused me was: "forcing ZP addressing through syntax semantics is the backwards approach to take -- instead, default to ZP and forcing absolute/16-bit addressing through semantics."

That's all I was asking for clarification on, and I'm not quite sure what kind of answer you would have typed out that needs to be a large write-up... I wasn't really asking you to prove that some assembler does it, I just asked in case an example would help.


"forcing ZP addressing through syntax semantics"

I have a hard time understanding which methods this is referring to. Are you describing an assembler, or a programming convention, and if it's a programming convention is it for ca65 or are you talking about many assemblers at once? Is this about < as a ZP forcing prefix? Is this about z:? Is this about : zeropage segments? I don't know which ones are covered by the statement, and that's what I was hoping you'd clarify.


"default to ZP and forcing absolute/16-bit addressing through semantics"

How do you force it through semantics? Instruction suffix? a: prefix? What does "default" mean in this context? Again, I'm not sure if you're talking about how you want an assembler to behave, or how a programmer should write code, or whether this is for ca65 or if you're talking about something more general.


I was genuinely confused by both halves of this statement, and really I just wanted to know what you were trying to express with it. It might sound like I was trying to shoot it down, maybe the way I talk comes across like that (apologies, if it does) but if you had something good to share in there I was hoping I could get you to spell it out with my questions, because I really couldn't parse that sentence as it was. I've made two guesses as to what you meant and you haven't really directly confirmed if either of those guesses were correct, but it sounds like not? (Tokumaru's post doesn't really make the connection to your sentence for me, either, sorry.)

...and if it really is a thing that requires too much effort to explain, or you don't want to explain for whatever reason, that's OK too, but it feels like maybe you felt I was pushing you in a way that I did not at all hope to.


PostPosted: Wed Oct 17, 2018 2:55 pm 

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3676
Location: Mountain View, CA
Okay, here's the short version:

"Default to ZP and forcing absolute/16-bit addressing through semantics" means:

Assuming the addressing mode in question supports ZP addressing: if the calculated effective address is $0000 to $00ff, then use ZP addressing with the low byte of the effective address. Example: for lda $00ee, assemble to a5 ee. The only exception to this rule is if the user explicitly specifies they wanted absolute/16-bit addressing through syntactical sugar; common prefixes are ! or a:. Example: lda !$00ee would assemble to ad ee 00.
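The rule above can be sketched in Python as a tiny operand-size chooser. This is only an illustration of the described behaviour, not code from any real assembler: the function name `assemble_lda` and its `force_absolute` flag are made up for the example, and only the two LDA opcodes from the post ($A5 zero page, $AD absolute) are modelled.

```python
# Illustrative sketch only: models the "default to ZP" rule for LDA.
# assemble_lda and force_absolute are hypothetical names, not a real API.

LDA_ZP = 0xA5   # LDA zero page
LDA_ABS = 0xAD  # LDA absolute


def assemble_lda(address: int, force_absolute: bool = False) -> bytes:
    """Return the machine code bytes for `lda address`."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address out of 16-bit range")
    if address <= 0xFF and not force_absolute:
        # Effective address fits in the zero page: use the short form.
        return bytes([LDA_ZP, address])
    # Absolute form: opcode, then the address with the low byte first.
    return bytes([LDA_ABS, address & 0xFF, address >> 8])


print(assemble_lda(0x00EE).hex(" "))                       # lda $00ee
print(assemble_lda(0x00EE, force_absolute=True).hex(" "))  # lda !$00ee
```

The explicit flag stands in for a prefix such as ! or a:, so the short form is what you get unless you ask otherwise.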

"Forcing ZP addressing through syntax semantics" could mean anything because it depends on the assembler. I intentionally worded this nebulously because "syntax semantics" vary per assembler. That said: I'll give you two examples (one I know of factually, the other might be true but I haven't checked): 1) using < to refer to the low byte of an effective address when combined with some opcodes could result in ZP addressing, ex. lda <$12ee might assemble to a5 ee, and 2) z: prefix described earlier (if that's true).

I would love to see a non-obfuscated example (meaning: I don't want to see 20 lines of macros and weird variable equate magic and complex stuff) where defaulting to 16-bit/absolute addressing, when the effective address is $0000 to $00ff, is preferred. I'll add that at least one Apple II assembler (ORCA/M) actually does this (defaults to absolute), HOWEVER, that's an assembler intended for 65816 (using it to write 6502 actually requires you to specify "sizes" on a lot of operands solely for that reason). ORCA/M's manual actually outlines very clearly what the assembler does with all the different syntaxes, in a nice chart that spreads across 2 or 3 pages. (My aforementioned txt file was basically a version of this but WRT 6502 and what we're discussing here; there are several effective addresses that could be assembled multiple ways, which I noted as "difficult" or "I can't decide").


PostPosted: Wed Oct 17, 2018 3:17 pm 

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6948
Location: Canada
To clarify the difference between < and z: and a: in ca65:

< takes the low byte of the operand. Will produce a ZP instruction, but there's no protection if your label actually needed ABS.

z: verifies that the operand is a ZP address, and throws an error if it isn't. It's verification of correctness, not forcing.

a: promotes a ZP operand to ABS size, or keeps an ABS operand at the same size. (i.e. for when you don't want the normal behaviour of assuming ZP where it can be assumed.)
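The three behaviours, as described in this post, can be modelled in a few lines of Python. This is a rough sketch for comparison only; the function names are hypothetical and are not ca65 internals.

```python
# Illustrative model of the three operand forms described above.
# The function names are hypothetical; they are not ca65 internals.


def low_byte(value: int) -> int:
    """Like `<`: keep the low 8 bits, silently discarding the rest."""
    return value & 0xFF


def check_zeropage(value: int) -> int:
    """Like `z:`: accept the value only if it is a zero-page address."""
    if not 0 <= value <= 0xFF:
        raise ValueError(f"${value:04X} is not a zero-page address")
    return value


def force_absolute(value: int) -> bytes:
    """Like `a:`: always produce a 16-bit operand, low byte first."""
    return bytes([value & 0xFF, (value >> 8) & 0xFF])


print(hex(low_byte(0x12EE)))             # truncated without complaint
print(force_absolute(0x00EE).hex(" "))   # ZP address kept at 16-bit size
try:
    check_zeropage(0x12EE)
except ValueError as err:
    print("error:", err)
```

The point of the comparison is that only the z: style check can tell you that a label was never in zero page to begin with.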


So I guess ca65's behaviour is actually what you were describing, as long as you're not using < to mean "ZP instruction", which a lot of people were used to doing from NESASM, and I think is not a good habit to cultivate, since z: instead will help protect against errors. ...but in general there's very little need to use z:, it's really just a tool for checking that you're really getting the addressing you want in the cases where it matters. (< is a good operator for its purpose of taking a low byte, but not a substitute for z:, IMO.)


Looking back I realize OP didn't explicitly mention ca65, but I knew previously that it's what they're working with, so maybe that was a strongly assumed context on my part that others didn't have.


PostPosted: Wed Oct 17, 2018 3:55 pm 

Joined: Wed Aug 16, 2017 12:15 am
Posts: 40
Location: Finland
samophlange wrote:
The first thing I realized is that I don't know how to handle constants. Specifically, I don't know how to write code that can tell if a constant is a single byte vs two bytes and then treat it appropriately.

Perhaps the following thoughts help those who aren't very experienced with assemblers. It may be off topic or self-evident to you, but I didn't really stop to think about it until quite recently. Note: I haven't really studied compiler/assembler theory.

Most assemblers make no distinction between what the programmer calls "variables", "labels" or "constants". They are all just constants - they have a name and a value. (Edit: they have different syntax but I think they're internally the same.)
- "Constants" are simple: the value is just a value and nothing else.
- The value of a "label" is an address; the assembler knows both the address and what's stored there (assuming ROM and no bankswitching).
- The value of a "variable" is an address, too, but the assembler only knows the address, not what's stored there (because the assembler does not simulate what the 6502 program does).
The assembler is capable of very complex arithmetic but only at assembly time; at run time, the 6502 won't and can't distinguish the program from a manually-written one.
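As a rough Python illustration of that idea (not how any particular assembler is implemented), a symbol table can be a plain mapping from names to numbers. The symbol names and addresses below are invented for the example.

```python
# Illustrative only: every symbol is just a name plus a number.
# The names and addresses are made up for this example.
symbols = {}

symbols["SCREEN_WIDTH"] = 32   # a "constant": just a value
symbols["player_x"] = 0x0010   # a "variable": an address in RAM
symbols["reset"] = 0x8000      # a "label": an address in ROM

# Arithmetic happens at assembly time; only the result reaches the output.
operand = symbols["player_x"] + 1
print(f"sta ${operand:02X}")

for name, value in symbols.items():
    print(f"{name} = ${value:04X}")
```

Nothing in the table records which kind of symbol each entry is; that distinction lives only in the programmer's head.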

_________________
My NES utilities and programs on GitHub


PostPosted: Wed Oct 17, 2018 4:44 pm 

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10961
Location: Rio de Janeiro - Brazil
rainwarrior wrote:
z: verifies that the operand is a ZP address, and throws an error if it doesn't. It's verification of correctness, not forcing.

It is forcing if the address can't be calculated at the time the instruction is assembled. Kinda. It'll still throw an error if the address is ultimately found to not fit in 8 bits, of course.


PostPosted: Wed Oct 17, 2018 7:03 pm 

Joined: Sun Apr 08, 2018 11:45 pm
Posts: 18
Location: Southern California
I think I've only understood half of what has been talked about in this thread, but I still know twice as much as I did before! :lol:

qalle wrote:
Most assemblers make no distinction between what the programmer calls "variables", "labels" or "constants". They are all just constants - they have a name and a value. (Edit: they have different syntax but I think they're internally the same.)


Yeah, the sticking point for me is that the number of bytes used for a variable changes how you work with the variable (and maybe how you interpret the value if it is signed), but constants don't have a "size" that you can determine apart from their value. This is compounded by the fact that I'm new to both the language and the capabilities of ca65.

I tried two approaches that I thought would work, but resulted in an error. Then I tried another approach that I assumed would give me an error but actually worked.

This yielded the error "Error: Constant expression expected"

Code:
tempw0: .res 2
CONSTANT_1 = 1

.macro u16_set_from_constant setvar, constant
.if (constant >= 0 && constant <= 255)
    lda #<constant
    sta setvar+0
    lda #$00
    sta setvar+1
.elseif (constant > 255 && constant <= 65535)
    lda #<constant
    sta setvar+0
    lda #>constant
    sta setvar+1
.else
    .error "constant is out of range"
.endif
.endmacro

; macro called from some unit test subroutine
u16_set_from_constant tempw0, CONSTANT_1



This approach gave the error "Error: Size of `CONSTANT_1' is unknown"

Code:

tempw0: .res 2
CONSTANT_1 = 1

.macro u16_set_from_constant setvar, constant
.if (.sizeof(constant) == 1)
    lda #<constant
    sta setvar+0
    lda #$00
    sta setvar+1
.elseif (.sizeof(constant) == 2)
    lda #<constant
    sta setvar+0
    lda #>constant
    sta setvar+1
.else
    .error "constant is out of range"
.endif
.endmacro

; macro called from some unit test subroutine
u16_set_from_constant tempw0, CONSTANT_1



This one worked fine, somewhat to my surprise. I expected using ">" with a constant that wouldn't logically be a 2 byte value to result in a compile time error or some garbage values at runtime, but it works.

Code:
.macro u16_set_from_constant setvar, constant
    lda #<constant
    sta setvar+0
    lda #>constant
    sta setvar+1
.endmacro


So I guess in this case ca65 knows that there is no "value" in the "high byte" and generates a value of $00 for the immediate mode operation. Maybe that should have been obvious to me, but I thought I was telling it to do something dumb and expected it to just let me shoot myself in the foot. But it works! :beer:


PostPosted: Wed Oct 17, 2018 7:38 pm 

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6948
Location: Canada
All of ca65's expressions have 32-bit size, so they always have a 2nd byte accessible by >. (They also have a 3rd and 4th byte. .bankbyte gets the 3rd byte.) Your last macro is IMO the "correct" way to do this.
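To see why > on a small constant just gives $00, here is a rough Python equivalent of the byte extraction. The helper names mirror the ca65 operators but are plain Python functions written for this example.

```python
# Illustrative sketch: byte extraction from a 32-bit expression value.
# The helpers mirror the ca65 operators but are ordinary Python functions.


def lobyte(value: int) -> int:    # like `<`
    return value & 0xFF


def hibyte(value: int) -> int:    # like `>`
    return (value >> 8) & 0xFF


def bankbyte(value: int) -> int:  # like `.bankbyte`
    return (value >> 16) & 0xFF


for constant in (1, 0x1234, 0x123456):
    print(f"${constant:06X}: lo=${lobyte(constant):02X} "
          f"hi=${hibyte(constant):02X} bank=${bankbyte(constant):02X}")
```

The constant 1 has nothing set in bits 8 to 15, so its "high byte" is simply zero, which is what the last macro relies on.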

The macro system is a bit weird in a lot of ways. Notably it's not a preprocessor; the macro gets tokenized and that stream of tokens is substituted instead, and there are a lot of strange restrictions about what needs to be known when. (Also be careful about parentheses in macros, since those have some additional meaning in the assembly context. {} can sometimes be used instead. Some examples here.)

The .sizeof directive doesn't really intuitively map to how it works in C, and probably doesn't work on constants at all.

I'm not sure what's going on with the first attempt though. (The error isn't even necessarily about one of the lines with "constant" in it. I personally find it really hard to diagnose problems with ca65 macros; usually involves a lot of removing things until you find the line that caused the error.)


PostPosted: Thu Oct 18, 2018 5:41 am 

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20764
Location: NE Indiana, USA (NTSC)
samophlange wrote:
constants don't have a "size" that you can determine apart from their value.

In some type systems, they do. In the C language, (uint8_t)5 and (uint64_t)5 have different sizes.
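The same point can be checked from Python through the standard ctypes module, which exposes C's fixed-width integer types. This is just a quick illustration of "same value, different size"; the variable names are arbitrary.

```python
import ctypes

# Same numeric value, different storage sizes: the type carries the size.
small = ctypes.c_uint8(5)
large = ctypes.c_uint64(5)

print(small.value == large.value)                  # the values compare equal
print(ctypes.sizeof(small), ctypes.sizeof(large))  # sizes in bytes
```

An assembler constant has no such type attached, which is why its "size" can only be inferred from its value.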

As rainwarrior mentioned, .sizeof doesn't do what you expect. .addrsize, available in fairly recent ca65, might be closer. There are also some peculiar scope resolution restrictions for accessing top-level constants within a named scope (.scope or .proc): you'll often need to use ::symbol to force use of the top-level symbol rather than a shadowing symbol of the same name that ca65 thinks may be defined later in the scope.

