All times are UTC - 7 hours





PostPosted: Fri Oct 05, 2018 12:16 pm 
Joined: Tue Aug 28, 2018 8:54 am
Posts: 58
Location: Edmonton, Canada
So I'm trying to wrap my head around how to use local variables efficiently without making mistakes.

It doesn't really matter whether it's assembler or C, because cc65 does not encourage using local stack variables anyway. I'm not saying I really like x86, but arbitrary stack access on the 6502 would be nice... or at least a fast stack.

Anyway, from what I've learned and seen, regardless of the compiler or assembler, the only sane way is to use global variables.

Pseudo code example
Code:
foo1():
   use tmp1, tmp2
   call bar1():
      use tmp3, tmp4
      call foo2():
         use tmp1
   use tmp1 // !! And now I have garbled data in tmp1 !!


So now I really have two options:

1. Fast and dangerous - be careful with local variables and manage memory manually, making sure that variables and registers don't get clobbered
2. Slow and more comfortable - use the hardware stack, which is so-so for the A register but quite slow for anything else, as it would require lda mem, pha, pla, sta mem for every variable

Am I correct? Should I take the hardware stack approach, or rather forget about it, just try to be careful, and spend wonderful hours of happy debugging?

I have ~30000 cycles per frame, and this number makes me uncomfortable every time I add a new line of code.


PostPosted: Fri Oct 05, 2018 12:30 pm 
Joined: Sun Apr 13, 2008 11:12 am
Posts: 7714
Location: Seattle
Maybe use a macro to trace overlay usage and issue a warning on collision?


PostPosted: Fri Oct 05, 2018 12:51 pm 
Joined: Tue Aug 28, 2018 8:54 am
Posts: 58
Location: Edmonton, Canada
I'm not sure it's sustainable. I would have to declare functions in order of execution, and it will not work for a library function that can be called from anywhere.

At this point I am not looking for a silver bullet, just for the main approach so I don't waste time reinventing the wheel :)


PostPosted: Fri Oct 05, 2018 12:53 pm 
Joined: Sun Apr 13, 2008 11:12 am
Posts: 7714
Location: Seattle
The simplest, stupidest option is to give every function a non-overlapping static allocation. Don't worry about the logistics of overlays; after all, that's where you're getting tripped up.
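
To make the idea concrete, here is a minimal sketch in C. It assumes the hypothetical routine names foo1, bar1 and foo2 from the pseudocode in the first post; the values are arbitrary. Every routine owns its scratch bytes as static locals, so a nested call cannot clobber the caller's tmp1. The cost is RAM, since nothing is ever shared.

```c
#include <assert.h>

/* foo2 has its own tmp1; it is a different byte from foo1's tmp1. */
static unsigned char foo2(void) {
    static unsigned char tmp1;
    tmp1 = 7;
    return tmp1;
}

static unsigned char bar1(void) {
    static unsigned char tmp3, tmp4;
    tmp3 = 3;
    tmp4 = foo2();                      /* nested call uses only foo2's bytes */
    return (unsigned char)(tmp3 + tmp4);
}

unsigned char foo1(void) {
    static unsigned char tmp1, tmp2;
    tmp1 = 10;
    tmp2 = bar1();                      /* foo1's tmp1 survives this call */
    return (unsigned char)(tmp1 + tmp2);
}
```

If foo1 and foo2 shared a single global tmp1, foo1 would see 7 instead of 10 after the call. In assembler the same layout is just one separate reservation per routine.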


PostPosted: Fri Oct 05, 2018 2:46 pm 
Joined: Tue Apr 04, 2017 1:22 pm
Posts: 36
Location: Ohio, USA
You could simulate some near-arbitrary stack access.
Code:
; Declare locals as offsets from the top of the stack
varFoo = 0
varBar = 1

; Init locals in reverse order
  LDA #$00
  PHA ; init varBar
  PHA ; init varFoo

; Macro to load a given variable into A.
; S points at the next free byte, so the last byte pushed is at $0101+S.
.macro getVar i
  TSX
  LDA $0101+i, X
.endmacro

getVar varFoo
getVar varBar

_________________
http://zutanogames.com/ <-- my dev blog


PostPosted: Fri Oct 05, 2018 7:31 pm 
Joined: Wed Nov 30, 2016 4:45 pm
Posts: 118
Location: Southern California
Quote:
2. Slow and more comfortable - use the hardware stack, which is so-so for the A register but quite slow for anything else, as it would require lda mem, pha, pla, sta mem for every variable

You don't need to constantly be pulling stuff off the stack and putting it back on. Nor are you limited to just the byte(s) at the top of the stack. I address this in section 14 of my 6502 stacks treatise, "Local variables & environments." The following section, section 15, is on recursion. Note that these build on material laid out in earlier sections.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


PostPosted: Sat Oct 06, 2018 12:37 am 
Joined: Tue Aug 28, 2018 8:54 am
Posts: 58
Location: Edmonton, Canada
lidnariq wrote:
The simplest, stupidest option is to give every function a non-overlapping static allocation. Don't worry about the logistics of overlays; after all, that's where you're getting tripped up.


Fair enough. It's just the limited amount of memory that bothers me. The fear of going over the limit and then spending hours rewriting forces me to look for other methods.

Garth wrote:
You don't need to constantly be pulling stuff off the stack and putting it back on. Nor are you limited to just the byte(s) at the top of the stack. I address this in section 14 of my 6502 stacks treatise, "Local variables & environments." The following section, section 15, is on recursion. Note that these build on material laid out in earlier sections.


Thank you Garth, it was a very nice read (and I should probably read other chapters). I will play with it tomorrow.


PostPosted: Sat Oct 06, 2018 11:04 pm 
Joined: Tue Feb 07, 2017 2:03 am
Posts: 621
yaros wrote:
It doesn't really matter whether it's assembler or C, because cc65 does not encourage using local stack variables anyway. I'm not saying I really like x86, but arbitrary stack access on the 6502 would be nice... or at least a fast stack.


65816 and a SNES has you covered ;)

I thought cc65 used its own stack for such things, making it even slower though?

But when one uses C on a 65(X)XX, one expects it to be slow and poor. There is no free lunch: you either accept that C is poor for the job or don't use it. However, being in C and having a DEBUG build, you can add asserts to check that things don't get trashed.

Make a debug local variable:
Code:
#if _DEBUG
byte testTmp1 = tmp1;
#endif
foo2();
#if _DEBUG
ASSERT(testTmp1 == tmp1, "Tmp1 trashed assign new temp");
#endif
So when you run a debug build, you test to make sure weird stuff doesn't happen, and then in, say, a Development and Release build you skip the tests to get your speed and RAM back. You might need custom flags per "file" or section to keep the RAM overhead under control. Or, say, give the Debug build an 8K RAM expansion (like a 'dev kit') that you can allocate all the extra debug variables into.

I kind of do this in ASM, only I use a testing application to do the testing for me rather than doing it in my running code.
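
A slightly more complete sketch of the same debug-build check, in plain C. The macro names GUARD_SAVE and GUARD_CHECK, the DEBUG_GUARDS switch and the failure counter are all made up for illustration; the snippet above uses an ASSERT macro instead of a counter. With DEBUG_GUARDS set to 0 both macros expand to nothing, so the extra byte and the comparison disappear from the build.

```c
#include <assert.h>

#define DEBUG_GUARDS 1              /* hypothetical build switch */

static unsigned char tmp1;          /* the shared temporary being protected */
static unsigned guard_failures;     /* incremented on every detected clobber */

#if DEBUG_GUARDS
#define GUARD_SAVE(var)  unsigned char saved_##var = (var)
#define GUARD_CHECK(var) do { if (saved_##var != (var)) ++guard_failures; } while (0)
#else
#define GUARD_SAVE(var)  ((void)0)
#define GUARD_CHECK(var) ((void)0)
#endif

static void well_behaved(void)  { /* does not touch tmp1 */ }
static void clobbers_tmp1(void) { tmp1 = 0; }

/* Runs one good and one bad callee and reports how many clobbers were seen. */
unsigned count_violations(void) {
    guard_failures = 0;
    tmp1 = 42;
    {
        GUARD_SAVE(tmp1);
        well_behaved();
        GUARD_CHECK(tmp1);          /* tmp1 is still 42: no failure */
        clobbers_tmp1();
        GUARD_CHECK(tmp1);          /* tmp1 was overwritten: one failure */
    }
    return guard_failures;
}
```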


PostPosted: Sun Oct 07, 2018 7:21 am 
Joined: Mon Jan 03, 2005 10:36 am
Posts: 3135
Location: Tampere, Finland
I'll always go for the "fast and dangerous" option. Although it would be nice if there was some compiler support to make it easier.

So far what I've done is to manage everything manually and set up some Lua scripts (= no overhead) to check at runtime for violations (same variable used from more than one place). In other words, I allocate/free each local variable in procedures (Lua code keeps track of which memory is in use), and an error is generated during allocation if an overlapping allocation of the same memory happens. Example: https://github.com/fo-fo/ngin/blob/mast ... yer.s#L242
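
The bookkeeping itself can be modelled in a few lines of C. This is only a sketch of the idea and not the Lua script linked above: local_alloc, local_free and SCRATCH_SIZE are hypothetical names, and the scratch area is assumed to be a small block of shared temporaries addressed by offset.

```c
#include <assert.h>

#define SCRATCH_SIZE 16u            /* hypothetical size of the shared scratch area */

static unsigned char in_use[SCRATCH_SIZE];   /* 1 = byte is currently claimed */

/* True if [offset, offset + size) lies inside the scratch area. */
static int range_ok(unsigned offset, unsigned size) {
    return offset <= SCRATCH_SIZE && size <= SCRATCH_SIZE - offset;
}

/* Claims a range. Returns 1 on success, 0 on a collision or a bad range. */
int local_alloc(unsigned offset, unsigned size) {
    unsigned i;
    if (!range_ok(offset, size)) return 0;
    for (i = 0; i < size; ++i)
        if (in_use[offset + i]) return 0;    /* someone else still owns this byte */
    for (i = 0; i < size; ++i)
        in_use[offset + i] = 1;
    return 1;
}

/* Releases a range previously claimed with local_alloc. */
void local_free(unsigned offset, unsigned size) {
    unsigned i;
    if (!range_ok(offset, size)) return;
    for (i = 0; i < size; ++i)
        in_use[offset + i] = 0;
}
```

A procedure would call local_alloc on entry and local_free on exit; a 0 from local_alloc means a caller further up the chain is still using that memory.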

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi


PostPosted: Tue Oct 09, 2018 12:22 pm 
Joined: Tue Aug 28, 2018 8:54 am
Posts: 58
Location: Edmonton, Canada
thefox wrote:
I'll always go for the "fast and dangerous" option. Although it would be nice if there was some compiler support to make it easier.


It works just fine when you have variables allocated for just one function or for first-class subroutines. But something like add_sprite, which can be called from anywhere, is hard to keep track of manually.

thefox wrote:
So far what I've done is to manage everything manually and set up some Lua scripts (= no overhead) to check at runtime for violations


It looks interesting, and I like the idea (although I've only spent a couple of minutes on it and I'm not sure how it works just yet). But I'm looking for an approach that doesn't depend on an emulator.


Oziphantom wrote:
I thought cc65 used its own stack for such things, making it even slower though?


That's correct. cc65 uses a software stack and keeps the stack pointer in zero page. Every local variable (or parameter that doesn't fit into A and X) is accessed with lda/sta (sp),y. Garth's approach with the hardware stack looks way cleaner and faster.

Oziphantom wrote:
But when one uses C on a 65(X)XX, one expects it to be slow and poor. There is no free lunch: you either accept that C is poor for the job or don't use it. However, being in C and having a DEBUG build, you can add asserts to check that things don't get trashed.


Yeah, I just tried to port my test pong to cc65 as is, and I can't even fit it in VBlank with very little code. I'll probably go back to ca65. My question is still relevant, as even in C you still need to hack your way to fast variables.

Oziphantom wrote:
65816 and a SNES has you covered ;)


Yeah, after a quick look over http://6502org.wikidot.com/software-658 ... s-on-stack it feels nicer.


PostPosted: Tue Oct 09, 2018 5:29 pm 
Joined: Tue Jun 24, 2008 8:38 pm
Posts: 2035
Location: Fukuoka, Japan
I think C can be quite fast if you know how to "tame" it. You have to avoid some structures and function parameters, use a lot of globals, and know when to use asm for tight spots.

For non-intensive code like the intro, title, etc., I even have a C callback in NMI with no issue at all, so it all depends on how you (ab)use it :lol:


PostPosted: Tue Oct 09, 2018 5:40 pm 
Joined: Tue Aug 28, 2018 8:54 am
Posts: 58
Location: Edmonton, Canada
Banshaku wrote:
I think C can be quite fast if you know how to "tame" it. You have to avoid some structures and function parameters, use a lot of globals, and know when to use asm for tight spots.

For non-intensive code like the intro, title, etc., I even have a C callback in NMI with no issue at all, so it all depends on how you (ab)use it :lol:


I have no doubt about this. It's just that after trying C on the NES I felt I had to cut too many corners, and I would be more comfortable with assembler. Maybe later, when I am more comfortable with the platform, I'll try again.


PostPosted: Tue Oct 09, 2018 6:50 pm 
Joined: Fri Nov 24, 2017 2:40 pm
Posts: 85
On my last project, I did pretty well with 3 zero page globals mostly used for loops/indexing (idx, ix, iy). I rarely called functions from a loop, so it wasn't much of a problem. Most of the remaining variables were static locals, and stack locals were really only used in cold code like init functions and such. That's yet another reason why I recommend auditing the generated assembly often. It's definitely possible to get halfway decent assembly out of cc65 if you are careful.


PostPosted: Sun Oct 28, 2018 4:18 pm 
Joined: Tue Aug 28, 2018 8:54 am
Posts: 58
Location: Edmonton, Canada
Thanks everyone for the suggestions. Here is what I did for my project: when I don't need speed, I use the hardware stack in a way similar to what Garth suggested in his article.

There are 4 macros:

1. stackalloc identifier, [size] - allocates size bytes on the stack
2. stackfree - deallocates all local variables
3. stackparam identifier - creates offsets for the parameters pushed on the stack
4. stackcall identifier, param1, [param2...param6] - pushes up to 6 parameters on the stack (in reverse order), calls the subroutine, and cleans up the stack after the call

Here is an example

Code:
.proc test
    stackalloc local1
    stackalloc local2
    stackparam param1
    stackparam param2
    tsx

    lda param1,x
    clc
    adc #5
    sta local1,x
    lda param2,x
    clc
    adc #5
    sta local2,x

    stackfree
    rts
.endproc

.proc main
    stackcall test, #1, #2
    jmp main
.endproc


And lib code

Code:
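.macro stackalloc ident, size
    .ifdef _stackParamPos
        .error "stackalloc must be called before stackparam"
    .endif
    .if .blank ({size})
        stackalloc {ident}, 1
        .exitmac
    .endif

    ; The new variable starts right above the bytes allocated so far
    .ifdef _stackVarCount
        ident := $101 + _stackVarCount
        _stackVarCount .set _stackVarCount + size
    .else
        ident := $101
        _stackVarCount .set size
    .endif
    ; Address of the last allocated byte; stackparam adds 3 to skip the return address
    _stackVarPos .set $100 + _stackVarCount
    ; Reserve one stack byte per byte of the variable so stackfree stays balanced
    .repeat size
        pha
    .endrepeat
.endmacro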

.macro stackfree
    .ifndef _stackVarCount
        .error "stackalloc was not used"
    .endif
    .repeat _stackVarCount
        pla
    .endrepeat
.endmacro

.macro stackparam ident
    .ifdef _stackParamPos
        _stackParamPos .set (_stackParamPos) + 1
    .else
        .ifdef _stackVarPos
            _stackParamPos .set (_stackVarPos) + 3 ; to preserve return address
        .else
            _stackParamPos .set $103
        .endif
    .endif
    ident := _stackParamPos
.endmacro

.macro stackcall ident, p1, p2, p3, p4, p5, p6
    ; I don't know how to make .repeat with decreasing numbers
    .ifnblank p6
        pusha p6
    .endif
    .ifnblank p5
        pusha p5
    .endif
    .ifnblank p4
        pusha p4
    .endif
    .ifnblank p3
        pusha p3
    .endif
    .ifnblank p2
        pusha p2
    .endif
    .ifnblank p1
        pusha p1
    .endif
    jsr ident
    .ifnblank p6
        pla
    .endif
    .ifnblank p5
        pla
    .endif
    .ifnblank p4
        pla
    .endif
    .ifnblank p3
        pla
    .endif
    .ifnblank p2
        pla
    .endif
    .ifnblank p1
        pla
    .endif
.endmacro
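
To double-check the offsets the macros produce, here is a small C model of the stack page for the test example above. It is a sketch under assumptions: ram, s, pha, pla and call_test are hypothetical names, the return address bytes are placeholders, and only the addressing arithmetic is modelled, not real 6502 execution.

```c
#include <assert.h>

static unsigned char ram[0x300];   /* pages 0-2; the stack lives in page 1 */
static unsigned char s = 0xFF;     /* stack pointer register S */

/* pha: store at $0100+S, then decrement S */
static void pha(unsigned char a) { ram[0x100 + s] = a; s = (unsigned char)(s - 1); }

/* pla: increment S, then load from $0100+S */
static unsigned char pla(void) { s = (unsigned char)(s + 1); return ram[0x100 + s]; }

/* Offsets the macros assign for two 1-byte locals and two parameters */
enum { LOCAL1 = 0x101, LOCAL2 = 0x102, PARAM1 = 0x105, PARAM2 = 0x106 };

struct Result { unsigned char local1, local2, s_after; };

/* Models `stackcall test, #p1, #p2` plus the body of `test`. */
struct Result call_test(unsigned char p1, unsigned char p2) {
    struct Result r;
    unsigned char x;

    pha(p2); pha(p1);              /* stackcall: parameters in reverse order */
    pha(0x80); pha(0x00);          /* jsr: placeholder return address, high byte first */
    pha(0); pha(0);                /* stackalloc local1, stackalloc local2 */
    x = s;                         /* tsx */

    ram[LOCAL1 + x] = (unsigned char)(ram[PARAM1 + x] + 5);   /* lda param1,x / adc #5 / sta local1,x */
    ram[LOCAL2 + x] = (unsigned char)(ram[PARAM2 + x] + 5);   /* lda param2,x / adc #5 / sta local2,x */
    r.local1 = ram[LOCAL1 + x];
    r.local2 = ram[LOCAL2 + x];

    (void)pla(); (void)pla();      /* stackfree */
    (void)pla(); (void)pla();      /* rts pulls the return address */
    (void)pla(); (void)pla();      /* stackcall cleanup */
    r.s_after = s;
    return r;
}
```

With two locals, the two return address bytes sit at $103 and $104 relative to X, which is why the first parameter lands at $105. The stack pointer ends where it started, so the pushes and pulls are balanced.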

