Since my game requires quite a lot of variables, there wouldn't be enough zeropage space to declare a distinct global variable for everything that is only used inside a single function. (Or in the case of C: declaring local static variables.)
So, if I neither want to use local variables on the stack, nor global non-zeropage variables, I have to find a way to reuse some zeropage variables.
My current plan is this:
I declare a bunch of general-purpose variables, but in several "layers" or "levels": A, B, C etc.:
If a function needs local variables and it either doesn't call any other function or it only calls other functions that don't need local variables, this function would use the A variables. Also, the name of the function would get a postfix:
ProcessHero_vA
--> uses A variables
If a function calls another function that already has the A postfix, then this new function uses the B variables and gets a B postfix:
ProcessAllCharacters_vB
--> uses B variables
--> calls ProcessHero_vA
If a function doesn't use local variables itself, but calls other functions that do, then this function of course always gets the highest postfix as well, because those temporary variables are used inside this function via the other functions:
ProcessGameLoop_vB
--> uses no variables
--> calls ProcessAllCharacters_vB
--> Therefore, ProcessGameLoop needs to get the _vB postfix because the B variables are in use while ProcessGameLoop is running (via ProcessAllCharacters), so the function that calls ProcessGameLoop couldn't use the B variables itself, but would have to use the C variables.
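To make the layering concrete, here is a minimal C sketch of the three functions above. The variable names AByte2 and BByte2, the parameters of ProcessHero_vA, and the arithmetic are made up for illustration; plain globals stand in for zeropage bytes so the sketch runs anywhere.

```c
#include <stdint.h>

/* Stand-ins for zeropage bytes (hypothetical names beyond AByte1).
   On the real target these would live in zeropage. */
uint8_t AByte1, AByte2;   /* layer A */
uint8_t BByte1, BByte2;   /* layer B */

/* Needs locals and calls nothing that needs locals: layer A. */
uint8_t ProcessHero_vA(uint8_t x, uint8_t speed)
{
    AByte1 = x;
    AByte2 = speed;
    AByte1 = (uint8_t)(AByte1 + AByte2);
    return AByte1;
}

/* Needs locals and calls a _vA function: layer B. */
uint8_t ProcessAllCharacters_vB(void)
{
    BByte2 = 0;
    for (BByte1 = 0; BByte1 < 3; ++BByte1) {
        /* The callee clobbers AByte1/AByte2 but never touches layer B,
           so the loop counter and the running total survive the call. */
        BByte2 = (uint8_t)(BByte2 + ProcessHero_vA(BByte1, 2));
    }
    return BByte2;
}

/* No locals of its own, but layer B is live while it runs,
   so it inherits the _vB postfix. */
uint8_t ProcessGameLoop_vB(void)
{
    return ProcessAllCharacters_vB();
}
```

If ProcessAllCharacters had used AByte1 as its loop counter instead, every call to ProcessHero_vA would have overwritten it, which is exactly the bug the postfix is meant to make visible.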
In the actual code, I would of course rename the variables. I wouldn't work with AByte1 etc. directly:
#define currentDirection_ AByte1
(The name gets an underscore as a postfix so that there cannot be any name clash or compiler issue if you happen to have an actual local static variable named currentDirection in your code, since #define replaces every instance of its name without regard for context.
In Assembly itself, the underscore wouldn't be necessary because this:
currentDirection: .res 1
currentDirection = AByte1
would be an assembler error anyway.)
The layer system exists so that functions that call each other cannot accidentally overwrite each other's values:
If you have a function DoSomething_vA and you suddenly find out that this function needs to call another function that also has the _vA postfix, then you have to rename DoSomething_vA to DoSomething_vB and use the B values instead of the A values:
#define currentDirection_ AByte1
becomes
#define currentDirection_ BByte1
Then you can let the compiler find all instances of the old name DoSomething_vA, so that you can rename these as well as any other function that calls DoSomething to reflect the new layer of variables.
So, that's my attempt to use zeropage variables as reusable local variables.
Do you know of any alternate way that might be better?
This works well for most cases, but you still have to be careful if nested functions are called from multiple places. To completely avoid overlaps in this case, you'd probably have to make ALL functions that can possibly be called before the deepest one use the same memory counter, and if there are a lot of them, you might end up wasting a lot of ZP space for this.
I still haven't found the "ideal" solution for this, but what I have now allows for decent automatic reuse of RAM for locals and parameters, and also for program modules that don't run concurrently.
I think I'm sloppy on subroutine use. The only times I use subroutines are when I'm trying to loop over a huge chunk of code, or when I happen to find duplicated code in my game. It's hard to know what type of stuff I need to reuse.
calima wrote: I'd probably make a program that calculates every possible execution tree, then passes that to a register allocation algorithm.
Sounds like a lot of work, but that should produce the optimal results.
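A much simpler stand-in for that idea (not a real register allocator) is to let a small tool derive the A/B/C layer of each function from the call graph, using the rule from the first post. The struct layout and the table below are hypothetical and only cover the three example functions; the sketch assumes there is no recursion.

```c
#define MAX_CALLEES 4

/* Hypothetical description of one function in the call graph. */
struct Func {
    const char *name;
    int usesLocals;             /* 1 if it needs temporaries of its own */
    int callees[MAX_CALLEES];   /* indices into the table, -1 = unused  */
};

/* Returns 0 for "no layer needed", 1 for A, 2 for B, and so on.
   Assumes the call graph contains no recursion. */
int LayerOf(const struct Func *table, int index)
{
    int deepest = 0;
    int i;
    for (i = 0; i < MAX_CALLEES; ++i) {
        int callee = table[index].callees[i];
        if (callee >= 0) {
            int layer = LayerOf(table, callee);
            if (layer > deepest) {
                deepest = layer;
            }
        }
    }
    /* A function with locals sits one layer above its deepest callee;
       a function without locals just inherits that layer. */
    return table[index].usesLocals ? deepest + 1 : deepest;
}

/* The three functions from the first post. */
const struct Func exampleGraph[] = {
    { "ProcessHero",          1, { -1, -1, -1, -1 } },
    { "ProcessAllCharacters", 1, {  0, -1, -1, -1 } },
    { "ProcessGameLoop",      0, {  1, -1, -1, -1 } },
};
```

This only automates the postfix bookkeeping. It does not try to pack variables as tightly as a real allocation algorithm would.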
psycopathicteen wrote: I've been programming in ASM for a long time, and I don't even know how I do it myself.
To be fair, you're an SNES programmer, so you don't have to worry as much about RAM as us NES guys.
Quote: I think if I write the outer routine first, the inner routine uses whatever temporary variables are left, and vice versa.
The problem is that doing this manually is very tedious and error-prone. You may need to add a variable or a parameter to a function you coded long ago and not realize you have to change all functions that that function calls, and the ones that those call, and so on. It's maintenance hell.
My solution was basically to share a memory counter (managed by macros) between all functions that can run concurrently, so that overlaps among them are completely avoided. Things can get weird when the same functions are used in multiple nesting scenarios, though: functions that don't run concurrently end up having to share a memory counter because they all use one or more common functions. Thankfully that hasn't happened a lot to me yet. If things got really ugly I'd even consider duplicating the common functions to reduce the number of functions using the same memory counter.
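The actual macros aren't shown here, so the following is only a rough C approximation of the memory-counter idea, with chained enum constants playing the role of the counter. All names (localPool, HERO_X, TITLE_TIMER and so on) are invented for the sketch. Callees are allocated first, and each caller starts where its deepest callee ends.

```c
#include <stdint.h>

/* Shared pool standing in for a block of zeropage. */
uint8_t localPool[16];

enum {
    /* Innermost function: starts the counter at 0. */
    HERO_BASE      = 0,
    HERO_X         = HERO_BASE + 0,
    HERO_SPEED     = HERO_BASE + 1,
    HERO_END       = HERO_BASE + 2,

    /* Calls ProcessHero, so its locals start after the hero's. */
    ALLCHARS_BASE  = HERO_END,
    ALLCHARS_INDEX = ALLCHARS_BASE + 0,
    ALLCHARS_END   = ALLCHARS_BASE + 1,

    /* Never runs at the same time as the two above,
       so its counter can restart at 0 and reuse the same bytes. */
    TITLE_BASE     = 0,
    TITLE_TIMER    = TITLE_BASE + 0,
    TITLE_END      = TITLE_BASE + 1
};

void ProcessHero(void)
{
    localPool[HERO_X] = 10;
    localPool[HERO_SPEED] = 2;
    localPool[HERO_X] = (uint8_t)(localPool[HERO_X] + localPool[HERO_SPEED]);
}

uint8_t ProcessAllCharacters(void)
{
    for (localPool[ALLCHARS_INDEX] = 0;
         localPool[ALLCHARS_INDEX] < 3;
         ++localPool[ALLCHARS_INDEX]) {
        ProcessHero();   /* may clobber offsets 0 and 1, never offset 2 */
    }
    return localPool[ALLCHARS_INDEX];
}
```

The weak spot described above shows up as soon as ProcessHero is also called from the title screen code: TITLE_BASE could then no longer restart at 0 and would have to join the same chain.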
You can propose some pre-emptive layered structure like DRW has here, but I think the more generic (and maybe more applicable) idea is just that your temporary usage can have groups. Your code will naturally have some structured relationship between functions, and you can group your temporary usage correspondingly, as well as you can fit it to the existing relationships. E.g. this group of functions will mostly use temporaries i,j,k,l and this other one will mostly use m,n,o,p...
There's back and forth here, but when considering creating new structures to group your code, also try to find ones that already exist inherently.
...and it could of course be automated, and you could create code analysis tools and call graphs etc. but the scale of the problem never seems to fit the scale of that solution, to me. YMMV.
I don't do much 6502 at the moment, so I haven't gotten a chance to try it out, but an idea I like (inspired by many of the comments on that thread) is to just use the first n bytes of zero page roughly the way the standard MIPS calling convention uses its registers, with some tweaks for 6502 suitability. Feel free to use some other standard ISA's calling convention as well. The big thing is to reserve some number of bytes as temporaries that aren't preserved across subroutine calls, some number of bytes as saved temporaries that are, some number of bytes as kernel temporaries (for ISRs, so they don't need to save anything besides registers), designated bytes for function parameters (that don't fit in A, X, and Y), function return values (say, pointers that don't just fit in A), and a pointer for a virtual stack to supplement the hardware one.
The approach you're outlining in your post sort of resembles register windows, so if you want to go down that road it might be worth looking into the SPARC calling convention as well.
Code:
a0 - a5 : argument registers
t0 - t9 : temporary registers
s0 - s5 : save registers
v0 - v1 : return value registers (or more arguments)
i0 - i1 : temporary registers (for interrupts)
The s registers must be pushed to the stack before being used.
The i registers are only used in interrupt handlers.
Then I assign some local name one of those aliases:
Code:
; ======================================
__loop_var = s0
__src_addr = a0
__special_val = v0

SomeFunction:
    preserve __loop_var
    ldx #(SIZE - 1)
-   stx __loop_var
    lda table_hi, X
    sta __src_addr + 1
    lda table_lo, X
    sta __src_addr
    jsr SomeOtherFunc
    jsr BlahBlah
    jsr FuncThatReturnsAValue
    lda __special_val
    jsr DoSomethingWithIt
    ldx __loop_var
    dex
    bpl -
    restore __loop_var
    rts

; ======================================
__dest_addr = t0

SomeOtherFunc:
    setpointer __dest_addr #SomeRamAddress
    ldy #0
-   lda (__src_addr), Y
    sta (__dest_addr), Y
    iny
    bne -
    rts
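The preserve and restore macros aren't defined in the listing. Assuming they simply push the register to a stack and pull it back, the contract for the s registers can be modelled in C like this (PRESERVE, RESTORE, saveStack, Inner and Outer are all hypothetical names for the sketch):

```c
#include <stdint.h>

uint8_t s0;   /* one "save" register from the s0 - s5 group */
uint8_t v0;   /* one return value register                   */

/* Stand-in for the stack the macros are assumed to use. */
uint8_t saveStack[32];
uint8_t saveStackTop = 0;

#define PRESERVE(reg) (saveStack[saveStackTop++] = (reg))
#define RESTORE(reg)  ((reg) = saveStack[--saveStackTop])

/* Callee: wants s0 as its own loop counter, so it saves it first. */
void Inner(void)
{
    PRESERVE(s0);
    for (s0 = 0; s0 < 5; ++s0) {
        /* work that needs a counter */
    }
    RESTORE(s0);
}

/* Caller: also loops on s0 and relies on it surviving the call. */
uint8_t Outer(void)
{
    PRESERVE(s0);
    v0 = 0;
    for (s0 = 3; s0 != 0; --s0) {
        Inner();   /* s0 is still the caller's value afterwards */
        ++v0;
    }
    RESTORE(s0);
    return v0;
}
```

Without the PRESERVE/RESTORE pair in Inner, the caller's counter would come back as 5 after every call and the outer loop would never terminate, which is why the s group needs the push while the t group does not.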
With that in mind, this desire to manage temporaries on the ZP is merely one possible optimization. If you find you have conflicts to resolve, and for some reason it seems really painful to manage your ZP temporaries, there's almost always an option to do something "easier" that's only a little less efficient. If you can't find an allocation in your existing temporaries, you might put the variable on the stack, or maybe somewhere in RAM, or even just add one more ZP temporary, presuming you haven't budgeted 100% of it yet. Once you relax the requirement that it go into one of your already-allocated ZP temporaries there are really a lot of possibilities; for example consider that a 256-byte OAM buffer in RAM might be rebuilt every frame, which is quite a bit of temporary space available for anything that happens before you build your sprite list.
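As a rough illustration of the OAM buffer trick, here is a small C sketch. The function names, the scratch alias, and the contents of the temporary table are invented; the only assumption carried over from the post is that the whole buffer is rebuilt every frame, after the game logic has run.

```c
#include <stdint.h>
#include <string.h>

/* 64 sprites x 4 bytes (Y, tile, attributes, X), rebuilt every frame. */
uint8_t oamBuffer[256];

/* Hypothetical alias: only valid before BuildSpriteList runs in a frame. */
#define scratch oamBuffer

uint8_t heroY;

void UpdateGameLogic(void)
{
    uint8_t i;
    /* Borrow the first 8 bytes of the buffer as a temporary table. */
    for (i = 0; i < 8; ++i) {
        scratch[i] = (uint8_t)(i * i);
    }
    heroY = scratch[7];   /* keep only the result that outlives the table */
}

void BuildSpriteList(void)
{
    /* Rebuild everything: Y = 0xFF puts a sprite off screen on the NES. */
    memset(oamBuffer, 0xFF, sizeof oamBuffer);
    oamBuffer[0] = heroY;  /* sprite 0: Y          */
    oamBuffer[1] = 0x01;   /* sprite 0: tile       */
    oamBuffer[2] = 0x00;   /* sprite 0: attributes */
    oamBuffer[3] = 0x80;   /* sprite 0: X          */
}

void RunFrame(void)
{
    UpdateGameLogic();   /* free to trash the buffer         */
    BuildSpriteList();   /* overwrites all 256 bytes anyway  */
}
```

The catch is ordering: nothing that runs after BuildSpriteList in the same frame may use the scratch alias, and the buffer must be fully rebuilt before it is sent to the PPU.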
There are always many ways to do this, and the constraints aren't nearly as pressing when you don't insist on always doing it the fastest way. Spending a few more cycles using RAM instead of ZP may well be worth avoiding some organizational headache.
rainwarrior wrote: a 256-byte OAM buffer in RAM might be rebuilt every frame, which is quite a bit of temporary space available for anything that happens before you build your sprite list
I never considered this possibility. I'll have to keep this in mind if I find myself short on temporary storage.
never-obsolete wrote: I never considered this possibility. I'll have to keep this in mind if I find myself short on temporary storage.
Wow, I didn't think about that either.