It is currently Sun Jun 24, 2018 11:07 pm

All times are UTC - 7 hours





Post new topic Reply to topic  [ 43 posts ]  Go to page Previous  1, 2, 3  Next
Author Message
PostPosted: Fri Jun 01, 2018 12:20 am 
Offline

Joined: Tue Feb 07, 2017 2:03 am
Posts: 449
I just make
ZPTemp1
ZPTemp2
...
IZPTemp1
IZPTemp2
and when needed
NZPTemp1
....

I could use unions to map them to sane names but that is where the problems arise, so I comment it instead. Although on a NES you would make NZPTemp1 a lot more often than IZPTemp1.
Code:
NLL_deleteNodeX
   ldy NLLCurrentList
   lda EntityX.head,y
   sta ZPTemp3            ; the current head's value
   lda EntityX.next,x
   sta ZPTemp1            ; new next
   dec EntityX.alloc_head
   ldy EntityX.alloc_head
   txa
   sta EntityX.alloc,y    ; mark this node as free
   stx ZPTemp2            ; one to remove
   cpx ZPTemp3            ; are we removing the head
   beq _removeHead
      ldy ZPTemp3         ; start at the head
   -  lda EntityX.next,y
      cmp ZPTemp2
      beq _foundPrev
         tay
         jmp -
_removeHead
   lda ZPTemp1            ; new next
   ldy NLLCurrentList
   sta EntityX.head,y
   rts
_foundPrev
   lda ZPtemp1            ; new next
   sta EntityX.next,y
   rts   

this means I can look at the code and see what it actually hits easily. I would be easier to do a union or defines up the top, but I trust the code more than my documentation up the top.
I did make a Static Analyzer for 6502 to solve the case for when I "trash" something, but I got too fancy with my coding style and broke it. I've not been bothered enough to fix it yet, as the maintenance on it was too high. Once I make my new Dev Environment I will look at bringing it back, or merging it into MLA.
I also (although not shown above) have BDD6502 "illegal ops" that allow me to check that A,X,Y,ZP Location remains unchanged across calls. So when I get to a case that I think I'm in danger or breaking the rules, I add in the BDD ops and then let it test if I trash something when I shouldn't.
But it is rarely an issue, every now and then I will see a strange bug, for some new code I wrote and then I will check to make sure the ZPTemps are not being trashed, if so I add another and carry on.


Top
 Profile  
 
PostPosted: Fri Jun 01, 2018 2:27 am 
Offline

Joined: Thu Aug 20, 2015 3:09 am
Posts: 372
This issue seems to pop up a lot, and there are known algorithmic solutions. For example, I wrote a compiler that statically allocates all locals in zero page using a virtual stack, overlapping frames that are never live at the same time. An even better option would be interprocedural graph coloring, but that's veering into overkill territory.

My compiler's frontend is nowhere near releaseworthy, but I've been thinking of hooking the backend up to an assembler in the meantime. Would anyone here be interested in such an "optimizing" assembler? I probably shouldn't offer to write one, given my current circumstances, but it might give me the kick I need to finish this stupid thing.


Top
 Profile  
 
PostPosted: Fri Jun 01, 2018 3:44 am 
Offline
User avatar

Joined: Thu Sep 15, 2016 6:29 am
Posts: 613
Location: Denmark (PAL)
never-obsolete wrote:
rainwarrior wrote:
a 256 OAM buffer in RAM might be rebuilt every frame, which is quite a bit of temporary space available for anything that happens before you build your sprite list


I never considered this possibility. I'll have to keep this in mind if I find myself short on temporary storage.


Me neither, however 256 bytes is quite a lot of temporary storage. I'm guessing it's mostly useful for some kind of unpacking function, but even then that seems wasteful if it's going to be gone by the next frame. Hell, a 256 Zero page is already big as it is, and I always set most of that aside for temporary data.
But if your sprite "rendering" routine isn't using any ZP variables, and always takes place at the tail end of your frame logic, I guess you might as well use the zeropage for your OAM buffer though?


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 7:38 am 
Offline
User avatar

Joined: Sat Sep 07, 2013 2:59 pm
Posts: 1637
tokumaru wrote:
I don't know about C, but in assembly I have a chunk of ZP dedicated to local variables and function parameters, and I use a macro to allocate the space needed for each function. The macro can start allocating space from the very first byte of the ZP chunk, or you can optionally specify an offset into the chunk. These offsets can be specified using what I call "memory counters", which are symbols that are incremented automatically when you allocate memory using them, so if I have a bunch of nested functions, I just have to use the same memory counter for all of them and their locals will never overlap.

So, am I understanding this correctly? You're basically using a stack.

That's actually what I wanted to avoid, so that the usage of the temporary variables uses the least possible ROM space and CPU time.
Using counters before and after every function and accessing the temporary variables in the LDA Temp, X manner takes away from the performance.
In this case, I think, I would be better off by simply using non-zeropage global variables.
Using a non-zeropage global variable should be faster and smaller than setting X every time and then accessing the variable with the offset.
And, at least in my case, the non-zeropage variables should be more than enough to give each function in the whole program its own distinct variables for local usage, so that reusing isn't even necessary.

calima wrote:
I'd probably make a program that calculates every possible execution tree, then passes that to a register allocation algorithm.

This would be the optimal solution, but this would require to write your own source code parser, wouldn't it?

rainwarrior wrote:
Your code will naturally have some structured relationship between functions, and you can group your temporary usage correspondingly as well as you can fit it to the existing relationships. i.e. this group of functions will mostly use temporaries i,j,k,l and this other one will mostly use m,n,o,p...

I think this is a sub-optimal solution because of the following reason:

Within the same group, you need to make sure that variables aren't reused between functions since they call each other.
However, if two functions belong to different groups, they can share the same variable.

So, to me it doesn't make sense that the functions to process the game's characters belong to the group that use i, j, k, l while the screen buildup functions use m, n, o, p.

Because MoveCharacter and CheckCollision should not both use i since MoveCharacter calls CheckCollision.
But there's no reason why MoveCharacter and DrawBackgroundTiles should not both use the variable i since those functions never overlap each other.

So, the grouping of functions and assigning variables to each group is a little backwards because reusability comes into play specifically when two functions are not related.

Use variable i for screen buildup, character processing and hardware sprite calculation since they are unrelated to each other.
But never share variable i between the functions MoveCharacter, CheckCollision and StartMercyInvincibility because these functions call each other.

(Unless you're talking about variables that need to be consistent between function calls like the mercy invincibility counter after the player was hit. But that's something completely different anyway since we're talking about variables that are used in a purely local way, not about variables that are intended as actual global variables.)

Rahsennor wrote:
Would anyone here be interested in such an "optimizing" assembler?

I would rather be interested in something that can turn local variables into shared global variables on the source code level, so that I can keep using the same compiler and assembler for my projects as before (cc65) and so that I can easily see every instance of the optimization process myself.
However, this would require that this is also done on C code, with each local variable and function parameter.

_________________
Available now: My game "City Trouble".
Website: https://megacatstudios.com/products/city-trouble
Trailer: https://youtu.be/IYXpP59qSxA
Gameplay: https://youtu.be/Eee0yurkIW4
German Retro Gamer article: http://i67.tinypic.com/345o108.jpg


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 9:46 am 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10529
Location: Rio de Janeiro - Brazil
DRW wrote:
So, am I understanding this correctly? You're basically using a stack.

It's sort of like a stack, but it's processed at assembly-time by macros, so as far as the final program is concerned, it's using static memory positions like it always has.


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 9:48 am 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20168
Location: NE Indiana, USA (NTSC)
And the advantage of this over a traditional stack is that 8-bit ISAs of that era (6502, 8080) provide more efficient addressing modes for a statically allocated stack than for a dynamically allocated stack. The disadvantage is lack of support for (non-tail) recursion, but 8-bit games on the whole use very little if any recursion.


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 9:57 am 
Offline
User avatar

Joined: Sat Sep 07, 2013 2:59 pm
Posts: 1637
tokumaru wrote:
It's sort of like a stack, but it's processed at assembly-time by macros, so as far as the final program is concerned, it's using static memory positions like it always has.

A compile/assembly time stack implemented with macro tricks would be perfect. But how does that work?
How do you determine the counter/index/offset of the current variable at compile/assembly time if you don't know when this function will be called?

I.e. if the main function uses three zeropage variables as locals and calls the function ABC, how does the Assembler know that the first used variable in function ABC should be mapped to $0004? After all, ABC could be called at any point in the program, for example by a function who uses six variables and who itself is called by another function who uses two variables, so this time the local variable in ABC would have to be at least at address $0009.

Can you please post an example?

_________________
Available now: My game "City Trouble".
Website: https://megacatstudios.com/products/city-trouble
Trailer: https://youtu.be/IYXpP59qSxA
Gameplay: https://youtu.be/Eee0yurkIW4
German Retro Gamer article: http://i67.tinypic.com/345o108.jpg


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 10:28 am 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6354
Location: Canada
DRW wrote:
rainwarrior wrote:
Your code will naturally have some structured relationship between functions, and you can group your temporary usage correspondingly as well as you can fit it to the existing relationships. i.e. this group of functions will mostly use temporaries i,j,k,l and this other one will mostly use m,n,o,p...

I think this is a sub-optimal solution because of the following reason:

Within the same group, you need to make sure that variables aren't reused between functions since they call each other.
However, if two functions belong to different groups, they can share the same variable.

So, to me it doesn't make sense that the functions to process the game's characters belong to the group that use i, j, k, l while the screen buildup functions use m, n, o, p.

Because MoveCharacter and CheckCollision should not both use i since MoveCharacter calls CheckCollision.
But there's no reason why MoveCharacter and DrawBackgroundTiles should not both use the variable i since those functions never overlap each other.

So, the grouping of functions and assigning variables to each group is a little backwards because reusability comes into play specifically when two functions are not related.

Use variable i for screen buildup, character processing and hardware sprite calculation since they are unrelated to each other.
But never share variable i between the functions MoveCharacter, CheckCollision and StartMercyInvincibility because these functions call each other.

I don't think you understand my suggestion. I was suggesting that you use groupings by temp variable needs to manage the temp variables. You will already have other groupings for other purposes (player code, collision code, sprite code, etc.) but that's a different and separate grouping, though it will undoubtedly be strongly related. You would probably be able to divide this management problem into independent sub-problems by that kind of type if they don't share the stack, but the groupings by variable would be in this sub-problem domain, right? There's not really a lot I can say here in the abstract while we're not talking about a specific code base with a specific set of problems to solve.

Like your suggestion was to fragment your variable groups sort of by stack depth, or some other global grouping. My feeling is that this kind of pre-fabricated structure will become an imposition; whatever size you make the layer bucket, most functions will underuse the bucket, and others will overflow it causing you to have to manage it in some other way. Doing this adds a global structure that will cross cut right through your code and have a big impact on it. There's a give and take there. This structure will provide some benefit, and some detriment, but...

On a large scale project, you can only predict some things ahead of time; a lot of stuff will eventually have to move and change. Having to maintain a pre-determined structure like that can make it harder. Agility is useful. One suggestion that was made above that I would strongly affirm is to use local aliases for the temporary variables in each function that uses them. If you declare them all at the start of the function, this has two strong effects: 1. all temporary usage is documented by that declaration, 2. the temporaries are very easy to reassign later. (ca65's .proc scoping is great for this.)

That's the main thing I was saying: structure is not really well known before you start, so I recommend trying to write in a way that is pliable and optimizable, and accept inefficiency in service to that goal, as least at first. Find out roughly what you need and how it hangs together, and then cut it back to its essential form once you know more about it. Predicting this stuff is hard, so I wouldn't want to impose some global code layering scheme before I knew a lot more about the actual shape of my problems. If I can I'd rather implement something like that after things start to look like they have those layers, a later optimization step.


Having automated tools for it would give you tons of agility with temporary usage (in the form of not having to manage it at all), but probably only if the tool itself doesn't become a surrogate maintenance problem. :P That's sort of an advantage of high level languages in general, though.


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 12:10 pm 
Offline
User avatar

Joined: Sat Sep 07, 2013 2:59 pm
Posts: 1637
rainwarrior wrote:
I don't think you understand my suggestion. I was suggesting that you use groupings by temp variable needs to manage the temp variables.

I'm not sure if I already understand you.
You mean I should put CheckCharacterCollision and UpdateBackgroundPalette into the same group because they can share their variables?

rainwarrior wrote:
On a large scale project, you can only predict some things ahead of time; a lot of stuff will eventually have to move and change. Having to maintain a pre-determined structure like that can make it harder. Agility is useful.

I agree with agility, that's why I'm checking if there's a better way than my layer attempt.
Some compile-time way of finding out how many of the temporary variables can be in use during the call of any given function, so that each temporary variables of every function always start at the highest necessary value would be the best.

By the way, of course I only use my system when the general structure stands.
Which is basically the case in the moment since the current status of my game has screen buildup, cutscenes, dialogs and almost all action gameplay stuff. It's not that I'm just starting.

Until now, I still used plain local variables. But once optimization kicks in, I want to replace all of them with those temporary variables section by section. Starting with the ProcessCharacter functions because they're the action functions and I need to see how many opponents I can have on screen before the game starts to lag.

In fact, the ProcessCharacter functions are the only ones that are really time-critical. Menu functions will never lag. And screen buildup in a screen-by-screen scrolling game takes a few frames anyway, so a lag doesn't matter here.
So, actually, I could use plain local (or static local non-zeropage) variables in every place except for one file.
But unfortunately, it's not only the CPU time, but also the ROM space that gets wasted with local and non-zeropage variables.

rainwarrior wrote:
One suggestion that was made above that I would strongly affirm is to use local aliases for the temporary variables in each function that uses them.

Well, this is of course always a given. I will not work with the temporary names itself, I'm not crazy. Of course they will be renamed via #define.

rainwarrior wrote:
Having automated tools for it would give you tons of agility with temporary usage (in the form of not having to manage it at all), but probably only if the tool itself doesn't become a surrogate maintenance problem. :P That's sort of an advantage of high level languages in general, though.

Yeah, I would need a tool that can parse the whole C source code and that then decided at which inner stack level a function gets called from other functions, so that the tool can output a C file where the local variables are replaced with the highest necessary temp variable.

Since I don't know such a tool and I don't want to manage this by hand, I invented the layering structure: Now, you don't have to restructure the variable usage whenever a function gets a new variable. But you only need to change it when a function A needs to call function B, and only if function B is on a higher or equal temp layer than function A.

_________________
Available now: My game "City Trouble".
Website: https://megacatstudios.com/products/city-trouble
Trailer: https://youtu.be/IYXpP59qSxA
Gameplay: https://youtu.be/Eee0yurkIW4
German Retro Gamer article: http://i67.tinypic.com/345o108.jpg


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 12:23 pm 
Offline
User avatar

Joined: Mon Jan 03, 2005 10:36 am
Posts: 3079
Location: Tampere, Finland
Rahsennor wrote:
Would anyone here be interested in such an "optimizing" assembler? I probably shouldn't offer to write one, given my current circumstances, but it might give me the kick I need to finish this stupid thing.

I'm interested out of curiosity, i.e., I'd like to see how you'd approach it (I've been meaning to make something similar for a long time), but I think it's hard to get people to adopt things like this in practice.

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 12:51 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 6354
Location: Canada
DRW wrote:
You mean I should put CheckCharacterCollision and UpdateBackgroundPalette into the same group because they can share their variables?

Possibly, if that seems like a good fit, but it's hard to talk in the abstract like this. If you're saying these are unrelated functions that never share a stack, then you might have even have a separate set of "layers" for one stack group and another. All I'm really saying I would normally try to make the organization follow the productive goals of the code, rather than the other way around.

DRW wrote:
rainwarrior wrote:
One suggestion that was made above that I would strongly affirm is to use local aliases for the temporary variables in each function that uses them.

Well, this is of course always a given. I will not work with the temporary names itself, I'm not crazy. Of course they will be renamed via #define.

Well, perhaps I'm crazy, but I just did a whole game without any local temporary aliases. However, sort of what I said earlier on in this thread, I didn't actually find managing temporaries to be a significant problem, partly because I see using them efficiently as an only occasionally needed optimization. I did use ZP temps constantly, but in most cases didn't have to spend much thought on allocating them. Reallocating/managing them was an infrequent small nuisance at worst, for me. I'm already doing it with aliases in my next project though, and it feels more sensibly organized (without having added any additional work). I guess I had pretty good habits about documenting temp usage in the code as I worked, but the aliases make it almost impossible to forget.


I forgot to mention in my list of inefficient but easier to manage ways to do local variables: instead of using variables directly on the stack, you could just to push some existing temporaries to the stack to make room, and then use those temporaries. Gotta remember to pop them too, though. May or may not be more efficient than using the stack directly, given the situation, but not having to maintain the stack pointer in X usually felt like a big advantage because aside from the push/pop at the start and end, the code was otherwise freely written using temporaries like the rest of the code, and trying to optimize/reallocate later on is much lower impact. I.e. if you replace a stack access with a ZP temp, the extra usage of X you just freed up might require rewriting a lot more of the code around it to take advantage of. It just felt like an easier "default" way to do it, and stack addressing was a more special purpose tool.


Last edited by rainwarrior on Sat Jun 02, 2018 12:53 pm, edited 1 time in total.

Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 12:52 pm 
Offline
User avatar

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10529
Location: Rio de Janeiro - Brazil
DRW wrote:
A compile/assembly time stack implemented with macro tricks would be perfect. But how does that work?
How do you determine the counter/index/offset of the current variable at compile/assembly time if you don't know when this function will be called?

My current approach is to use a macro to start the local declarations, and if this macro is given the name of a memory counter, it adds that memory counter to the base address so that variables start at that offset (if the counter doesn't exist yet, it's initialized with a value of 0). Another macro ends the declarations, and updates the counter, so that the next function to use the same counter will then use memory that comes after the chunk that this one used.

This means that the call order isn't important, the memory will be allocated in the order in which the functions are assembled, but the end result is effectively the same, functions that run concurrently won't have overlapping local variables, add long as you use the same memory counter for each group of concurrent functions.

Quote:
Can you please post an example?

My macros are a bit dependent on a whole system of macros I've built for myself, so I'd have to post a whole bunch of unrelated stuff in order to show a cohesive example, so I'd rather not. The basic idea is pretty simple though, just specify which group of concurrent functions each function belongs to (if any) when declaring their local variables, so that the memory offset for each group is used and updated correctly for each function.


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 1:09 pm 
Offline
User avatar

Joined: Sat Sep 07, 2013 2:59 pm
Posts: 1637
tokumaru wrote:
My current approach is to use a macro to start the local declarations, and if this macro is given the name of a memory counter, it adds that memory counter to the base address so that variables start at that offset (if the counter doesn't exist yet, it's initialized with a value of 0). Another macro ends the declarations, and updates the counter, so that the next function to use the same counter will then use memory that comes after the chunk that this one used.

O.k., I guess I see what you mean:

It's like having overlapping segments in a cc65 cfg file, right? And you simply ensure that all the functions that might call each other use the same segment while unrelated functions use another segment that occupies the same memory location.


Would it actually be possible in a cfg file to declare two RAM segments that occupy the same memory location?

_________________
Available now: My game "City Trouble".
Website: https://megacatstudios.com/products/city-trouble
Trailer: https://youtu.be/IYXpP59qSxA
Gameplay: https://youtu.be/Eee0yurkIW4
German Retro Gamer article: http://i67.tinypic.com/345o108.jpg


Top
 Profile  
 
PostPosted: Sat Jun 02, 2018 11:24 pm 
Offline

Joined: Thu Aug 20, 2015 3:09 am
Posts: 372
thefox wrote:
I'm interested out of curiosity, i.e., I'd like to see how you'd approach it (I've been meaning to make something similar for a long time), but I think it's hard to get people to adopt things like this in practice.

Well, the only hard requirements are that the call graph not contain recursion and the allocator be able to completely trace it. Wrap every function in proc/endproc, all inter-procedural control flow must target a proc, no recursion. Boom, lexically-scoped locals.

Jump tables and indirection are a pain though. I haven't bothered dealing with them yet because my compiler doesn't generate them, but that's not going to fly in an assembler. I came up with some high-level 'macroinstructions' to sweep the fiddly bits under the rug, but they're not really appropriate for assembler either.

And you're right that nobody's likely to use it. A HLL with the same features would be more popular, since there isn't one yet, but we've got too many assemblers already. We don't need another one.


Top
 Profile  
 
PostPosted: Sun Jun 03, 2018 5:49 am 
Offline
User avatar

Joined: Sat Sep 07, 2013 2:59 pm
Posts: 1637
Rahsennor wrote:
A HLL with the same features would be more popular, since there isn't one yet, but we've got too many assemblers already. We don't need another one.

In fact, we don't need another high level language either. For me, I can say that I wouldn't switch to another programming language just for one singular feature.

To me, the best solution would be a tool that can convert source code into the same source code, but with the features.

Switching local variables into temporary global variables:
Code:
void Function1(byte a, byte b)
{
    byte c;

    c = a + b;

    ...
}

void Function2(void)
{
    byte x = 25;

    Function1(x, 87);
}

This becomes:
Code:
void Function1(void)
{
    Temp3 = Temp1 + Temp2;

    ...
}

void Function2(void)
{
    Temp4 = 25;

    Temp1 = Temp4
    Temp2 = 87;
    Function1();
}

The only thing I'm asking myself: How should this work in pure Assembly code?

In C, it should be clear that if the programmer declares a local variable as static, that this one does not get turned into a temp variable.
But how do you distinguish in Assembly whether a .res is supposed to represent a local variable or whether it's a global variable or a local static variable where the value needs to be kept between two function calls?

_________________
Available now: My game "City Trouble".
Website: https://megacatstudios.com/products/city-trouble
Trailer: https://youtu.be/IYXpP59qSxA
Gameplay: https://youtu.be/Eee0yurkIW4
German Retro Gamer article: http://i67.tinypic.com/345o108.jpg


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 43 posts ]  Go to page Previous  1, 2, 3  Next

All times are UTC - 7 hours


Who is online

Users browsing this forum: No registered users and 3 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group