Function params & local vars using a stack in ASM6

Discuss technical or other issues relating to programming the Nintendo Entertainment System, Famicom, or compatible systems. See the NESdev wiki for more information.

Moderator: Moderators

tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: Function params & local vars using a stack in ASM6

Post by tepples »

rainwarrior wrote:My metasprite code has only 4 global temporaries. 1 each for the base X/Y position, 1 for the tile count, and a pointer to the sprite data.
My draw routine takes six arguments:
  • X position (16-bit)
  • Y position (16-bit)
  • Window (8-bit, a value of $00, $40, $80, or $C0 added to each sprite tile number based on which window in CHR RAM is associated with an actor)
  • Attribute (8-bit, used for flipping, priority trick, and palette swapping)
  • Sprite sheet number (8-bit, which actor type's pointer table to use)
  • Frame number (8-bit, added to the sprite sheet's base pointer)
From there, it claims seven more bytes for five more temporary variables:
  • Data pointer (16-bit; looked up using sprite sheet and frame number)
  • X position of current horizontal strip (16-bit)
  • Y position of current horizontal strip (8-bit)
  • Palette of current horizontal strip (8-bit; used for split-palette or overlay colors)
  • Number of 8x16 pieces remaining in the current horizontal strip
rainwarrior wrote:
tepples wrote:I concede that player movement and metasprite drawing can usefully "clobber all temporaries"...
Well, in my code every character has a separate "tick" and "draw" routine
Same here. The tick routine is type-specific, but draw is generic.
rainwarrior wrote:the concerns are separated: the tick code won't draw sprites, the draw code will never move the character.
Same here. Tick may cause other characters or bullets to be spawned, and spawning uses quite a few temporaries, but it won't actually draw anything.
rainwarrior wrote:
tepples wrote:...except the one reserved for iterating through the actor list.
!? I'm very surprised that you would want to use a global temporary for that and not a dedicated variable.
It is a dedicated variable. It's just considered "temporary" because its lifetime doesn't extend outside the loops that call tick and draw, as opposed to variables that persist to the next frame. A static local manager would put it after temporaries but before the game state.
rainwarrior wrote:Could also be a factor that I prototype most things in C++, so I don't end up having to do a lot of iteration in actual NES code.
The "running demake" method is another way to go about it. I just have to figure out how to convince backers of this, some of whom I anticipate to have the mentality "SHOW IT ON NES RIGHT NYEAAOUW OR ITS TEH FAKE".
rainwarrior wrote:I understand the problem of reusing memory for different game phases, but I view that as a separate issue from managing static local temporaries. Any stuff that would persist longer than a frame I don't imagine encapsulating as a function local
You could, for example, think of the flood fill scratchpad as an instance of class FloodFillJob local to the function run_build_phase(). It needs to persist because it spreads the work over several frames so as not to cause the other player's piece placement to lag. But I see no theoretical difference between a local variable lasting 30 million cycles on a 1.8 MHz CPU and one lasting 30 million cycles on a 1.8 GHz CPU, even though one is part of the game state and the other is probably a temporary value used and discarded within a frame.
User avatar
dustmop
Posts: 136
Joined: Wed Oct 16, 2013 7:55 am

Re: Function params & local vars using a stack in ASM6

Post by dustmop »

rainwarrior wrote:
dustmop wrote:I have implemented this technique in co2, a scheme-based compiler that I took over for use in an upcoming project. See the file casla.scm for the code. I can imagine an assembler doing something similar, allocating memory locations at link time, though it would need to be aware of jump tables and similar stack manipulating control structures. Possibly even cc65 could get support, assuming someone wanted to add it.
Have had some time to look at this example, and it's interesting to see.

With the very functional style scheme code you're writing, I can see how the call stack quickly builds and local temporaries are a rapidly mounting problem. It really does demonstrate why you had a need to write a system to manage them as statics.
A lot of the example co2 code in the repo was not written by me, but a previous maintainer. Though I generally enjoy a functional style, I personally don't think it's necessarily the best fit for NES work, because of the lack of GC or even convenient memory allocation. Side-effects all the way!

But anyway, yes, I've run into the question of how to handle temporaries / locals / parameters in every project I've made, and as they get more complex the issue continually gets worse. Especially as call graphs become taller and more criss-crossed. Though I believe you that you're genuinely curious, I can't explain why you haven't experienced this problem before. I guess tepples and I just have a different perspective.

For a better example of where such a tool would have come in handy, have a look at Filthy Kitchen's source.

sprite_space.asm
draw_picture.asm
player.asm
intro_outro.asm

I define a buffer of 16 bytes called values, then inside each module, individual bytes are redefined with more readable names.

For example, sprite_space is a leaf module that handles OAM cycling, and it defines max_idx, first_idx, and second_idx to use values 4, 5, and 6, respectively. It gets called by draw_picture, which only uses values 0 and 1. That in turn is called by player which uses 1, 2, 7, and 8. Reusing 1 is safe because it's only clobbered during a tail call to DrawPicture, but it sure would be nice to have a tool prove that instead of relying on manual inspection that there's no bugs. Player is top level, but anything else that uses DrawPicture is in a similar situation.

The real headache comes when adding memory needs to lower-level calls. For example, second_idx was added to SpriteSpace pretty late in development. I had to push a bunch of values around higher up the stack, it was very tricky and led to some bugs that were hard to track down. In fact, keeping track of bugs that I've run into on recent projects, about 60-70% were due to conflicts of this nature.

Finally, intro_outro uses values 4 through 8. Why can't it use 0 to 3? I haven't the faintest idea anymore, and the only way to find out would be to look at every dependency and see what they each require. Personally I'd rather focus on other areas of development.

You may be over-estimating the difficulty of creating the tool I'm describing. The linked algorithm in scheme is only like 50 lines of code. While making Filthy Kitchen, I had started work on a python tool that would act like a "linter", to analyze the call graph and values reservations to find any conflicts. I didn't finish it, as I felt I didn't understand the problem well enough yet, but writing this integrated scheme version has helped a lot in that area.
User avatar
rainwarrior
Posts: 8733
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Function params & local vars using a stack in ASM6

Post by rainwarrior »

Thanks for the examples.
dustmop wrote:You may be over-estimating the difficulty of creating the tool I'm describing.
I was mostly thinking about implementing it as part of CC65. I've occasionally played with the CC65 source hoping to add some small feature/tweak, but so far every time I've turned back and decided to live without whatever it was I wanted, even after "mostly" implementing it. (I'm not generally shy about jumping into open source projects, but CC65's code has a particularly steep learning curve, IMO.)

The actual allocation part of it seems pretty straightforward. It's all the plumbing that goes around it, concessions for special cases (and finding/testing for them), ability to override it when needed, etc.

It would probably be a lot easier in a simpler compiler, or one that was my own. Like part of my last job was actually maintaining a compiler, and it was hairy but I was in that code every day and knew my way around it. Not so with CC65. I'm not keen to write or maintain a compiler again any time soon though. :P

It would be a good improvement to --static-locals, though (addresses the main reason I wouldn't use it), and I'm sure its underlying assembly feature would be appreciated by some.
Rahsennor
Posts: 479
Joined: Thu Aug 20, 2015 3:09 am

Re: Function params & local vars using a stack in ASM6

Post by Rahsennor »

I wrote my own compiler from scratch when I determined my "if in doubt, rewrite" coding style was completely unmaintainable in 6502 assembler. It uses the same allocation strategy as dustmop describes, plus a few special-case optimizers for things like converting live ranges to PHA/PLA pairs where possible.

As a proof of concept, I used it for everything in Wrecking Balls. I'm both proud of the fact that it worked and embarassed that it took me less time to write a compiler and make a game with it than it would have to write the final product in ASM. :roll:
User avatar
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮
Contact:

Re: Function params & local vars using a stack in ASM6

Post by thefox »

dustmop wrote:I've been exploring an idea that improves upon these approaches. Have a tool analyze each function and track all the other functions that it calls. After compilation is done, build a call-graph of the entire program. Starting from the leaves, count how much RAM space each function needs for locals and parameters, and propagate that value upwards to callers. Using this, allocate memory addresses to the function based upon the needs of its locals plus the maximum of its callees. For example:
Do you plan to handle function pointers in some way? I guess the most efficient (yet somewhat laborious) way would be to use manual annotations. For example, a function call to an entity handler through a pointer based on the entity type should be annotated with information about all the possible entity handlers that might get called.

(Alternatively, the hypothetical language could limit how dynamic calls are handled. E.g., only allowed through a compile-time constant table of functions, in which case the compiler would know how to handle it.)
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: fo.aspekt.fi
User avatar
dustmop
Posts: 136
Joined: Wed Oct 16, 2013 7:55 am

Re: Function params & local vars using a stack in ASM6

Post by dustmop »

Function pointers are a big hurdle, and yet another reason why it would be hard to add this to C (along with rainwarrior's correct observation that the codebase is quite difficult).

For co2, jump tables are the preferred method. They're not implemented yet, but can be simulated with a large cond statement. For example this dispatch can instead be handled efficiently using a jump table, without breaking the call-graph analysis.

For the general case of function pointers, I believe manual annotation is probably the wrong approach, since it both takes a lot of work and reintroduces the possibility of mistakes. Instead, the compiler could record every capture of a function's address, and treat every dereference of a function pointer as jumping to the union of those captured functions. At worst, this wastes a little bit of RAM space (only if there's two or more distinct function pointers that don't share destination sets) but never leads to a loss of correctness. Type information on function pointers could tighten this up (another thing C isn't great at). As long as there's no intervening manipulation of the bytes of a function pointer, everything works out, and if there is, well that's undefined behavior anyway.
Post Reply