
All times are UTC - 7 hours





[ 19 posts ]
Posted: Sat Aug 03, 2019 7:34 am

Joined: Sun Jun 30, 2013 7:59 am
Posts: 41
Hi there. I was curious whether many here "imitate" function calls using a calling convention in conjunction with the stack?

I had the impression for the longest time that, for variables used in functions, fixed locations would be put aside for temporary variables and these would be used and overwritten as needed; the burden of ensuring things are where they need to be at the right time and preserved when they need to be is on the programmer as each function doesn't really have its own "frame" so to speak.

However, I noticed some time ago (sorry, can't remember where) someone's code on here seemed to be using the sort of calling convention you'd see in, for example, the C language. Function arguments are pushed onto the stack, said function is jumped to, uses the arguments by indexing into the stack, and then "returns" the result by putting it in a fixed location, or leaving it on the stack, or something like that. Each function expects arguments to be at certain offsets from (its) base stack pointer, and each function ensures that everything is "as it was" before it was called when it returns - that is, other than leaving the result in an expected place as mentioned prior.

Just wanted to get an idea of how many of you guys do this sort of thing, and how viable it is? From what I remember, the 6502 doesn't have particularly good stack accessing opcodes, so I figure that might have an impact to some degree.

Thanks :)


Posted: Sat Aug 03, 2019 8:52 am

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11403
Location: Rio de Janeiro - Brazil
Not many people do it, at least not globally. I suspect it's because of the awkwardness/slowness of having to deal with the return address being on top of the arguments on the stack: you have to either manually remove the arguments after returning, or manipulate the stack and move the return address down before returning.

When you consider that not many games benefit from the use of recursion, it makes little sense to go through the trouble of making heavy use of the stack on every single function call.

Since you have to use X anyway in order to access arguments on the stack, maybe implementing a software stack for arguments and return values would work better than using the hardware stack for everything, but that'd still be way slower than simply using absolute/ZP addressing.

I think most people reserve a chunk of ZP memory for arguments, return values and local storage, and allocate that statically so that functions that are called from other functions don't overwrite any memory used by functions that are lower in the call stack. Allocating that memory automatically can be tricky, but it's possible - in the past I've used labels to mark the end of the local storage of each function, and a macro to define the starting address of a function's local storage that takes one or more end addresses and picks the highest one, so I didn't have to handle everything manually and worry about breaking things when modifying functions.
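
A minimal sketch of that kind of static allocation, assuming ca65 syntax (its .max function) and made-up names and addresses. The macro described above is replaced here by plain constant expressions, so this is only an illustration of the idea, not the actual macro:
Code:
ScratchBase   = $10                 ; hypothetical start of the shared ZP scratch area

; Leaf routine A: two bytes of locals
LeafA_tmp     = ScratchBase
LeafA_End     = LeafA_tmp + 2

; Leaf routine B: three bytes of locals, overlapping A's
; (safe only because A and B never call each other)
LeafB_ptr     = ScratchBase
LeafB_End     = LeafB_ptr + 3

; Caller calls both leaves, so its locals start above whichever ends highest
Caller_Start  = .max(LeafA_End, LeafB_End)
Caller_count  = Caller_Start
Caller_End    = Caller_Start + 1

If a leaf routine later grows, only its own End constant changes and every caller's storage moves up automatically.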


Posted: Sat Aug 03, 2019 10:04 am

Joined: Fri May 08, 2015 7:17 pm
Posts: 2550
Location: DIGDUG
You can manage your own stack (different from the hardware stack).

If you use an (indirect), y stack pointer, you could use y as the index to the stack.

I think, passing 3 arguments to a function would work like...

; stack starts at $7ff
JSR dec_stack_3
LDY #1
LDA first_arg
STA (stack), y
INY
LDA second_arg
STA (stack), y
INY
LDA third_arg
STA (stack), y

JSR function

and end that function with a jump to inc_stack_3

and inside that function, it would
LDY #1
LDA (stack), y
to get the first arg.

But, this is a bit slow.
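
The dec_stack_3 / inc_stack_3 helpers aren't shown above. One possible version, assuming stack is a 2-byte zero-page pointer and using ca65-style unnamed labels, could be:
Code:
; Assumed: 'stack' is a 2-byte zero-page pointer to the software stack.
dec_stack_3:
    sec
    lda stack
    sbc #3              ; reserve 3 bytes
    sta stack
    bcs :+              ; no borrow: high byte unchanged
    dec stack+1
:   rts

inc_stack_3:
    clc
    lda stack
    adc #3              ; release 3 bytes
    sta stack
    bcc :+              ; no carry: high byte unchanged
    inc stack+1
:   rts

Jumping (rather than calling) to inc_stack_3 at the end of the function lets its RTS double as the function's own return.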

_________________
nesdoug.com -- blog/tutorial on programming for the NES


Posted: Sat Aug 03, 2019 1:25 pm

Joined: Wed Nov 30, 2016 4:45 pm
Posts: 146
Location: Southern California
These are discussed in my treatise on 6502 stacks (plural, not just the page-1 hardware stack), particularly section 6 on passing parameters and section 14 on local variables and environments. Edit: It shows how to do what rainwarrior mentions below in section 7, on inlining data. It's mostly about inlining constants. Somewhere I have more about inlining variables too (if the program is in RAM); I'll just have to remember where it is.

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


Last edited by Garth on Sat Aug 03, 2019 2:59 pm, edited 1 time in total.

Posted: Sat Aug 03, 2019 2:25 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 7568
Location: Canada
In general I don't want to use the stack to pass arguments (registers or ZP is usually preferred), for reasons of speed or code size.

One alternative I find a little bit compelling, which applies only to function calls with constant parameters, is putting the parameters as data directly after the call. This sacrifices a little bit of speed, but it's hard to beat for code size.
Code:
jsr function
.byte arg0, arg1, arg2
; function manipulates the stack to return here

; can make a macro to make it easy to call in 1 line
.macro FUNCTION arg0, arg1, arg2
jsr function
.byte arg0, arg1, arg2
.endmacro

; a call now looks like this:
FUNCTION 5, 20, 5


Even without this technique, this is somewhere I do often use macros, just to break down repetitive argument setup code and just be able to write a one line statement that looks like a function call. (And if you're using the macro and you want to change the function's calling conventions later, it might save a lot of work too since you can just change the macro.)
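
For illustration, the callee side of the inline-data call might look roughly like this. It is only a sketch in ca65 syntax: ptr (a 2-byte zero-page pointer) and param0-param2 are made-up names, and the function body must not clobber ptr.
Code:
; Assumed: ptr is a 2-byte ZP pointer, param0..param2 are 1-byte variables.
function:
    pla                 ; low byte of the stacked return address
    sta ptr
    pla                 ; high byte
    sta ptr+1
    ldy #1              ; JSR stacks the address of its own last byte,
    lda (ptr),y         ; so the inline data starts at ptr+1
    sta param0
    iny
    lda (ptr),y
    sta param1
    iny
    lda (ptr),y
    sta param2

    ; ... function body ...

    clc                 ; put ptr+3 back on the stack so that
    lda ptr             ; RTS resumes after the three data bytes
    adc #3
    tay                 ; keep the new low byte for a moment
    lda ptr+1
    adc #0
    pha                 ; high byte first
    tya
    pha                 ; then low byte
    rts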


Posted: Sat Aug 03, 2019 10:44 pm

Joined: Thu Mar 31, 2016 11:15 am
Posts: 525
For most action games it doesn't make sense to use a stack. If you're doing something that could have very deep subroutine calls (like, an RPG) then it can make sense.


Posted: Sat Aug 03, 2019 11:43 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 11403
Location: Rio de Janeiro - Brazil
Why would you need very deep subroutine calls in an RPG? Whatever the reason is, there's nothing preventing you from using several levels of subroutines without a stack; you just can't have more than one instance of each subroutine active at a time.


Posted: Sun Aug 04, 2019 8:45 am

Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7739
Location: Chexbres, VD, Switzerland
Quote:
Do you/does it make sense to imitate function calls?

On a system with a 6502 processor, no, it doesn't make sense, because doing so would waste a lot of ROM and CPU cycles for little gain. Using the hardware stack would be tremendously slow and complex.
  • Pushing parameters with pha is the only decently efficient operation; however, it has to be done even when the parameters are just literal values.
  • Retrieving them requires tsx, and lda $101,X or similar. This steals one index register that could be useful for something else, and is slow because all accesses are 4 cycles.
  • After returning you need to pull the arguments, which is a pure waste of time.
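
To make the three points concrete, here is a rough sketch of such a call in ca65 syntax. The routine add_args and the variable result are hypothetical names used only for illustration:
Code:
; Assumed: 'result' is a scratch variable.
    lda #5
    pha                 ; second argument
    lda #20
    pha                 ; first argument
    jsr add_args
    pla                 ; caller has to discard both arguments afterwards
    pla
    ; ... rest of the caller ...

add_args:
    tsx                 ; X is now tied up as the frame index
    lda $0103,x         ; first argument (pushed last);
    clc                 ; $0101,x and $0102,x hold the return address
    adc $0104,x         ; second argument
    sta result
    rts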

Doing it with a software stack outside page zero leads to similar problems: it constantly steals a register and is slow. A software stack in page zero would be the only option that makes sense, and the lda (zp,X) addressing mode would be useful when passing pointers on the stack. It's roughly the same speed as using fixed zero-page locations, but it still steals a register for this usage alone, and you need to waste time adjusting the software stack pointer.
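
A small sketch of the zero-page variant, in ca65 syntax. It assumes X is permanently reserved as the software stack pointer (stack growing downward in page zero) and that table is some data label; both are assumptions for the example:
Code:
; Assumed: X = software stack pointer into zero page, 'table' = a data label.
    dex                 ; make room for a 2-byte pointer
    dex
    lda #<table
    sta $00,x           ; low byte at the top of the software stack
    lda #>table
    sta $01,x
    jsr read_first
    inx                 ; caller drops the argument again
    inx
    ; ... rest of the caller ...

read_first:
    lda ($00,x)         ; load the byte the stacked pointer points to
    rts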

Another approach, which is semi-efficient, is to use the hardware stack (pha/pla instructions) but push the arguments AFTER the return address. This means the caller calls a pseudo-subroutine which pushes the arguments on the stack and jumps to the real subroutine. I don't remember where I saw this idea, but it is interesting. The subroutine then just pulls the arguments with PLA as it needs them, so they have to be passed in an order which makes sense for the callee. This is more efficient than the other approaches mentioned above, but still not efficient overall (as opposed to just using ZP temporaries or registers), for several reasons:
  • ROM is wasted on pusher subroutines (possibly many variants of them).
  • Time is wasted on the extra JMP, as well as on storing to the stack (as opposed to the faster zero page).
  • The callee might need the arguments more than once. In this case, it needs to copy the data from the stack to either a register or a temporary.
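
As a sketch of the pusher idea (ca65 syntax; player_x, player_y, temp and result are hypothetical variables, and the stub name is made up):
Code:
; Assumed: player_x / player_y are variables, temp / result are scratch bytes.
    jsr call_add_xy         ; the return address goes on the stack first
    ; execution resumes here with 'result' filled in
    ; ... rest of the caller ...

call_add_xy:                ; "pusher" stub: one of these per call pattern
    lda player_x
    pha
    lda player_y
    pha
    jmp add_pair            ; JMP, not JSR, so nothing else is pushed

add_pair:
    pla                     ; player_y (pushed last)
    sta temp
    pla                     ; player_x
    clc
    adc temp
    sta result
    rts                     ; only the return address is left on the stack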

CrowleyBluegrass wrote:
Hi there. I was curious whether many here "imitate" function calls using a calling convention in conjunction with the stack?

No, not many here do this. Zero-page temporaries combined with registers and possibly the carry flag are the standard way of passing parameters to subroutines, as well as returning values from them, in the 6502 world.
Nintendo also used the idea of placing hardcoded values after JSR instructions, with the callee advancing the return address past them. This saves ROM, but wastes time.


Posted: Sun Aug 04, 2019 10:12 am

Joined: Thu Mar 31, 2016 11:15 am
Posts: 525
tokumaru wrote:
Why would you need very deep subroutine calls in an RPG? Whatever the reason is, there's nothing preventing you from using several levels of subroutines without a stack, you just can't have more than one instance of each subroutine.

It just depends. I had really deep, complicated subroutine calls with lots of parameters in my hejickle RPG and found that passing arguments in fixed locations was a mistake. If I did it all over again, I'd come up with a calling convention before starting :lol: But for most games there's no need to do this. Using fixed locations in memory for variables/parameters works really well most of the time.

I'm talking about using the stack for holding variables BTW, not return addresses. My mistake if that wasn't clear.


Posted: Sun Aug 04, 2019 10:47 am

Joined: Fri May 08, 2015 7:17 pm
Posts: 2550
Location: DIGDUG
You CAN use the stack for a quick push/pop buffer, btw.

Save the stack pointer, reset the stack pointer (let's say to $010F), push 16 numbers really quickly, then restore the stack pointer. Later, use similarly quick popping to read those numbers back.

But, not for standard function calls. Maybe for VRAM updates.

_________________
nesdoug.com -- blog/tutorial on programming for the NES


Posted: Sun Aug 04, 2019 11:33 pm

Joined: Fri Nov 12, 2004 2:49 pm
Posts: 7739
Location: Chexbres, VD, Switzerland
pubby wrote:
It just depends. I had really deep, complicated subroutine calls with lots of parameters in my hejickle RPG and found that passing arguments in fixed locations was a mistake. If I did it all over again, I'd come up with a calling convention before starting :lol: But for most games there's no need to do this. Using fixed locations in memory for variables/parameters works really well most of the time.

I'm talking about using the stack for holding variables BTW, not return addresses. My mistake if that wasn't clear.

I don't know, but in my personal experience writing a moderately complex game engine, passing parameters isn't the problem. The problem is more about working with temporaries as a whole (although they're used to pass parameters as well, I concede). Typically subroutine 1 uses Temp1-Temp4, then subroutine 2 uses Temp5-Temp8 and calls subroutine 3, which uses Temp1-Temp4 as well. If subroutine 1 is anywhere up the call chain, this of course fails, and it's very hard to debug. As such, the problem is more "frames" than passing parameters.

Also, I doubt an RPG engine needs deeper nested calls than an engine for other game genres (I could be wrong here). If anything, it is simpler, because there is less data to process per frame and collision detection can be coarse.


Posted: Mon Aug 05, 2019 5:23 pm

Joined: Mon Mar 13, 2017 5:21 pm
Posts: 61
I'm far from the most experienced programmer here, especially when it comes to 6502, so please take my opinion with a grain of salt.
As for passing parameters to subroutines, I rarely need more than the A, X, & Y registers. X is almost always used as an index into an array or table somewhere, and that leaves me with A & Y for other data that needs to be passed and returned. In the rare situations where I do need more than that, I usually set aside dedicated bytes somewhere in RAM for that specific piece of data. So for example, if a subroutine affects the X & Y coordinates of the camera, I see no reason not to just have the subroutine directly change those two bytes wherever they are buffered in RAM, rather than trying to return them in a more generic way. I realize this "everything is global" approach may get unwieldy after a while with large programs, but after considering several schemes for trying to replicate the way higher level languages deal with local variables and function parameters, it just seemed like the easiest way.
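
A tiny sketch of that register-based style, in ca65 syntax. The object_x table, the move_right routine and the off_screen label are all hypothetical; the carry flag serves as an extra return value:
Code:
; Assumed: object_x is a table of X coordinates, one byte per object.
; in:  X = object index, A = distance to move right (unsigned)
; out: carry set if the coordinate wrapped past 255
move_right:
    clc
    adc object_x,x
    sta object_x,x      ; STA leaves the carry from ADC untouched
    rts

; caller
    ldx #3
    lda #2
    jsr move_right
    bcs off_screen      ; hypothetical label that handles the wrap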


Posted: Tue Aug 06, 2019 1:19 am

Joined: Tue Feb 07, 2017 2:03 am
Posts: 750
For asm, your method of breaking down tasks should be different. You don't make "generic functions that take lots of params"; it's not the 6502 way.

You break it down into tiny blocks that mostly take a,x,y,st.C and return a,x,y,st.C. If you have a function like "set sprite X to {4 bytes}", then by the time you push said data, restore the stack and clean up, it's going to be cheaper to just inline the code that does it.

There are always exceptions, and sometimes you will need to pass a large block of data to a function, but doing it on a stack, or even with the variables placed after the call, is "not the best way". Typically in such cases you combine all the params into a table and pass the index. A common example that I can think of is a "copy X pages of data from Y to Z" routine, so while you might be tempted to do
Code:
jsr copy
.byte pageCount
.word src,dest
; ...code here

it is much better to do
Code:
PageCopyTable .block
numPages .byte (1,4,7)-1
src .block
  _values = ($8800,$9000,$d800)
  lo .byte <(_values)
  hi .byte >(_values)
.bend
dest .block
  _values = ($0400,$0800,$1000)
  lo .byte <(_values)
  hi .byte >(_values)
.bend
.bend


so rather than
Code:
jsr copyBlock
.byte 1
.word $8800,$0400
you do
Code:
ldx #0
jsr copyBlock

This way, if you ever need to duplicate values, you can share an index across all the locations, the destination code runs at full speed, and the code is smaller all round. You also won't lose a few hours because a pla was missing somewhere and the code went off into no man's land.

If a stack-based approach is truly needed, one can use absolute indexed addressing:
Code:
tsx
sta@w $00fd,x ; store first param
....
sta@w $00fc,x ; second param
...

However, you must make sure that nothing between the setup and the call uses more than 2 bytes of stack; otherwise, adjust your code so that the function expects its params 3/4/5/6... bytes below the stack pointer, and have every caller follow the rules for that function.
Then, in the function, you do the same with +2 for reading (the JSR has moved the stack pointer down by 2):
Code:
tsx
lda@w $00ff,x ; read first param
Since the stack is not modified, there is no fix-up needed. However, the called function cannot use the stack itself, otherwise it will trash the parameters, which may or may not be safe depending on whether they are still needed at that point; you may need to add additional room in the param set to compensate. Interrupt handlers push onto the same stack as well, so the same caveat applies to them. Yes, this is more brittle and more of a pain to work out than shared memory locations.

But when it comes to "modules", you tend to make a static struct somewhere and then fill it with the data you need for the "collection" of calls. For collision, for example, you set up the first object's position and size in Collision.firstX, Collision.firstY, Collision.firstWidth and Collision.firstHeight, then call a function that iterates through the other objects and fills in the other data, and then calls various functions that reference the Collision struct. As it is shared over many calls, the time spent loading the static struct is easily won back by each of the "functions" being able to load from an absolute address instead of having to look the data up each time.
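
A rough sketch of that static-struct pattern, in ca65 syntax with plain constants standing in for the struct. The addresses, the routine name and the object_* tables are assumptions for illustration only:
Code:
Collision_firstX      = $0300   ; hypothetical addresses
Collision_firstY      = $0301
Collision_firstWidth  = $0302
Collision_firstHeight = $0303

; in: X = index of the first object
collision_set_first:
    lda object_x,x
    sta Collision_firstX
    lda object_y,x
    sta Collision_firstY
    lda object_w,x
    sta Collision_firstWidth
    lda object_h,x
    sta Collision_firstHeight
    rts                         ; later checks read Collision_* directly,
                                ; leaving X free for the other object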


Posted: Mon Aug 12, 2019 12:13 am

Joined: Fri Mar 16, 2018 1:52 pm
Posts: 93
Location: Finland
Not sure if this was mentioned, but I remember seeing some games using multiple stack positions: the primary stack at the usual $1FF, and other fixed stack locations for different purposes. Those games used this as a buffer for data to upload to VRAM, but it can be used elsewhere as well.

Code:
tsx
stx stack_ptr       ; Save the real stack pointer
ldx #$10            ; Fixed point in stack ($0110)
txs

lda $CE56           ; Load whatever data
pha                 ; Stored at $0110
lda $CE57
pha                 ; Stored at $010F; Function takes 2 parameters

ldx stack_ptr       ; Back to the real stack
txs

; Maybe some code here

jsr Function


Then in the subroutine:

Code:
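Function:
tsx
stx stack_ptr       ; Save the stack pointer again: the JSR moved it by 2
ldx #$0E            ; We end back at $0110 after pulling twice
txs

pla                 ; Second parameter ($010F)
sta temp_var
pla                 ; First parameter ($0110)
clc
adc temp_var        ; Do whatever with the bytes
sta result

ldx stack_ptr       ; Restore the real stack pointer so RTS finds the return address
txs

rts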


Last edited by SusiKette on Mon Aug 12, 2019 6:30 am, edited 1 time in total.

Posted: Mon Aug 12, 2019 4:16 am

Joined: Tue Feb 07, 2017 2:03 am
Posts: 750
SusiKette wrote:
Code:
ldx #$0110           ; Fixed point in stack
txs

That is a 16-bit load; on a 6502 you can't do that. On a 65816, sure, make as many stacks as you want around RAM for things. But then you also have the ,s (stack-relative) addressing mode, so...

