cc65 now supports trampolines

calima
Posts: 1156
Joined: Tue Oct 06, 2015 10:16 am

cc65 now supports trampolines

Post by calima » Thu May 18, 2017 11:39 am

I needed this for a project, so cc65 now supports trampolines. PRG banking becomes almost pleasant: you can transparently call any function without manual bankswitching or taking care to call only to and from a common bank.
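For readers who have not used the feature, here is a minimal sketch of how a banked function might be declared. The segment name "BANK1" and bank number 1 are hypothetical, and the trampoline itself still has to be written in assembly and placed somewhere that is always mapped. The `__CC65__` guard only lets the same file build with a host compiler for testing.

```c
/* Sketch only: "BANK1" and bank number 1 are hypothetical values. */
#ifdef __CC65__
void trampoline(void);                      /* supplied separately in assembly */
#pragma code-name (push, "BANK1")           /* place the code in a banked segment */
#pragma wrapped-call (push, trampoline, 1)  /* calls to it go through trampoline */
#endif

int add_bonus(int score)
{
    return score + 100;
}

#ifdef __CC65__
#pragma wrapped-call (pop)
#pragma code-name (pop)
#endif
```

Callers in other source files would presumably need to see the prototype inside the same wrapped-call pragma, so that the compiler knows to route the call through the trampoline.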

rainwarrior
Posts: 7824
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: cc65 now supports trampolines

Post by rainwarrior » Thu May 18, 2017 12:39 pm

Documentation of the new feature:
http://cc65.github.io/doc/cc65.html#ss7.17

That looks nice. I think it could use a little clarification of how the function is called. What does an example "trampoline" routine look like? It says the bank parameter is in tmp4, and the address of the function is in ptr4, but that doesn't really tell the whole story.

I'd imagine it is something like this:
  1. Temporarily preserve A/X/Y (arguments) if overwritten by bankswitch.
  2. Preserve current bank on the stack.
  3. Perform the bankswitch using tmp4.
  4. Restore A/X/Y if needed.
  5. Call the function via ptr4.
  6. Temporarily preserve A/X (return) if overwritten by bankswitch.
  7. Fetch the original bank from the stack and bankswitch.
  8. Restore A/X if needed.
  9. Return.
Would this be a valid trampoline function?

trampoline:
	; preserve A/X arguments
	sta tmp1
	stx tmp2
	; remember current bank
	lda current_bank
	pha
	; bankswitch (UNROM style)
	lda tmp4
	sta current_bank	; keep in sync so nested calls save the right bank
	tax
	sta bank_table, X
	; call function
	lda tmp1
	ldx tmp2
	jsr @call
	; preserve A/X
	sta tmp1
	stx tmp2
	; restore original bank
	pla
	sta current_bank	; keep in sync after returning
	tax
	sta bank_table, X
	; restore A/X return value
	lda tmp1
	ldx tmp2
	rts
@call:
	jmp (ptr4)
You said in some of the github comments that your trampoline is only 10 bytes though. How did you manage that? (Are you implicitly only allowing void functions with no arguments?)

It also seems prudent to put the CRT in a fixed bank (e.g. 16/16 mapper arrangement, UNROM). I saw some comments about putting the trampoline in RAM, but I think 32k banks would be a hassle anyway. You'd have to jump through a few hoops to get the CRT linked in more than one place, probably not worth the effort.
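The save/switch/call/restore sequence from the numbered steps can also be modeled on a host machine in plain C. This is only a sketch: `current_bank`, `switch_bank`, and `call_banked` are hypothetical stand-ins for the mapper write and the compiler-generated preamble, and `tmp4`/`ptr4` are ordinary variables here rather than cc65's zero-page locations.

```c
#include <stdint.h>

uint8_t current_bank = 0;        /* models the bank currently mapped in */
uint8_t tmp4;                    /* bank number, set up by the caller */
int (*ptr4)(int);                /* target function, set up by the caller */

void switch_bank(uint8_t bank) { current_bank = bank; }

int trampoline(int arg)
{
    uint8_t saved = current_bank;   /* step 2: remember the caller's bank */
    int result;
    switch_bank(tmp4);              /* step 3: map in the target bank */
    result = ptr4(arg);             /* step 5: call through the pointer */
    switch_bank(saved);             /* step 7: map the caller's bank back in */
    return result;                  /* step 9 */
}

/* Models the preamble the compiler emits before jsr _trampoline. */
int call_banked(uint8_t bank, int (*fn)(int), int arg)
{
    tmp4 = bank;
    ptr4 = fn;
    return trampoline(arg);
}

/* Each function returns -1 if it finds itself running in the wrong bank. */
int inner(int x) { return current_bank == 2 ? x + 1 : -1; }

int outer(int x)
{
    int r = call_banked(2, inner, x);       /* nested banked call */
    return current_bank == 1 ? r * 2 : -1;  /* bank 1 must be back by now */
}
```

The nested case is the reason the original bank has to be saved per call (on the stack in the assembly version) rather than in a single fixed location.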

rainwarrior
Posts: 7824
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: cc65 now supports trampolines

Post by rainwarrior » Thu May 18, 2017 3:35 pm

I think there is a problem with parameter passing when this is combined with -O.

void trampoline(void);
#pragma wrapped-call (push, trampoline, 32)
int func_c(long arg)
{
	return arg + 3;
}
#pragma wrapped-call (pop)

; without -O func_c(7) produces:
	ldx     #$00
	stx     sreg
	stx     sreg+1
	lda     #$07
	ldy     #32
	sty     tmp4
	ldy     #<(_func_c)
	sty     ptr4
	ldy     #>(_func_c)
	sty     ptr4+1
	jsr     _trampoline

; with -O func_c(7) produces:
	ldy     #32
	sty     tmp4
	ldy     #<(_func_c)
	sty     ptr4
	ldy     #>(_func_c)
	sty     ptr4+1
	jsr     _trampoline
Note the argument setup has entirely disappeared. (Doesn't seem to be a problem with __cdecl__/--standard calls, since they don't use registers.)


There is a possibly related problem with variadic functions. They require the size of the arguments on the stack to be passed in Y, and the trampoline preamble clobbers it. (Functions without a prototype will also set up Y in this way, though this is a much more obscure case.)
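A minimal reproduction sketch for the variadic case might look like the following. The function name `sum_ints` and bank number 1 are hypothetical, and this has not been checked against any particular cc65 build. The `__CC65__` guard lets the same source compile on a host compiler, where the function behaves normally.

```c
#include <stdarg.h>

#ifdef __CC65__
void trampoline(void);
#pragma wrapped-call (push, trampoline, 1)  /* bank 1 is a placeholder */
#endif

/* Variadic: under cc65 the caller is expected to pass the argument size in Y,
   which is the register the wrapped-call preamble also uses. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    while (count-- > 0) {
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}

#ifdef __CC65__
#pragma wrapped-call (pop)
#endif
```

Comparing the generated assembly for a call such as sum_ints(3, 1, 2, 3) with and without the pragma should show whether Y still holds the argument size when the function is entered.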


I guess it would be more appropriate to report this at the github project, so I will do that: https://github.com/cc65/cc65/issues/432

calima
Posts: 1156
Joined: Tue Oct 06, 2015 10:16 am

Re: cc65 now supports trampolines

Post by calima » Fri May 19, 2017 2:59 am

Your trampoline is valid. In addition, mine checks whether the bankswitch is needed at all, and jumps to callptr4, which does the same as your @call. The CRT issues would indeed complicate 32k banks.

The 10-byte one was just a test trampoline that didn't switch banks; a switching one will be larger.

Thanks for testing. Variadic functions will need special handling, or perhaps should just be disallowed. The optimization bug needs some attention.

rainwarrior
Posts: 7824
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: cc65 now supports trampolines

Post by rainwarrior » Wed Dec 04, 2019 11:26 pm

A thought about the #pragma...

#pragma wrapped-call (push, <name>, <identifier>)
Since we have .bank for automatically deducing bank information at link time, couldn't the identifier be automated by using that feature?

i.e. if the identifier is omitted, maybe the generated code which uses it could generate a #<.bank on the label instead?

#pragma wrapped-call (push, <name>)
...which should be backward compatible with the existing usage with explicit identifier, I think.

Would that be a sensible extension of this feature?

calima
Posts: 1156
Joined: Tue Oct 06, 2015 10:16 am

Re: cc65 now supports trampolines

Post by calima » Thu Dec 05, 2019 2:07 am

Yes, though I see one obstacle: for some projects, .bankbyte would make more sense than .bank (no need to add stuff to the linker config if the high bits are already usable for that). So if you add that feature, please include a toggle for it. An in-file directive, similar to .smart, in addition to a command-line argument would be preferable to a command-line argument alone.
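To illustrate the .bankbyte approach: ca65's .bankbyte yields bits 16-23 of an address, so a project that already links banked code at 24-bit addresses carries the bank number in the address itself. The helpers below are a hypothetical C rendering of that split, and the example address $038123 is made up.

```c
#include <stdint.h>

/* Bits 16-23 of a 24-bit address, mirroring what .bankbyte extracts. */
uint8_t bank_byte(uint32_t addr)
{
    return (uint8_t)((addr >> 16) & 0xFFu);
}

/* The low 16 bits are the address the CPU actually sees. */
uint16_t cpu_addr(uint32_t addr)
{
    return (uint16_t)(addr & 0xFFFFu);
}
```

With such a layout, a function linked at $038123 would be called at $8123 with bank 3 selected, and no separate bank attribute would be needed in the linker config.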
