
cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 4:42 am
by DRW
Inspired by Banshaku's recent threads, I did some analysis and found that the cc65 compiler is not very efficient when it comes to pointer access. If I'm not mistaken, this has nothing to do with the architecture and could easily be avoided.

When I have this simple code snippet:

Code:

extern unsigned char *pNumber;
#pragma zpsym("pNumber")

void __fastcall__ Test(void)
{
    *pNumber = 5;
}
then this is what the compiler turns it into:

Code:

	lda     _pNumber+1
	sta     ptr1+1
	lda     _pNumber
	sta     ptr1
	lda     #$05
	ldy     #$00
	sta     (ptr1),y
	rts
My own pointer is clearly declared as being located in the zeropage:

Code:

#pragma zpsym("pNumber")
--> .importzp	_pNumber
And yet, the compiler feels the need to always copy the pointer values to its own pointer instead of simply doing this:

Code:

	lda     #$05
	ldy     #$00
	sta     (_pNumber),y
	rts
Why is this the case at all? Is there any technical reason for it or is it simply an oversight by the programmer who created the parser?

Is there any way to get the compiler to change this behavior without adding inline Assembly manually?

I compiled with
cc65 -O Test.c
and the situation is the same in the old cc65 from cc65.org as well as the newer version from GitHub.


By the way, if you do more than one variable access, like this:

Code:

    *pNumber = 5;
    *pNumber = 6;
Guess what:

Code:

	lda     _pNumber+1
	sta     ptr1+1
	lda     _pNumber
	sta     ptr1
	lda     #$05
	ldy     #$00
	sta     (ptr1),y
	lda     _pNumber+1
	sta     ptr1+1
	lda     _pNumber
	sta     ptr1
	lda     #$06
	sta     (ptr1),y

Re: cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 4:53 am
by dougeff
I just avoid using pointers like this.

I only used pointers to access individual enemies: I passed the pointer as a parameter to a function that I wrote in assembly to draw that enemy's sprite.

I didn't write it until I needed to save cycles.

So, anything that takes too many cycles, I rewrote in assembly, as a fastcall function.

So basically, it translates to...

function(&enemy1);

	lda     #<_enemy1   ; low byte of the address goes in A
	ldx     #>_enemy1   ; high byte of the address goes in X
	jsr     _function

Re: cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 5:04 am
by DRW
dougeff wrote:I just avoid using pointers like this.
Well, sometimes you can't avoid using pointers.

(Of course, my current example of accessing a single number through a pointer would be nonsense in a real situation, but it was just a simple minimalistic example to demonstrate the concept.)

For example, my new game will have a whole bunch of enemies, so you cannot program each enemy's behavior individually.
Instead, I created a script-based function. It reads the first item from an array and depending on the contents, it reads the next values in a certain way.

For example:
If the current array value is "Move forward", then read the next value as "direction" and the value after that as "number of tiles".
If, instead, the current value is "Wait", read the next value as the number of frames to wait.

Etc.

Same with the level buildup function: Each screen is stored in an array of arbitrary size because each screen can have an arbitrary number of background objects, NPCs, enemies etc. So, I need a pointer to iterate through it until the pointer reads the screen end byte.
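The script reader described above can be sketched in plain C as follows. This is only a minimal illustration under assumptions: the opcode names (`CMD_END`, `CMD_MOVE_FORWARD`, `CMD_WAIT`), the `ScriptResult` struct and the function `run_script` are hypothetical and not taken from the actual game. It just shows why a pointer is the natural tool here: each command consumes a different number of bytes.

```c
/* Hypothetical opcodes for illustration only. */
enum { CMD_END, CMD_MOVE_FORWARD, CMD_WAIT };

/* Hypothetical summary of what the script asked for. */
typedef struct {
    unsigned char tiles_moved;     /* sum of all "number of tiles" arguments */
    unsigned char frames_waited;   /* sum of all "wait" arguments */
    unsigned char last_direction;  /* direction of the last move command */
} ScriptResult;

/* Walks the script until CMD_END (or an unknown opcode) is read. */
ScriptResult run_script(const unsigned char *p)
{
    ScriptResult r = { 0, 0, 0 };

    for (;;) {
        switch (*p++) {
        case CMD_MOVE_FORWARD:
            /* Two arguments follow: direction, then number of tiles. */
            r.last_direction = *p++;
            r.tiles_moved += *p++;
            break;
        case CMD_WAIT:
            /* One argument follows: number of frames to wait. */
            r.frames_waited += *p++;
            break;
        default:
            /* CMD_END or anything unrecognized stops the script. */
            return r;
        }
    }
}
```

A script would then be an array such as `{ CMD_MOVE_FORWARD, 2, 5, CMD_WAIT, 30, CMD_END }`, where 2 stands for some direction value and 5 for the tile count.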


How would you do these things without using pointers?

dougeff wrote:So, anything that takes too many cycles, I rewrote in assembly, as a fastcall function.
Well, yeah, writing directly in Assembly is always the best solution, but not wanting to do that is pretty much the reason why people use C to begin with.

And in the current situation, we're not even discussing anything that a C compiler cannot optimize because of the architecture.
At the moment, the question is simply: Why does the compiler always copy the pointer to its own pointer? Is there any reason for it? And can it be avoided (either by command line options or by a certain code style that we simply remember to always apply to C programs for the NES)?

Re: cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 5:29 am
by dougeff
Well, I used to write inline assembly just like na_th_an's example*, but I find it "ugly" to see C code with lots of assembly.

You could write a macro that inserts inline assembly to make it "pretty" and more C like.

edit
*example
https://github.com/mojontwins/MK1_NES/b ... enengine.h

Re: cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 6:11 am
by thefox
DRW wrote:Why is this the case at all? Is there any technical reason for it or is it simply an oversight by the programmer who created the parser?
I wouldn't expect any compiler to generate optimal code in all scenarios. If I was writing a code generator I, too, would definitely start by handling the general case (in this case, a pointer from anywhere in the memory space), and only then start thinking about case-specific optimizations like this.

(By the way, no compiler would be doing optimizations like this in the parsing phase. Parsing simply checks the input against the grammar of the language.)

Re: cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 10:18 am
by calima
The compiler lacks optimizations for this case. Nothing you can do, except write a patch.

Re: cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 1:12 pm
by na_th_an
I avoid using pointers in cc65 as well, as I know they tend to behave worse than arrays. Sometimes you have to, as pointed out. But it's funny how you're better off using array access where possible when targeting the 6502 via cc65, but better off using pointer-based access where possible when targeting the Z80 via z88dk or SDCC. Sometimes porting is a nightmare because of this :-D
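To make the array-versus-pointer distinction concrete, here is a small sketch of the same fill loop written both ways. The names `BUF_SIZE`, `buffer_a`, `buffer_b`, `fill_indexed` and `fill_through_pointer` are made up for illustration. The comments about addressing modes describe what the 6502 offers for each pattern; I have not checked what cc65 actually emits for this exact code, so treat them as an expectation rather than a measurement.

```c
#define BUF_SIZE 8

unsigned char buffer_a[BUF_SIZE];
unsigned char buffer_b[BUF_SIZE];

/* Array style: fixed base address plus an 8-bit index. On the 6502 this
   pattern fits absolute indexed addressing (e.g. STA buffer_a,X). */
void fill_indexed(unsigned char value)
{
    unsigned char i;
    for (i = 0; i < BUF_SIZE; ++i) {
        buffer_a[i] = value;
    }
}

/* Pointer style: the address is only known at run time, so the 6502 has to
   go through a zero-page pointer with (zp),Y addressing. */
void fill_through_pointer(unsigned char *dest, unsigned char value)
{
    unsigned char i;
    for (i = 0; i < BUF_SIZE; ++i) {
        *dest = value;
        ++dest;
    }
}
```

Both functions produce the same buffer contents; the difference is only in how the target address is formed.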

Re: cc65: Unnecessary code when accessing pointers

Posted: Sun Jun 24, 2018 6:15 pm
by Banshaku
@DRW

I checked the code for the array of structures. Saving the reference was not so bad, BUT accessing the data referenced by the pointer (2 arrays) causes the compiler to move the data into ptr1 again, even though it already had that information from the previous statement.

I guess even though it looked "nicer" code-wise at first, I will avoid that pattern after all. I do not really need the array of structures; it just looked better to me.

Re: cc65: Unnecessary code when accessing pointers

Posted: Mon Jun 25, 2018 12:39 am
by DRW
Yeah, looks like every pointer access of any kind does that.

Unfortunately, I still need pointers if a character has a certain movement pattern that is stored in an array.

I wrote some macros for this kind of stuff now, like this:

Code:

#define AsmSetVariableFromZpArrayPointer(variable, zpArrayPointer, index)\
{\
   __asm__("LDY %v", index);\
   __asm__("LDA (%v), Y", zpArrayPointer);\
   __asm__("STA %v", variable);\
}

Re: cc65: Unnecessary code when accessing pointers

Posted: Tue Jun 26, 2018 9:24 am
by gauauu
Just to chime in because I had this same issue with my project: yes, cc65 generates terrible code for pointers. Anything using pointers in a loop will probably need to be written in assembly.

In Robo Ninja Climb, I had a simple loop with some pointers that literally used 80% of a frame with cc65's version. Rewriting in assembly with a tiny bit of optimization dropped it to less than 5% of my frame.

Re: cc65: Unnecessary code when accessing pointers

Posted: Wed Jun 27, 2018 3:49 pm
by DRW
Here's another strange cc65 behavior:

This:

Code:

dest = (src + 3) >> 2;
gets turned into this:

Code:

	ldx     #$00
	lda     _src
	jsr     incax3
	jsr     shrax2
	sta     _dest
Why doesn't the compiler simply use LSR?
It creates perfectly fine code when you turn the shift operator around:

Code:

	lda     _src
	clc
	adc     #$03
	asl     a
	asl     a
	sta     _dest
And if you use the right shift operator, but remove the + 3, then it's fine as well:

Code:

	lda     _src
	lsr     a
	lsr     a
	sta     _dest

Re: cc65: Unnecessary code when accessing pointers

Posted: Wed Jun 27, 2018 3:57 pm
by rainwarrior
That's actually correct. The temporary result src + 3 is implicitly a 16-bit int. The high bits of the result can matter when you shift them down, but they won't matter when you shift them up. Think of (255+3)>>2.

How does it deal with:

Code:

dest = (unsigned char)(src + 3) >> 2;
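The difference between the two expressions can be checked in standard C, independent of cc65. The sketch below uses two hypothetical helper functions, `shift_promoted` and `shift_truncated`, to compare the original expression with the cast version. It says nothing about which code cc65 generates for the cast; it only shows that the two expressions give different results once src + 3 exceeds 255.

```c
/* Original expression: src + 3 is computed as an int, so the ninth bit
   of the sum is still there when the shift happens. */
unsigned char shift_promoted(unsigned char src)
{
    return (unsigned char)((src + 3) >> 2);
}

/* Cast version: the sum is truncated to 8 bits before the shift,
   so any carry out of bit 7 is thrown away. */
unsigned char shift_truncated(unsigned char src)
{
    return (unsigned char)((unsigned char)(src + 3) >> 2);
}
```

For src = 255, the promoted version shifts 258 and yields 64, while the truncated version shifts 2 and yields 0. For small inputs such as src = 1, both yield 1.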

Re: cc65: Unnecessary code when accessing pointers

Posted: Wed Jun 27, 2018 3:59 pm
by tepples
Is this also incorrect?

Code:

clc
lda src
adc #3  ; C:A ranges from 3 to 258
ror a
lsr a
sta dest

Re: cc65: Unnecessary code when accessing pointers

Posted: Wed Jun 27, 2018 4:01 pm
by rainwarrior
tepples wrote:Is this also incorrect?
No, that's fine, but that's a whole new class of optimization that you've ordered here. (Something about keeping track of not just 8 and 16 bit results, but 9 bit as well...)
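The 9-bit idea can be modeled in C to see why the carry-based sequence is valid. This is a sketch, not compiler code: `add3_shr2_via_carry` is a hypothetical function that imitates CLC / LDA / ADC #3 / ROR A / LSR A step by step, and `add3_shr2_reference` is the plain C expression it is compared against.

```c
/* Step-by-step model of: CLC / LDA src / ADC #3 / ROR A / LSR A */
unsigned char add3_shr2_via_carry(unsigned char src)
{
    unsigned int sum = (unsigned int)src + 3u;          /* 9-bit result C:A */
    unsigned char carry = (unsigned char)(sum >> 8);    /* carry flag after ADC */
    unsigned char a = (unsigned char)(sum & 0xFFu);     /* accumulator after ADC */

    a = (unsigned char)((carry << 7) | (a >> 1));       /* ROR A: carry enters bit 7 */
    a = (unsigned char)(a >> 1);                        /* LSR A: zero enters bit 7 */
    return a;
}

/* What the C expression (src + 3) >> 2 means after integer promotion. */
unsigned char add3_shr2_reference(unsigned char src)
{
    return (unsigned char)((src + 3) >> 2);
}
```

Since the sum never needs more than nine bits, one ROR that pulls the carry back in followed by one LSR gives the same result as the 16-bit shift for every possible value of src.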

Re: cc65: Unnecessary code when accessing pointers

Posted: Wed Jun 27, 2018 4:04 pm
by DRW
Is there any way I can force the compiler to treat this as a byte?