cc65: Array access with int index even if it should be abyte

Discuss technical or other issues relating to programming the Nintendo Entertainment System, Famicom, or compatible systems. See the NESdev wiki for more information.

Moderator: Moderators

User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

cc65: Array access with int index even if it should be abyte

Post by DRW »

I'm tying to access an array with a calculated index.

Variables:

Code: Select all

unsigned char x, y, index, value;
unsigned char array[100];
This code of course transforms the index into an integer and therefore accesses the array in a complicated way, by copying it to ptr1, using ADC to calculate the index and all that stuff:

Code: Select all

void Test1(void)
{
    array[y * 16 + x] = value;
}
This one makes the array access simple since the index is just a byte:

Code: Select all

void Test2(void)
{
    index = y * 16 + x;
    array[index] = value;
}
-->

Code: Select all

	ldy     _index
	lda     _value
	sta     _array,y
However, these two make it complicated again, even though the compiler should know that the index is a single byte:

Code: Select all

void Test3(void)
{
    array[(unsigned char)(y * 16 + x)] = value;
}

void Test4(void)
{
    array[index = y * 16 + x] = value;
}
Is there a way to get the compiler to access the array in the simple way with a byte index, without having to put the index into a variable first?
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: cc65: Array access with int index even if it should be a

Post by rainwarrior »

Hmm, I did something like that in NESert golfing. The sprite_add function casted index expressions like that to access the oam buffer... I never checked its generated code, but am disappointed to notice now that it promoted them all to 16-bit, even the +0!

It seems that if the expression is anything more complicated than a single term it fails to reduce it from the 16-bit pointer access.

I dunno, the ability of this compiler to optimize stuff like this is pretty poor. I suspect the cases where it does recognize it can use an absolute index are due to an optimizer pass that looks for very specific patterns, and the addition of extra terms breaks up that pattern. Maybe more patterns could be recognized if you wanted to research that and add code to the compiler to address them, but in general it's not a very versatile system.


Trying to do this with C syntax mangling might be squeezing blood from a stone... usually I would just ignore the problem until it comes up as a real performance issue (i.e. why I didn't notice sprite_add wasn't more efficient: it was still efficient enough for that game), and then when it does I rewrite the affected thing in assembly.

Storing the computed index into a char variable that has #pragma zpsym is maybe about as reasonable a workaround as you'll get if your solution is restricted to C-side only.
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: cc65: Array access with int index even if it should be a

Post by rainwarrior »

Looking at that example I linked, I found:

Code: Select all

// original: generates 16-bit indirect pointer for each store
void sprite_add(uint8 tile, uint8 x, uint8 y, uint8 attrib)
{
	oam[(uint8)(oam_pos+2)] = attrib;
	oam[(uint8)(oam_pos+0)] = (uint8)(y-1);
	oam[(uint8)(oam_pos+3)] = x;
	oam[(uint8)(oam_pos+1)] = tile;
	oam_pos += 4;
}

// better: gets an absolute indexed store for 3 of them, at least
void sprite_add(uint8 tile, uint8 x, uint8 y, uint8 attrib)
{
	i = oam_pos; // i is uint8 with #pragma zpsym
	oam[i] = (uint8)(y-1); ++i; // this one still fails to optimize because the y-1 expression confuses the pattern
	oam[i] = tile; ++i;
	oam[i] = attrib; ++i;
	oam[i] = x;
	oam_pos = i+1;
}
Ultimately though, I think the second solution is kinda worthless. It would have taken much less time to write this function in assembly than trying to deduce how to mangle my C to fit the compiler's optimization patterns, and it would perform better too.

As a shortcut I could even just copy and paste the generated assembly for the bad function, so I wouldn't have to look up how to fetch the parameters off the stack, etc. and just fix the actual stores.
User avatar
slembcke
Posts: 172
Joined: Fri Nov 24, 2017 2:40 pm
Location: Minnesota

Re: cc65: Array access with int index even if it should be a

Post by slembcke »

For the OPs question, I've never found a particularly better way to do it either. I have a global zero page variable 'idx' that I use exclusively as a temporary value for indexes. You can use it with the comma operator to make it a little less obtuse, if a little more obscure. I like doing it this way because I can't forget that it's not really a local variable if I call another function that uses it.

Code: Select all

some_array[idx = expression, idx] = whatever;
You still get a wasted zero page load/store, but I think that's the best you are going to get.

@rainwarrior You can optimize your function in C the same way you do it in assembly with an absolute address + an offset. I use the following pattern all the time. It produces exactly the assembly you would expect.

Code: Select all

// better: gets an absolute indexed store for 3 of them, at least
void sprite_add(uint8 tile, uint8 x, uint8 y, uint8 attrib)
{
   i = oam_pos; // i is uint8 with #pragma zpsym
   (oam + 0)[i] = y-1;
   (oam + 1)[i] = tile;
   (oam + 2)[i] = attrib;
   (oam + 3)[i] = x;
   oam_pos = i+4;
}
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: cc65: Array access with int index even if it should be a

Post by rainwarrior »

Ah! Hadn't thought of that one. That's almost how it would look in assemble, heh.
User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

Re: cc65: Array access with int index even if it should be a

Post by DRW »

slembcke wrote:

Code: Select all

some_array[idx = expression, idx] = whatever;
That's actually a pretty creative use. Thanks for the suggestion.
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
User avatar
dougeff
Posts: 3079
Joined: Fri May 08, 2015 7:17 pm

Re: cc65: Array access with int index even if it should be a

Post by dougeff »

If optimization is important (and it always is) I would write all this in assembly.

Perhaps an inline assembly macro.

I've seen na_th_an do something like this.
nesdoug.com -- blog/tutorial on programming for the NES
User avatar
slembcke
Posts: 172
Joined: Fri Nov 24, 2017 2:40 pm
Location: Minnesota

Re: cc65: Array access with int index even if it should be a

Post by slembcke »

Related tip:
you can use the pseudo variables __A__ and __AX__ to make shuffling data in and out of inline assembly much easier. Quite handy when you pair it with (abuse?) the comma operator.

Dumb example:

Code: Select all

#define DUMB_MACRO(value) (__A__ = value, asm("xor #$0F"), __A__)
my_variable = DUMB_MACRO(45);
Using those pseudo variables make the compiler select the correct way to load the value you wanted into the register instead of needing to do it with those % format chars. Then at the end of the sequence, you tell the compiler the result is already in registers and handle the store too.
calima
Posts: 1745
Joined: Tue Oct 06, 2015 10:16 am

Re: cc65: Array access with int index even if it should be a

Post by calima »

Is xor a valid alias to eor?
User avatar
dougeff
Posts: 3079
Joined: Fri May 08, 2015 7:17 pm

Re: cc65: Array access with int index even if it should be a

Post by dougeff »

no.

But ca65 has a .XOR operator (for calculations at assemble time of constants)
nesdoug.com -- blog/tutorial on programming for the NES
User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

Re: cc65: Array access with int index even if it should be a

Post by DRW »

I didn't know about __A__. This might come in handy.


By the way, I noticed another thing:
Not only does a calculated array index produce bad code. Even setting a calculated value to the array isn't really effective, even if the array is a byte type.


Stuff like this is fine (we assume all variables are global or static):
array[index] = value;

But this isn't:
array[index] = value + 1;

And this time, the comma operator won't help you:
array[index] = (temp = value + 1, temp);

You have to do this for nice short code:
temp = value + 1;
array[index] = temp;


However, the code is perfectly fine when you use a constant array index since it acts like a regular non-array variable. And for some reason, regular non-array variables don't make such a fuss:
array[5] = value + 1;

Using __A__ doesn't help either:
__A__ = value + 1;
array[index] = __A__;


And type conversion are unhelpfull as well:
array[index] = (unsigned char)(value + 1);
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
User avatar
gauauu
Posts: 779
Joined: Sat Jan 09, 2016 9:21 pm
Location: Central Illinois, USA
Contact:

Re: cc65: Array access with int index even if it should be a

Post by gauauu »

Ugh. I think the important lesson here seems to be: cc65 isn't very efficient. If performance matters in the section of code you're working on, assembly is the answer.
User avatar
slembcke
Posts: 172
Joined: Fri Nov 24, 2017 2:40 pm
Location: Minnesota

Re: cc65: Array access with int index even if it should be a

Post by slembcke »

calima wrote:Is xor a valid alias to eor?
Uh... probably not. That's just the sort of shoddy off the top of my head examples I'm known for. LOL
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: cc65: Array access with int index even if it should be a

Post by rainwarrior »

DRW wrote:But this isn't:
array[index] = value + 1;
So, something I was saying above, cc65's optimizations don't occur at a high level. It only has one way to generate the base assembly code, and then only after it's already in assembly form it matches patterns in the assembly code to reduce them.

Basically what you're looking at is that the code generated by assigning a value to an array element generates a known pattern. As soon as you put some extra operations on the value to be assigned, more code ends up inside that doesn't match the pattern. The problem is the order in which things are done, I guess. It creates the pointer to the array, constructs the expression, then assigns it. When the code in the middle of this gets more complex, it can't match the pattern anymore.

You can use --debug-opt-output to observe the process if you're curious.

So... if you expect cc65 to ever be able to handle this optimization step, it probably either requires a whole new high-level optimization system (big overhaul) or a bunch of pattern matches for every kind of intermediate expression you can think of (very cumbersome). I feel like the latter isn't worth doing, and the former requires someone with a lot of interest.
User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

Re: cc65: Array access with int index even if it should be a

Post by DRW »

gauauu wrote:Ugh. I think the important lesson here seems to be: cc65 isn't very efficient. If performance matters in the section of code you're working on, assembly is the answer.
Well, performance matters everywhere in the actual action gameplay. But I still wouldn't want to write all this in pure Assembly. That's why I use inline Assembly when necessary and C for the rest. (And a bunch of low level pure Assembly code files, mostly for general purpose functions or NES-hardware-related things.)

However, arrays and pointers are really the only stuff that I noticed that are turned into unnecessarily complicated code.
Pointers are always copied to ptr1, even if your pointer is in the zeropage itself.
And arrays, as we see now, produce bad code whenever either the index or the assigned value isn't an exact byte value, but something that gets promoted to int due to calculations and stuff.

But other than arrays and pointers, I have yet to see a situation where the generated Assembly code from C is so bad that I'd have to use Assembly to cut down the required ROM size and CPU time by a huge amount.
rainwarrior wrote:Basically what you're looking at is that the code generated by assigning a value to an array element generates a known pattern. As soon as you put some extra operations on the value to be assigned, more code ends up inside that doesn't match the pattern.
I'm just a bit confused why even the assigned value makes a difference in calculating the array itself.

The array index, o.k., this one makes sense: Having a byte index or an int index makes a huge difference in the processing. And calculations always get promoted to int, so yeah, I understand why array[byteIndex + 1] has an issue over array[byteIndex].

But I just don't understand why the assigned value makes a difference on how the array's offset is calculated.

Well, looks like I have to create a macro:

Code: Select all

/* Only use this if the calculated value is actually calculated at runtime.
   Example: array[index] = variable + 1;
   If you assign an unaltered variable or a constant value known at compile time:
   array[index] = variable;
   array[index] = 1;
   you can access the array directly.
   This macro is only there because the generated code gets large otherwise. */
#define SetArrayValueFromCalculatedValue(array, index, calculatedValue, tempVariable)\
{\
    (tempVariable) = (calculatedValue);\
    (array)[index] = (tempVariable);\
}
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
Post Reply