VBCC Optimizing C-compiler now supports NES

rox_midge
Posts: 89
Joined: Mon Sep 19, 2005 11:51 am

Re: VBCC Optimizing C-compiler now supports NES

Post by rox_midge » Wed Aug 05, 2020 5:56 pm

Does sprintf (or any other function needing vargs) not work at the moment / not work at all? It seems to copy the format string if there are no placeholders and no vargs supplied, but otherwise appears to do nothing at all.

I really don't even need sprintf, I just need a quick way to convert an int / unsigned char to a string - if there's another way (itoa, etc.), I'm glad to skip sprintf entirely.

vbc
Posts: 52
Joined: Sun Jun 21, 2020 5:03 pm

Post by vbc » Wed Aug 05, 2020 11:29 pm

rox_midge wrote:
Wed Aug 05, 2020 5:56 pm
Does sprintf (or any other function needing vargs) not work at the moment / not work at all? It seems to copy the format string if there are no placeholders and no vargs supplied, but otherwise appears to do nothing at all.

I really don't even need sprintf, I just need a quick way to convert an int / unsigned char to a string - if there's another way (itoa, etc.), I'm glad to skip sprintf entirely.
It should work just fine. I just verified this example:

Code:

#include <stdio.h>

char buf[100];

int main(void)
{
    sprintf(buf, "test %d %c %s", 123, 'X', "test");
    puts(buf);
    return 0;
}
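
For the narrower goal of turning an int or unsigned char into a string without sprintf, something along these lines might be enough. This is only a sketch: uint_to_dec is a made-up name, not a function from the vbcc library, and the caller is assumed to supply a buffer large enough for all digits plus the terminating NUL (6 bytes when unsigned int is 16 bits wide).

```c
/* Hypothetical helper, not part of any library: writes the decimal
   form of v into out and returns out. The caller must provide room
   for every digit plus the terminating NUL. */
char *uint_to_dec(unsigned int v, char *out)
{
    char tmp[3 * sizeof(unsigned int)]; /* enough digits for 16- or 32-bit */
    unsigned char n = 0;
    char *p = out;

    /* Produce the digits least significant first. */
    do {
        tmp[n++] = (char)('0' + v % 10u);
        v /= 10u;
    } while (v != 0u);

    /* Copy them back in reading order. */
    while (n != 0) {
        *p++ = tmp[--n];
    }
    *p = '\0';
    return out;
}
```

An unsigned char argument is converted to unsigned int automatically, so the same routine covers both cases.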

timschuerewegen
Posts: 33
Joined: Wed Dec 04, 2019 10:42 am

Post by timschuerewegen » Sat Aug 08, 2020 2:13 am

Attachment: lazyhello_chars_1_byte.zip (1.06 KiB)

Code:

...
vasm vobj output module 0.9b (c) 2002-2018 Volker Barthelmann

chars(acrwx1):             1 byte
...
It should be 65537 bytes. Where did the other 64K go? :)

Also in config\nrom256v there is

Code:

...
-asv=vasm6502_oldstyle -nowarn=62 -Fvobj -opt-branch -v %s -o %s
...
but -v is not a valid option.

EDIT: and "vlink -v" doesn't actually link, it only prints some information

Code:

...
-ldv=vlink -v -b rawbin1 -Cvbcc -T%%VBCC%%/targets/6502-nes/nrom256v.cmd -L%%VBCC%%/targets/6502-nes/lib %%VBCC%%/targets/6502-nes/lib/startup.o %s %s  -o %s -lvc
...
EDIT: FYI. I'm trying to make vlink output an 8MB .nes file with large binary data at specific offsets.
Last edited by timschuerewegen on Sun Aug 09, 2020 11:07 am, edited 1 time in total.

rox_midge
Posts: 89
Joined: Mon Sep 19, 2005 11:51 am

Post by rox_midge » Sat Aug 08, 2020 7:13 pm

vbc wrote:
Wed Aug 05, 2020 11:29 pm
rox_midge wrote:
Wed Aug 05, 2020 5:56 pm
Does sprintf (or any other function needing vargs) not work at the moment / not work at all? It seems to copy the format string if there are no placeholders and no vargs supplied, but otherwise appears to do nothing at all.
It should work just fine. I just verified this example:
I got it working now, thanks. In my case it seems like I wasn't copying the initializers over, which is fine for all of my code (everything happens to be initialized to zero) but seemed to be causing an issue with sprintf somehow. I reverted to the original configuration file and that corrected the issue.

Banshaku
Posts: 2393
Joined: Tue Jun 24, 2008 8:38 pm
Location: Japan

Post by Banshaku » Sun Aug 09, 2020 2:23 am

rox_midge wrote:
Sat Aug 08, 2020 7:13 pm
I got it working now, thanks. In my case it seems like I wasn't copying the initializers over, which is fine for all of my code (everything happens to be initialized to zero) but seemed to be causing an issue with sprintf somehow. I reverted to the original configuration file and that corrected the issue.
If you used one of my config files/makefiles together with my startup file, then yes, that could affect the C runtime. The goal of that setup is to remove all init code for the runtime, to reduce the footprint of the C runtime as much as possible, so you cannot really use the clib in that case. To use it, the original startup file is necessary.
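
As a rough illustration of why skipping the initializer copy matters: startup code for a ROM-based C program typically copies the initial values of the data section from ROM into RAM and clears the zero-initialized variables before main runs. The sketch below models those two steps on plain arrays. The function and parameter names are hypothetical, and this is not the actual vbcc startup code.

```c
#include <stddef.h>
#include <string.h>

/* Models what a ROM-based C startup usually does before calling main().
   All names here are illustrative, not symbols from vbcc. */
void init_ram(unsigned char *data_ram, const unsigned char *data_rom,
              size_t data_len, unsigned char *bss, size_t bss_len)
{
    /* Variables with non-zero initializers: copy their values from ROM. */
    memcpy(data_ram, data_rom, data_len);
    /* Variables without initializers: clear them to zero. */
    memset(bss, 0, bss_len);
}
```

Code whose variables all start at zero never depends on the first step, whereas a library routine that relies on initialized data would.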

Memblers
Site Admin
Posts: 3877
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis

Post by Memblers » Sun Aug 09, 2020 6:23 am

I see a couple missing zeropage optimizations in lazynes. You can see variables at $02 and $04 are used with absolute addressing. I've only looked at lnAddSpr, I don't know if it happens elsewhere.
lazynes_lnAddSpr.png
compiled with vc +nrom256v -+ -O3
Also looks like that in the bubbles demo.

Lazycow
Posts: 105
Joined: Tue Jun 11, 2013 1:04 pm
Location: Germany

Post by Lazycow » Sun Aug 09, 2020 10:40 am

Memblers wrote:
Sun Aug 09, 2020 6:23 am
I see a couple missing zeropage optimizations in lazynes.
Oops, you're right - I forgot to add the line "zpage r0,r1,r2,r3,r4,r5" for the register parameters. (lazyNES is written in assembler) That will reduce the size of the liblazynes.a file by 16 bytes. (fix should be included in lazyNES V1.0.2)

Will it increase the performance of the bubbles demo? Drumroll... No, still 38 bubbles... 8-)
Memblers wrote:
Mon Jul 06, 2020 3:00 am
Lazycow, can you show us a code size comparison of the bubbles demo?
Sorry for the delay. I think the bubbles demo isn't big enough to give representative numbers. Anyway, here are the numbers for the latest version of vbcc6502 (I counted the bytes in the main ROM, excluding CHR):
- cc65 -Oris -Cl: 2725 bytes (14 bubbles)
- vbcc -O3: 3247 bytes (37 bubbles)
- vbcc -O2 -size: 2542 bytes (31 bubbles)
- vbcc -O2 -size -Dmain=__main: 2366 bytes (31 bubbles; the -D option selects small startup code)

For some reason, -O still creates faster code for the bubbles demo than -O3:
- vbcc -O: 2585 bytes (38 bubbles)

But only in the NES version. When I'm compiling the C64 version of the bubbles demo, -O3 generates the fastest code. That should be a reminder that the bubbles demo is only an improvised benchmark. The numbers for the C64 version might be interesting as well:
- C64: cc65 -Oris -Cl: 5756 bytes (7 bubbles)
- C64: vbcc6502 -O3: 6665 bytes (20 bubbles)
- C64: vbcc6502 -O2 -size: 5199 bytes (17 bubbles)

vbc
Posts: 52
Joined: Sun Jun 21, 2020 5:03 pm

Post by vbc » Sun Aug 09, 2020 2:00 pm

timschuerewegen wrote:
Wed Aug 05, 2020 2:45 am
Found another bug :)

Code:

#include <lazynes.h>

#define IO8(addr) (*(volatile ubyte *)(addr))

void test(const ubyte *data)
{
	ubyte x = 0;
	IO8(0x484F) = 0xAA;
	IO8(0x4830+x) = data[0];
	IO8(0x4831+x) = data[1];
	IO8(0x4832+x) = data[2];
	IO8(0x484F) = 0xBB;
}

int main()
{
	static const ubyte DATA[] = { 1, 2, 3 };
	test(DATA);
	return 0;
}
vc +nrom256v -+ -O3 main.c -o main.nes
=> not ok
W 484F AA BB

vc +nrom256v -+ -O2 main.c -o main.nes
=> ok
W 484F AA
W 4830 01
W 4831 02
W 4832 03
W 484F BB
A typo in the const-memcpy inlining optimization. Should be fixed in the new update I have uploaded.

Thanks for the report.

vbc
Posts: 52
Joined: Sun Jun 21, 2020 5:03 pm

Post by vbc » Sun Aug 09, 2020 2:29 pm

timschuerewegen wrote:
Sat Aug 08, 2020 2:13 am
lazyhello_chars_1_byte.zip

Code:

...
vasm vobj output module 0.9b (c) 2002-2018 Volker Barthelmann

chars(acrwx1):             1 byte
...
It should be 65537 bytes. Where did the other 64K go? :)
The target address size for the 6502 is set to 16 bits, so the numbers are output as 16-bit values. If you use sections larger than 64KB, you will likely run into problems with vasm6502, as label values are always 16-bit.
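
The reported "1 byte" is what remains of 65537 once it is reduced to 16 bits. A small sketch of that arithmetic (purely illustrative; size_as_16bit is a hypothetical name and this is not code from vasm):

```c
#include <stdint.h>

/* Returns the value a size would show if only its low 16 bits are kept. */
uint16_t size_as_16bit(uint32_t size)
{
    return (uint16_t)(size & 0xFFFFu);
}
```

With this, 65537 (0x10001) comes out as 1 and 65536 (0x10000) as 0, which is why the upper 64K appears to vanish from the listing.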
Also in config\nrom256v there is

Code:

...
-asv=vasm6502_oldstyle -nowarn=62 -Fvobj -opt-branch -v %s -o %s
...
but -v is not a valid option.

EDIT: and "vlink -v" doesn't actually link, it only prints some information
Of course. It seems it has been a long time since I last used -vv...
I have fixed the config files in the new update.
EDIT: FYI. I'm trying to make vlink output an 8MB .nes file with large binary data at specific offsets.
"Large binary data" meaning contiguous blocks >64KB? You probably will run into trouble if you are creating those blocks with vasm6502. If you do not want to split the data into smaller sections, it is probably better to create the object for the binary data with a 32bit vasm like vasmx86. vlink can link such an object file together with 6502 objects.

timschuerewegen
Posts: 33
Joined: Wed Dec 04, 2019 10:42 am

Post by timschuerewegen » Sun Aug 09, 2020 2:44 pm

vbc wrote:
Sun Aug 09, 2020 2:00 pm
A typo in the const-memcpy inlining optimization. Should be fixed in the new update I have uploaded.
Thanks. Seems to work.
vbc wrote:
Sun Aug 09, 2020 2:29 pm
"Large binary data" meaning contiguous blocks >64KB? You probably will run into trouble if you are creating those blocks with vasm6502. If you do not want to split the data into smaller sections, it is probably better to create the object for the binary data with a 32bit vasm like vasmx86. vlink can link such an object file together with 6502 objects.
Thanks for the tip. I will probably use another method for building the 8MB .nes file.
vbc wrote:
Mon Aug 03, 2020 2:58 pm
Local labels in vasm only work between other labels. As the code is made out of several separate __asm statements, some non-local labels will be inserted by the compiler (without optimization), causing problems with local labels. You can use a single __asm statement to prevent this, e.g.:

Code:

        __asm(" ldx #$10\n"
              ".1:\n"
              " dex\n"
              " bne .1");
If I don't use "__noinline" on a function that contains a single "__asm" statement with a label, and that is being called multiple times from another function, and compile it with -O3 then the asm code gets inlined and there end up being multiple ".1" labels, causing the following errors. I know I can avoid it by using "__noinline" but I just wanted to mention this in case this is a bug and not a feature :)

Code:

error 75 in line 4008 of "C:\Users\Tim\AppData\Local\Temp\vbcc068d.asm": label < l824 1> redefined
>.1:

vbc
Posts: 52
Joined: Sun Jun 21, 2020 5:03 pm

Post by vbc » Sun Aug 09, 2020 5:12 pm

timschuerewegen wrote:
Sun Aug 09, 2020 2:44 pm
If I don't use "__noinline" on a function that contains a single "__asm" statement with a label, and that is being called multiple times from another function, and compile it with -O3 then the asm code gets inlined and there end up being multiple ".1" labels, causing the following errors. I know I can avoid it by using "__noinline" but I just wanted to mention this in case this is a bug and not a feature :)
It's a feature. :-)

vbcc can inline or unroll code with inline assembly. You can use the inline/einline directives of vasm to make the labels reusable. See string.h for examples.
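
Applied to the earlier delay loop, that might look like the sketch below. It assumes the inline/einline directives are accepted inside an __asm string like any other assembler line, and that __VBCC__ is the predefined macro identifying the compiler. short_delay and wait_twice are made-up names, and the plain C branch is only there so the snippet also builds with other compilers.

```c
/* Hypothetical helper: counts a register (or variable) down from 16 to 0. */
void short_delay(void)
{
#ifdef __VBCC__
    /* inline/einline isolate the local label .1, so the block can be
       duplicated by inlining or unrolling without label clashes. */
    __asm(" inline\n"
          " ldx #$10\n"
          ".1:\n"
          " dex\n"
          " bne .1\n"
          " einline");
#else
    /* Portable stand-in with the same iteration count. */
    volatile unsigned char x = 0x10;
    while (--x != 0) {
    }
#endif
}

/* Two calls from one function: the situation that produced the
   "label redefined" error once the callee was inlined. */
void wait_twice(void)
{
    short_delay();
    short_delay();
}
```

Whether this is preferable to __noinline probably depends on how much the call overhead matters for the routine in question.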

Memblers
Site Admin
Posts: 3877
Joined: Mon Sep 20, 2004 6:04 am
Location: Indianapolis

Post by Memblers » Mon Aug 10, 2020 5:07 pm

I made a quick demo to see how many sprites I could get bouncing on the screen. The answer is 60, almost all of them. It uses lazydata.s included with the lazynes demos.

All the objects use 8.8 fixed point for position and velocity. They only bounce vertically; horizontal bounce is in there but commented out, and enabling it reduces the count to 56.
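
For anyone unfamiliar with the 8.8 format: each value is a 16-bit number whose high byte is the whole part (pixels) and whose low byte is the fraction (1/256 of a pixel). A minimal sketch of one vertical update step follows. It is not taken from balls.c, and the type, constants, and function names are invented for illustration.

```c
#include <stdint.h>

/* Illustrative constants, not values from balls.c. */
#define FLOOR_Y ((uint16_t)(200u << 8)) /* pixel row 200 in 8.8 */
#define GRAVITY 32                      /* 32/256 = 0.125 pixel per frame */

typedef struct {
    uint16_t y;  /* position, 8.8 fixed point */
    int16_t  vy; /* velocity, 8.8 fixed point, signed */
} Ball;

/* Advance one frame: accelerate, move, and bounce off the floor. */
void ball_step(Ball *b)
{
    b->vy = (int16_t)(b->vy + GRAVITY);
    b->y  = (uint16_t)(b->y + b->vy);
    if (b->y >= FLOOR_Y) {
        b->y  = FLOOR_Y;          /* clamp to the floor */
        b->vy = (int16_t)-b->vy;  /* reverse direction */
    }
}

/* The sprite coordinate is simply the high byte. */
uint8_t ball_pixel_y(const Ball *b)
{
    return (uint8_t)(b->y >> 8);
}
```

The sketch does not guard against the position wrapping past the top of the range; a real routine would need its own bounds handling.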

vc +nrom256v -+ -O3 balls.c lazydata.s -llazynes -o balls.nes

-O4, -O3, -O2 - 60 objects
-O1 - 48 objects
-O0 - 33 objects

I tried porting it to cc65/neslib. I couldn't get neslib to compile, but was able to build it inside the 8bitworkshop IDE. There's still a bug preventing sprites and BG from appearing, but I was able to measure the CPU usage, and it allowed only 12 objects. I don't know what compiler flags 8bitworkshop uses though, or how to change them.

cc65 (unknown settings) - 12 objects

timschuerewegen - I stole that IO8 macro, hope you don't mind. That helps, the stuff I was trying before kept turning into an STA (),y write.
Attachments: balls.c (3.28 KiB), balls.nes (40.02 KiB)

Lazycow
Posts: 105
Joined: Tue Jun 11, 2013 1:04 pm
Location: Germany

Post by Lazycow » Tue Aug 11, 2020 10:28 am

Nice demo using the lazynes lib! :D

I recompiled it, using the cc65 version of the lazynes lib and got 18 balls. (the rand() function was missing for cc65, added the rand() from shiru's neslib)
balls-cc65.zip (cartridge binary and source)

Here's the cc65 version of the lazynes lib. (BETA) I only ported it to cc65 for the bubbles demo. But ok, maybe it's useful for something...
lazynes-cc65beta.tip (BETA, not all demos included)

timschuerewegen
Posts: 33
Joined: Wed Dec 04, 2019 10:42 am

Post by timschuerewegen » Wed Aug 12, 2020 10:54 am

Code:

const unsigned char DATA1[] =
{
	0x12, 0x34
};

const unsigned char *DATA2[] =
{
	DATA1, DATA1, DATA1, DATA1
};
compiles to

Code:

	...
	global	_DATA1
	section	rodata
_DATA1:
	byte	18
	byte	52
	global	_DATA2
	section	data
_DATA2:
	word	_DATA1
	word	_DATA1
	word	_DATA1
	word	_DATA1
	...
DATA1 => section rodata (ok)
DATA2 => section data (not ok, should also be rodata, no?)

vbc
Posts: 52
Joined: Sun Jun 21, 2020 5:03 pm

Post by vbc » Wed Aug 12, 2020 1:07 pm

timschuerewegen wrote:
Wed Aug 12, 2020 10:54 am
DATA1 => section rodata (ok)
DATA2 => section data (not ok, should also be rodata, no?)
No, DATA2 is defined as an array of pointers to constant char, not an array of constant pointers to char. This should work:

Code:

const unsigned char *const DATA2[] =
{
	DATA1, DATA1, DATA1, DATA1
};
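
The difference can be seen in what the language lets you do with each table. In the sketch below, OTHER, TABLE_RW, TABLE_RO, and retarget are names made up for the illustration; only the two declaration forms come from the posts above.

```c
const unsigned char DATA1[] = { 0x12, 0x34 };
const unsigned char OTHER[] = { 0x56 };

/* Array of pointers to const char: the pointers themselves are writable,
   so the array has to live in writable memory. */
const unsigned char *TABLE_RW[] = { DATA1, DATA1 };

/* Array of const pointers to const char: nothing here may change,
   so the array is a candidate for read-only memory. */
const unsigned char *const TABLE_RO[] = { DATA1, DATA1 };

void retarget(void)
{
    TABLE_RW[0] = OTHER;    /* legal: the element is a plain pointer */
    /* TABLE_RO[0] = OTHER;    would not compile: the element is const */
}
```

Since a program may legally overwrite the entries of the first form at run time, a compiler cannot place that table in a read-only section.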
