KickC Optimizing C-Compiler now supports NES

JesperGravgaard
Posts: 15
Joined: Fri Jul 05, 2019 1:41 pm

KickC Optimizing C-Compiler now supports NES

Post by JesperGravgaard » Mon Jun 15, 2020 11:05 pm

Hi all,

KickC is a C compiler that generates optimized 6502 assembly code.

The newest release (version 0.8.2) adds direct support for developing for the NES platform. The compiler includes header files and linker files for the NES. It also includes a few example programs that work in an emulator and on real hardware (tested using an EverDrive N8).

You can get it here: https://gitlab.com/camelot/kickc

PS. I am the author of KickC.

aa-dav
Posts: 106
Joined: Tue Apr 14, 2020 9:45 pm
Location: Russia

Re: KickC Optimizing C-Compiler now supports NES

Post by aa-dav » Tue Jun 16, 2020 12:04 am

It's a very interesting project!
GitLab wants me to register to download the release, and I have no time for that right now.
But I'm curious how KickC would compile the following code to asm:

Code: Select all

void str_cpy( char *dst, char *src )
{
   while ( *dst++ = *src++ );
}
cc65 generates code that is far from ideal, and your approach looks promising.

JesperGravgaard
Posts: 15
Joined: Fri Jul 05, 2019 1:41 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by JesperGravgaard » Tue Jun 16, 2020 9:25 am

The release can be downloaded without registering on GitLab.

Go to https://gitlab.com/camelot/kickc/-/releases and select either "Assets->Binary" or "Download release".

JesperGravgaard
Posts: 15
Joined: Fri Jul 05, 2019 1:41 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by JesperGravgaard » Tue Jun 16, 2020 10:00 am

I have tried compiling your str_cpy function with both cc65 and KickC.

My source:

Code: Select all

char* dst1 = (char*)0x0400;
char* dst2 = (char*)0x0428;

void str_cpy( char *dst, char const *src ) {
   while ( *dst++ = *src++ ) {}
}

void main() {
    str_cpy(dst1, "hello");
    str_cpy(dst2, "world");
}
The ASM generated for main() and str_cpy() by KickC:

Code: Select all

// str_cpy(byte* zp(4) dst, byte* zp(2) src)
str_cpy: {
    .label dst = 4
    .label src = 2
  __b1:
    // *dst++ = *src++
    ldy #0
    lda (src),y
    sta (dst),y
    // while ( *dst++ = *src++ )
    lda (dst),y
    inc.z dst
    bne !+
    inc.z dst+1
  !:
    inc.z src
    bne !+
    inc.z src+1
  !:
    cmp #0
    bne __b1
    // }
    rts
}
main: {
    // str_cpy(dst1, "hello")
    lda #<dst1
    sta.z str_cpy.dst
    lda #>dst1
    sta.z str_cpy.dst+1
    lda #<src
    sta.z str_cpy.src
    lda #>src
    sta.z str_cpy.src+1
    jsr str_cpy
    // str_cpy(dst2, "world")
    lda #<dst2
    sta.z str_cpy.dst
    lda #>dst2
    sta.z str_cpy.dst+1
    lda #<src1
    sta.z str_cpy.src
    lda #>src1
    sta.z str_cpy.src+1
    jsr str_cpy
    // }
    rts
    src: .text "hello"
    .byte 0
    src1: .text "world"
    .byte 0
}
The ASM generated for main() and str_cpy() by cc65 (using the optimization options -O -Oi -Or -Cl):

Code: Select all

; ---------------------------------------------------------------
; void __near__ str_cpy (__near__ unsigned char *, __near__ const unsigned char *)
; ---------------------------------------------------------------

.segment        "CODE"

.proc   _str_cpy: near

.segment        "CODE"

        jsr     pushax
L0006:  ldy     #$03
        lda     (sp),y
        tax
        dey
        lda     (sp),y
        sta     regsave
        stx     regsave+1
        clc
        adc     #$01
        bcc     L0008
        inx
L0008:  jsr     staxysp
        lda     regsave
        ldx     regsave+1
        jsr     pushax
        ldy     #$03
        lda     (sp),y
        tax
        dey
        lda     (sp),y
        sta     regsave
        stx     regsave+1
        clc
        adc     #$01
        bcc     L000A
        inx
L000A:  jsr     staxysp
        ldy     #$00
        lda     (regsave),y
        jsr     staspidx
        tax
        bne     L0006
        jmp     incsp4

.endproc

; ---------------------------------------------------------------
; void __near__ main (void)
; ---------------------------------------------------------------

.segment        "CODE"

.proc   _main: near

.segment        "CODE"

        lda     _dst1
        ldx     _dst1+1
        jsr     pushax
        lda     #<(L000E)
        ldx     #>(L000E)
        jsr     _str_cpy
        lda     _dst2
        ldx     _dst2+1
        jsr     pushax
        lda     #<(L0012)
        ldx     #>(L0012)
        jmp     _str_cpy

.endproc

DRW
Posts: 1984
Joined: Sat Sep 07, 2013 2:59 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by DRW » Tue Jun 16, 2020 11:48 am

Here's an important hint if you want to compare cc65 with your compiler:

Only ever use -O in cc65 for optimizations, nothing else. Combining all those optimization options produces bigger code.
My game "City Trouble": www.denny-r-walter.de/city.htm

JesperGravgaard
Posts: 15
Joined: Fri Jul 05, 2019 1:41 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by JesperGravgaard » Tue Jun 16, 2020 12:14 pm

Thanks DRW. I have seen -Cl produce better code on other examples, so that is why I included it.

I have recompiled with only -O. The code cc65 generates is shorter, but does not seem any faster.

Code: Select all

; ---------------------------------------------------------------
; void __near__ str_cpy (__near__ unsigned char *, __near__ const unsigned char *)
; ---------------------------------------------------------------
.segment        "CODE"
.proc   _str_cpy: near
.segment        "CODE"
        jsr     pushax
L0006:  ldy     #$03
        jsr     ldaxysp
        sta     regsave
        stx     regsave+1
        jsr     incax1
        ldy     #$02
        jsr     staxysp
        lda     regsave
        ldx     regsave+1
        jsr     pushax
        ldy     #$03
        jsr     ldaxysp
        sta     regsave
        stx     regsave+1
        jsr     incax1
        ldy     #$02
        jsr     staxysp
        ldy     #$00
        lda     (regsave),y
        jsr     staspidx
        tax
        bne     L0006
        jmp     incsp4
.endproc

Goose2k
Posts: 107
Joined: Wed Dec 11, 2019 9:38 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by Goose2k » Tue Jun 16, 2020 3:44 pm

Any benefits to using this over cc65? I'm currently using cc65, which is why I ask. :D

DRW
Posts: 1984
Joined: Sat Sep 07, 2013 2:59 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by DRW » Tue Jun 16, 2020 5:13 pm

JesperGravgaard wrote:
Tue Jun 16, 2020 12:14 pm
I have seen -Cl produce better code on other examples, so that is why I included it.
Yeah, the -Cl parameter is a tricky thing: It makes all local variables static, that's why it produces better code.

I myself would advise against using this parameter because I think that compiler flags shouldn't change the meaning of the code. If I want local static variables, I declare them as such, I don't let the compiler do this. This has nothing to do with optimization, that's basically redefining the language since it makes changes on the source level.
JesperGravgaard wrote:
Tue Jun 16, 2020 12:14 pm
I have recompiled with only -O. The code cc65 generates is shorter, but does not seem any faster.
Yeah, that's probably because now the compiler uses more of its own internal helper functions instead of inlining them.

As long as the code produced with one optimization flag isn't noticeably slower than the code produced with another, I would say that shorter code is also an important benefit that needs to be considered, because an NROM game has only limited space. And even in a game that uses a mapper, you want as much code as possible in the fixed bank. So speed and ROM size are both important.

aa-dav
Posts: 106
Joined: Tue Apr 14, 2020 9:45 pm
Location: Russia

Re: KickC Optimizing C-Compiler now supports NES

Post by aa-dav » Tue Jun 16, 2020 6:11 pm

JesperGravgaard wrote:
Mon Jun 15, 2020 11:05 pm
...
Excellent!
It really generates almost ideal code for str_cpy. Cool!
And as I see, it rearranges parameters so that different procedures use the same memory locations where applicable.
I like it. It really looks like a good alternative to cc65.
I also tried to help it optimize further, like this:

Code: Select all

void str_cpy( char *dst, char *src )
{
	while ( true )
	{
		char tmp = *src;
		if ( !tmp )
			break;
		*dst = tmp;
		++dst;
		++src;
	}
}
But the result still shows potential for further optimization:

Code: Select all

// str_cpy(byte* zp(4) dst, byte* zp(2) src)
str_cpy: {
    .label dst = 4
    .label src = 2
  __b2:
    ldy #0
    lda (src),y
    cmp #0
    bne __b3
    rts
  __b3:
    ldy #0
    sta (dst),y
    inc.z dst
    bne !+
    inc.z dst+1
  !:
    inc.z src
    bne !+
    inc.z src+1
  !:
    jmp __b2
}
The ldy #0 can be omitted on repeated iterations, and the cmp #0 right after loading the value is not needed (lda already sets the zero flag).
However, I see that the temporary variable is effectively eliminated from memory, and that's cool!
This processor doesn't make optimization an easy task, so this is an impressive result to me.
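
One thing worth checking in the rewritten str_cpy above: as written, it appears to leave the loop before storing the terminating 0 byte, so the destination is not terminated the way it is with the original one-liner. Below is a minimal host-side sketch in plain C. The names str_cpy_no_nul, str_cpy_with_nul and check_terminator are made up for illustration, and nothing here was compiled with KickC or cc65.

```c
#include <assert.h>
#include <string.h>

/* The rewritten loop from the post: breaks before storing the 0 byte. */
static void str_cpy_no_nul(char *dst, const char *src) {
    for (;;) {
        char tmp = *src;
        if (!tmp)
            break;
        *dst = tmp;
        ++dst;
        ++src;
    }
}

/* Hypothetical variant: store first, then test, so the 0 byte is copied. */
static void str_cpy_with_nul(char *dst, const char *src) {
    for (;;) {
        char tmp = *src;
        *dst = tmp;
        if (!tmp)
            break;
        ++dst;
        ++src;
    }
}

/* Fill both buffers with '#' so a missing terminator is visible. */
static void check_terminator(void) {
    char a[8], b[8];
    memset(a, '#', sizeof a);
    memset(b, '#', sizeof b);
    str_cpy_no_nul(a, "hi");
    str_cpy_with_nul(b, "hi");
    assert(a[0] == 'h' && a[1] == 'i' && a[2] == '#');  /* not terminated */
    assert(b[0] == 'h' && b[1] == 'i' && b[2] == '\0'); /* terminated */
}
```

Calling check_terminator() from a test main() should pass silently. Whether the store-first variant compiles to equally tight 6502 code is not something this sketch tries to answer.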

JesperGravgaard
Posts: 15
Joined: Fri Jul 05, 2019 1:41 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by JesperGravgaard » Wed Jun 17, 2020 1:17 am

Goose2k wrote:
Tue Jun 16, 2020 3:44 pm
Any benefits to using this over cc65? I'm currently using cc65, which is why I ask. :D
cc65 is a very mature C-compiler that works really well. It offers more stability than KickC, since KickC is still being developed. There are also more resources available online on how to use cc65 than KickC.

The goal of KickC is to produce optimized (and readable) ASM for your C programs. The main advantage of using KickC is that your programs run faster. This, for instance, allows you to get more done in the ~30000 CPU cycles available per frame.
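
For a sense of where a figure of about 30000 cycles comes from, here is a small arithmetic sketch in plain C. It assumes NTSC timing (341 PPU dots per scanline, 262 scanlines, 3 dots per CPU cycle, and one dot skipped on every other frame while rendering is on); these constants are not stated in this thread. The function names and the 30-cycle loop cost are purely illustrative.

```c
#include <assert.h>

/* NTSC NES timing constants (assumed; not taken from this thread). */
#define DOTS_PER_SCANLINE   341
#define SCANLINES_PER_FRAME 262
#define DOTS_PER_CPU_CYCLE  3

/* Average CPU cycles per frame with rendering enabled:
   every other frame is one PPU dot shorter, hence the -0.5. */
static double ntsc_cycles_per_frame(void) {
    double dots = DOTS_PER_SCANLINE * SCANLINES_PER_FRAME - 0.5;
    return dots / DOTS_PER_CPU_CYCLE;
}

/* How many times a loop body of the given cost fits into one frame.
   cycles_per_iteration must be non-zero. */
static unsigned long iterations_per_frame(unsigned cycles_per_iteration) {
    return (unsigned long)(ntsc_cycles_per_frame() / cycles_per_iteration);
}

/* Usage: a hypothetical 30-cycle loop body fits 992 times in a frame. */
static void check_budget(void) {
    assert(ntsc_cycles_per_frame() == 29780.5);
    assert(iterations_per_frame(30) == 992UL);
}
```

The point of the sketch is only that every cycle shaved off an inner loop translates directly into more iterations per frame; real programs also spend part of the frame in NMI handling and other fixed work.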

KickC is built on top of KickAssembler. A secondary advantage is that you can utilize inline KickAssembler to take advantage of the excellent macro and data facilities KickAssembler offers for importing data such as images into your programs.

Getting started on a NES program with KickC is pretty easy, since it includes linker files (for a 16KB PRG-ROM, 8KB CHR-ROM cart), header files with the NES registers, and a few example programs. It even allows you to launch the compiled NES program directly in the Nestopia emulator using the -e command-line switch.

Code: Select all

kickc.sh -e examples/nes-demo/nes-demo.c
The included NES example program is pretty readable. It is based on the excellent first_nes written by Greg M. Krsak.

https://gitlab.com/camelot/kickc/-/blob ... nes-demo.c
Last edited by JesperGravgaard on Wed Jun 17, 2020 12:50 pm, edited 2 times in total.

JesperGravgaard
Posts: 15
Joined: Fri Jul 05, 2019 1:41 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by JesperGravgaard » Wed Jun 17, 2020 1:23 am

aa-dav wrote:
Tue Jun 16, 2020 6:11 pm
It really generates almost ideal code for str_cpy. Cool!
...
The ldy #0 can be omitted on repeated iterations, and the cmp #0 right after loading the value is not needed.
However, I see that the temporary variable is effectively eliminated from memory, and that's cool!
This processor doesn't make optimization an easy task, so this is an impressive result to me.
Thank you for the kind words. I am pretty happy with the optimization results achieved so far. The compiler is based on some pretty modern compiler techniques such as static single assignment (SSA) form, variable live range analysis, register coloring, etc. On top of that, it has an ASM fragment system that allows it to use the 3 registers optimally.

The ASM peephole optimizer, however, still needs some work. I plan to implement a new, better ASM optimizer that can move loads out of loops (such as the ldy #0) and remove unnecessary compares (such as the cmp #0).

turboxray
Posts: 115
Joined: Thu Oct 31, 2019 12:56 am

Re: KickC Optimizing C-Compiler now supports NES

Post by turboxray » Wed Jun 17, 2020 7:04 am

DRW wrote:
Tue Jun 16, 2020 5:13 pm
I myself would advise against using this parameter because I think that compiler flags shouldn't change the meaning of the code. If I want local static variables, I declare them as such, I don't let the compiler do this. This has nothing to do with optimization, that's basically redefining the language since it makes changes on the source level.
Why would someone be against it, as long as they know what it's doing? By default, I'm against someone using C on these old 8-bit machines... as long as you know what you're doing. This is no different. I'd say anything that helps win back the performance lost by compiled C on the 65x is a welcome plus.

Also, if the more advanced optimization flag results in larger code, then that's fine (that's pretty much how it works on the assembly side of things too), as long as it's faster. Rather than telling people "never use it", explain what the trade-offs are. Just because you find that trade-off to be of lesser value doesn't mean someone else will as well.


To the OP: Thanks for posting this. I saw KickC some months back and was curious about it.

DRW
Posts: 1984
Joined: Sat Sep 07, 2013 2:59 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by DRW » Wed Jun 17, 2020 11:09 am

turboxray wrote:
Wed Jun 17, 2020 7:04 am
Why would someone be against it, as long as they know what it's doing?
Because someone who reads the code doesn't know what it's doing unless he also knows the compiler flags. This shouldn't be the case. There's a fundamental difference between int i = 5; and static int i = 5; and the compiler flags shouldn't obscure this.
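
To make the difference between the two declarations concrete, here is a minimal sketch in standard C. The function names are made up for illustration. It shows only what the static keyword itself does; it does not claim that cc65's -Cl flag behaves identically (in particular, how that flag treats initializers is best checked in the cc65 documentation).

```c
#include <assert.h>

/* Automatic local: created and initialized again on every call. */
static int bump_auto(void) {
    int i = 5;
    ++i;
    return i;
}

/* Static local: initialized once, keeps its value between calls. */
static int bump_static(void) {
    static int i = 5;
    ++i;
    return i;
}

/* Usage: the automatic version always returns 6, the static one counts up. */
static void check_bump(void) {
    assert(bump_auto() == 6);
    assert(bump_auto() == 6);
    assert(bump_static() == 6);
    assert(bump_static() == 7);
}
```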

turboxray wrote:
Wed Jun 17, 2020 7:04 am
By default, I'm against someone using C on these old 8-bit machines...
That's a totally different topic altogether and is not analogous to my issue.

turboxray wrote:
Wed Jun 17, 2020 7:04 am
I'd say anything that helps win back the performance lost by compiled C on the 65x is a welcome plus.
You can get this performance plus quite easily: Simply use the static keyword.

Internally, the compiler may work as it wants and produce machine code in any efficient way possible. But the compiler should not change the meaning of the code files on the source level.

This:

Code: Select all

void Test(void)
{
    int i = 0;

    ++i;
}
has nothing to do with optimization. Using the "always static" flag changes the meaning of what the code logic is doing.

turboxray wrote:
Wed Jun 17, 2020 7:04 am
Also, if the more advanced optimization flag results in larger code, then that's fine (that's pretty much how it works on the assembly side of things too), as long as it's faster. Rather than telling people "never use it", explain what the trade-offs are. Just because you find that trade-off to be of lesser value doesn't mean someone else will as well.
I did exactly that, didn't I? I said using these compiler flags produces larger code. And I explicitly mentioned that, as long as the smaller code isn't slower, you should consider both speed and size because of NROM, banks, etc.
It's all there in my post, you just need to read it.


@JesperGravgaard:

Do you plan on making the C code syntax compatible with cc65? (Specifically, implementing the pragma stuff that puts code and data in certain segments.)

How do segments work in your compiler in general? Can I easily put code into special segments and have the absolute memory addresses of these segments in a separate config file?

JesperGravgaard
Posts: 15
Joined: Fri Jul 05, 2019 1:41 pm

Re: KickC Optimizing C-Compiler now supports NES

Post by JesperGravgaard » Wed Jun 17, 2020 12:46 pm

@DRW

KickC uses the segment functionality from KickAssembler. KickAssembler has a rich and expressive segment system that allows you to define different named segments and place them in memory (at absolute positions, after each other, on top of each other, ...). It also has functionality to flexibly generate output files from the segments defined. See the KickAssembler manual for details: http://www.theweb.dk/KickAssembler/

In KickC you can use one of the pre-defined linker files (using #pragma target() or the command-line option -t) or define your own linker file (using #pragma link() or the command-line option -T).

The KickAssembler segment system allows you to define the NES ROM memory layout and ROM file format directly in the linker file. You can see the NES linker file included with KickC here: https://gitlab.com/camelot/kickc/-/blob ... get/nes.ld

Inside the C-program you can choose which segment you want your code to go to by using #pragma code_seg() and which segment your data goes to using #pragma data_seg().

You can see how different data segments are used for ROM-data, RAM-data, Vectors etc. here: https://gitlab.com/camelot/kickc/-/blob ... nes-demo.c

The syntax is close to cc65 but not 100% identical. Probably a few well-chosen macros could make C-code work on both compilers :)

rainwarrior
Posts: 7878
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: KickC Optimizing C-Compiler now supports NES

Post by rainwarrior » Wed Jun 17, 2020 5:38 pm

turboxray wrote:
Wed Jun 17, 2020 7:04 am
Why would someone be against it, as long as they know what it's doing? By default, I'm against someone using C on these old 8-bit machines... as long as you know what you're doing. This is no different. I'd say anything that helps win back the performance lost by compiled C on the 65x is a welcome plus.
Personally, the reason I'd never use it is that it takes away your ability to make anything not static, which means that all temporary variables now get a permanent place in RAM. This eats up space in RAM very quickly.

The alternative is just to type static as needed for local variables, which is pretty easy to do. (Or even better... define some shared temporary variables on ZP in assembly, e.g. "i" and "j" and import them with zeropage pragma, giving the benefit of both static and ZP.)

So, it's not really that helpful overall, at least on the NES. For a small program on a 6502 system with lots of available RAM, maybe it'd have some low-maintenance speed benefit, but if you have any kind of need to manage RAM it's a really inappropriate global setting.

Also, it means you can't write recursive/re-entrant functions... though you can manually turn it on and off with #pragma static-locals, but even if you wanted it globally it'd probably be better to use that pragma to put it explicitly in the code, rather than as a compiler flag... but then again if you're willing to do something like that explicitly in your source, the "static" keyword is already natural and standard C for this. Why bother with a pragma extension?
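
The recursion problem can be sketched in a few lines of standard C. This is a hand-written illustration, not cc65 output: the function names are hypothetical, and only one variable is made static by hand to keep the example small, whereas a global static-locals setting would affect every local.

```c
#include <assert.h>

/* Automatic local: every active call has its own copy of `saved`. */
static unsigned sum_auto(unsigned n) {
    unsigned saved = n;
    unsigned rest;
    if (n == 0)
        return 0;
    rest = sum_auto(n - 1);
    return saved + rest;
}

/* Static local: all active calls share one `saved`,
   so the inner calls overwrite the outer call's value. */
static unsigned sum_static(unsigned n) {
    static unsigned saved;
    unsigned rest;
    saved = n;
    if (n == 0)
        return 0;
    rest = sum_static(n - 1);
    return saved + rest;   /* saved is 0 here: the innermost call set it */
}

/* Usage: the automatic version sums 3 + 2 + 1, the static one collapses to 0. */
static void check_recursion(void) {
    assert(sum_auto(3) == 6);
    assert(sum_static(3) == 0);
}
```

The same kind of clobbering can happen without recursion if a function with static locals is entered again while an earlier call is still active, which is why such code is described as not re-entrant.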
