C compiler requests

johnwbyrd
Posts: 11
Joined: Sat Mar 13, 2021 11:13 pm

C compiler requests

Post by johnwbyrd » Sat Mar 13, 2021 11:23 pm

Hi, a friend and I have been working on a 6502 port of a popular modern C compiler, assembler, and linker.

We've accomplished a few important milestones, and we are at the point where we want to solicit opinions from the NES community about features and implementation details.

What are your preferred mappers or memory layouts?
What are your preferred emulators?
What's your development pipeline?
Anything that you can't do with your current assembler/linker/compiler combination, that you would like to be able to do?
What features would cause you to switch from your current compiler? Performance speedups? Backwards compatibility? Debuggability? Better docs?
Would you be able to write a good-quality demo to show off the features of a new C compiler and linker?

Thanks for your advice and interest.
Last edited by johnwbyrd on Mon Mar 15, 2021 2:33 am, edited 5 times in total.

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: C compiler requests

Post by DRW » Sun Mar 14, 2021 12:03 am

johnwbyrd wrote:
Sat Mar 13, 2021 11:23 pm
What are your preferred mappers or memory layouts?
What are your preferred emulators?
What's your development pipeline?
Why are those things important for compiler development?
Mappers have nothing to do with the language. You simply write values to certain addresses, no different from variable access.
And emulators are even further away from compiler development. Your C code has to generate valid Assembly code. It doesn't matter what emulator plays the game later.

johnwbyrd wrote:
Sat Mar 13, 2021 11:23 pm
Anything that you can't do with your current assembler/linker/compiler combination, that you would like to be able to do?
Whenever I access an array via a pointer:

// C code:
unsigned char index;
unsigned char *pointer;
unsigned char value;

value = pointer[index];

// Assembly code:
LDY index
LDA (pointer), Y
STA value
The compiler always first copies my pointer to its own internal pointer variable.
It makes sense because the pointer variable has to be in zeropage.
But the compiler should be able to see that my own variable is already in zeropage and therefore omit the copy process, making the code smaller and faster.


One thing that I'd like to see in your compiler:
Compatibility with cc65.
(Preferably with the old and the new syntax if possible. For example #pragma bssseg and #pragma bss-name.)
Including the memory layout file.
So that people can compile their games with your compiler to see the differences in a full project.

johnwbyrd wrote:
Sat Mar 13, 2021 11:23 pm
Would you be able to write a good-quality demo to show off the features of a new C compiler and linker?
What do you mean by a good-quality demo? To test a compiler, only one thing should be important: What does the generated Assembly code look like? And for this, you don't need any kind of actual output in a ROM. You just need to do some language operations and see what the compiler does.
Last edited by DRW on Sun Mar 14, 2021 3:56 am, edited 1 time in total.
My game "City Trouble": www.denny-r-walter.de/city.htm

calima
Posts: 1335
Joined: Tue Oct 06, 2015 10:16 am

Re: C compiler requests

Post by calima » Sun Mar 14, 2021 12:56 am

johnwbyrd wrote:
Sat Mar 13, 2021 11:23 pm
What are your preferred mappers or memory layouts?
Presumably you want to offer bundled samples. Like DRW, I don't find that so useful; compatibility with the current de facto leader would be better.

It's common to place code and assets in banks manually, and to pick the memory layout by the game's needs. No automatic system can match the needs of the platform.
What are your preferred emulators?
Presumably this is about debuggers and debug symbols. I don't use debug symbols for NES development, it's so rare to need the name of a memory address that it's not much bother to look up manually.
What's your development pipeline?
Write in my preferred editor, alt-tab to terminal, type 'm' (alias for make -j13). Presumably this is about IDEs and compatibility with those; things like 8bitworkshop or VS Code are not that popular.

If I'm writing some specific algo, I often write it first on my Linux system, using quality tools like valgrind to make sure it's correct before compiling for the NES.
Anything that you can't do with your current assembler/linker/compiler combination, that you would like to be able to do?
Smart packing of memory (BSS/RODATA symbols). This doesn't really exist anywhere, GNU ld at most sorts symbols by alignment, leaving plenty of gaps. Yes, optimal packing algos can be O(n^2) or so, but we're talking 8-bit development here, with banks of 8kb-32kb - N is small.
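
As a rough illustration of the kind of packing described above, here is a minimal sketch of a first-fit decreasing pass over a list of symbols. It is a well-known heuristic rather than an optimal packer, and the `struct symbol` record, the `pack_symbols` name, and the per-bank `free_space` array are all hypothetical, not part of cc65, GNU ld, or any existing linker.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical symbol record: name, size in bytes, assigned bank (-1 = unplaced). */
struct symbol {
    const char *name;
    size_t size;
    int bank;
};

/* Sort helper: larger symbols first. */
static int by_size_desc(const void *a, const void *b)
{
    const struct symbol *sa = a, *sb = b;
    if (sa->size < sb->size) return 1;
    if (sa->size > sb->size) return -1;
    return 0;
}

/*
 * First-fit decreasing: place each symbol, largest first, into the first
 * bank that still has room. free_space[] holds the remaining bytes per bank
 * and is updated in place. Returns the number of symbols that did not fit.
 */
size_t pack_symbols(struct symbol *syms, size_t nsyms,
                    size_t *free_space, size_t nbanks)
{
    size_t unplaced = 0;

    qsort(syms, nsyms, sizeof syms[0], by_size_desc);
    for (size_t i = 0; i < nsyms; i++) {
        syms[i].bank = -1;
        for (size_t b = 0; b < nbanks; b++) {
            if (syms[i].size <= free_space[b]) {
                free_space[b] -= syms[i].size;
                syms[i].bank = (int)b;
                break;
            }
        }
        if (syms[i].bank < 0)
            unplaced++;
    }
    return unplaced;
}
```

With N in the tens or hundreds, the nested loop is cheap. A real linker would also have to honor alignment and any manual placement, which this sketch ignores.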

However, you may find it hard to gain market share. The main lack in cc65, the leading suite, is its low optimization ability. Even gcc-level optimization may not be worth it to spend time migrating, doubly so if it's not compatible with existing banking and layouts. Any artificial barriers on top, like not being open source or not having an essential capability like manual placement, will just pile on blockers.

IOW, cc65 is "good enough". The C parts rarely need max speed, and the amount of asm in a typical modern NES game is not that big.

tepples
Posts: 22335
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: C compiler requests

Post by tepples » Sun Mar 14, 2021 7:22 am

DRW wrote:
Sun Mar 14, 2021 12:03 am
johnwbyrd wrote:
Sat Mar 13, 2021 11:23 pm
What are your preferred mappers or memory layouts?
What are your preferred emulators?
What's your development pipeline?
Why are those things important for compiler development?
Mappers have nothing to do with the language. You simply write values to certain addresses, no different from variable access.
I dispute this for two reasons.

1. The contents of the linker script depend on the mapper. Every linker has a different language in which to write linker scripts, and in my experience on NES, this language rarely if ever allows a linker script to be parametrized based on ROM size. Thus you need a separate linker script for each combination of linker, mapper, and ROM size. As calima mentioned, a compiler developer might want to include example linker scripts for common mappers and ROM sizes.

2. If a compiler can generate the assembly code to handle far calls and far data access transparently to the C program, that would make certain things more convenient for the programmer.
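
To make point 2 concrete, here is a minimal host-side sketch of what a far-call trampoline does: save the current bank, switch, call, restore. Everything in it is an assumption for illustration: `current_bank` stands in for the mapper's bank-select register, and `struct far_func`, `far_call`, and `callee_in_bank3` are hypothetical names, not something an existing compiler emits.

```c
#include <stdint.h>

/* Stand-in for the mapper's bank-select register. On real hardware this
 * would be a mapper register write; here it is a plain variable so the
 * sketch runs on a host machine. */
uint8_t current_bank;

void select_bank(uint8_t bank)
{
    current_bank = bank;
}

/* Hypothetical far pointer: the bank a function lives in plus its address. */
struct far_func {
    uint8_t bank;
    int (*fn)(int);
};

/* Trampoline: switch to the target's bank, call it, then restore the
 * caller's bank before returning the result. */
int far_call(const struct far_func *target, int arg)
{
    uint8_t saved = current_bank;
    int result;

    select_bank(target->bank);
    result = target->fn(arg);
    select_bank(saved);
    return result;
}

/* Example callee, imagined to be linked into bank 3. It returns its
 * argument plus the bank that was selected while it ran. */
int callee_in_bank3(int x)
{
    return x + current_bank;
}

const struct far_func far_callee = { 3, callee_in_bank3 };
```

On an actual cartridge the trampoline itself would have to sit in a fixed (non-switchable) region, since it keeps executing while the window changes underneath the callee.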
DRW wrote:
Sun Mar 14, 2021 12:03 am
And emulators are even further away from compiler development. Your C code has to generate valid Assembly code. It doesn't matter what emulator plays the game later.
I dispute this for two more reasons.

3. Valid ASM6 code is not valid ca65 code. Either the compiler comes bundled with its own counterpart to Binutils (assembler and linker) or it uses an existing one.

4. Debugging emulators differ in what symbol formats they can read. Even among NES emulators, FCEUX wants NL files, Mesen wants DBG or MLB files, and No$nes wants SYM files.
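
As one example of what emitting such a symbol file involves, here is a minimal sketch that formats a single label line, assuming FCEUX's name-list layout of `$address#label#comment`. The `format_nl_line` helper is hypothetical, and the other formats mentioned above would each need their own writer.

```c
#include <stdio.h>

/* Format one name-list line as "$ADDR#label#comment\n".
 * A NULL comment is written as an empty field.
 * Returns the value of snprintf (the length that was or would be written). */
int format_nl_line(char *out, size_t out_size, unsigned addr,
                   const char *label, const char *comment)
{
    return snprintf(out, out_size, "$%04X#%s#%s\n",
                    addr & 0xFFFFu, label, comment ? comment : "");
}
```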

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: C compiler requests

Post by DRW » Sun Mar 14, 2021 10:52 am

tepples wrote:
Sun Mar 14, 2021 7:22 am
1. The contents of the linker script depend on the mapper.
Well, of course they do. And? Whether I prefer UNROM, MMC1 or MMC3 still has nothing to do with general compiler programming. One config file format should be able to handle all of them and any hypothetical future mapper because it's just a general memory layout file.

tepples wrote:
Sun Mar 14, 2021 7:22 am
As calima mentioned, a compiler developer might want to include example linker scripts for common mappers and ROM sizes.
You really think that was the motivation behind the question? They have no finished compiler yet, but spend their time worrying what example files should go into the download package? If that was the case, I would highly doubt that they have their priorities straight.

tepples wrote:
Sun Mar 14, 2021 7:22 am
2. If a compiler can generate the assembly code to handle far calls and far data access transparently to the C program, that would make certain things more convenient for the programmer.
But this is still just one general feature, isn't it? Is it something you do differently for various mappers, so that the compiler programmer has to decide for which mapper he should include the feature? I still don't see why anybody needs to know here which specific mappers are the most popular.

tepples wrote:
Sun Mar 14, 2021 7:22 am
DRW wrote:
Sun Mar 14, 2021 12:03 am
And emulators are even further away from compiler development. Your C code has to generate valid Assembly code. It doesn't matter what emulator plays the game later.
3. Valid ASM6 code is not valid ca65 code. Either the compiler comes bundled with its own counterpart to Binutils (assembler and linker) or it uses an existing one.
Has literally nothing to do with the question whether I use fceux, Nestopia, Mesen or Nesticle to play the ROM.

tepples wrote:
Sun Mar 14, 2021 7:22 am
4. Debugging emulators differ in what symbol formats they can read. Even among NES emulators, FCEUX wants NL files, Mesen wants DBG or MLB files, and No$nes wants SYM files.
O.k., this is the first valid explanation so far.
My game "City Trouble": www.denny-r-walter.de/city.htm

johnwbyrd
Posts: 11
Joined: Sat Mar 13, 2021 11:13 pm

Re: C compiler requests

Post by johnwbyrd » Sun Mar 14, 2021 3:09 pm

Why are those things important for compiler development?
Mappers have nothing to do with the language. You simply write values to certain addresses, no different from variable access.
Well, as you know, C has a flat memory model. So there are multiple ways to have C support banked memory systems, and although we have opinions on that, we're soliciting opinions on how you prefer to have banked memory represented in C.
And emulators are even further away from compiler development. Your C code has to generate valid Assembly code. It doesn't matter what emulator plays the game later.
Practically speaking, I disagree. Emulators will be important for development and real-time debugging, and so I'm wondering which NES emulators you think are most featureful or most popular for cross development. In particular, we're contemplating source-level debugging connected to an emulated target.

All our work is open source.
Whenever I access an array via a pointer:

// C code:
unsigned char index;
unsigned char *pointer;
unsigned char value;

value = pointer[index];

// Assembly code:
LDY index
LDA (pointer), Y
STA value
The compiler always first copies my pointer to its own internal pointer variable.
It makes sense because the pointer variable has to be in zeropage.
But the compiler should be able to see that my own variable is already in zeropage and therefore omit the copy process, making the code smaller and faster.
Which variable are you referring to, which should be in zero page already? value? Are you saying that multiple accesses to value in cc65 do not cache the previous calculation on it in zero page? A more detailed example would be helpful. Come to think of it, examples of code that cc65 fails to do well on, would also be helpful.
Compatibility with cc65. (Preferably with the old and the new syntax if possible. For example #pragma bssseg and #pragma bss-name.)
Including the memory layout file.
So that people can compile their games with your compiler to see the differences in a full project.
Our linker is ld compatible. So it already has support for all the expected section types, including bss and rodata and similar. And it can handle overlays and similar bank management. We've also got ELF support working as well. We've written example linker scripts for several existing 6502 targets, but next up we want to think about getting NES linker support right, and I know the community has some opinions about that.
What do you mean by a good-quality demo?
Well, we'd like to see what people can accomplish with the compiler suite -- as you know, the only proof that a thing works as expected, is running code demonstrating that it does what is intended.
Last edited by johnwbyrd on Sun Mar 14, 2021 3:15 pm, edited 1 time in total.

johnwbyrd
Posts: 11
Joined: Sat Mar 13, 2021 11:13 pm

Re: C compiler requests

Post by johnwbyrd » Sun Mar 14, 2021 3:15 pm

tepples wrote:
Sun Mar 14, 2021 7:22 am
As calima mentioned, a compiler developer might want to include example linker scripts for common mappers and ROM sizes.
You really think that was the motivation behind the question? They have no finished compiler yet, but spend their time worrying what example files should go into the download package? If that was the case, I would highly doubt that they have their priorities straight.
We do worry about that. If we can't show practical examples of our compiler working on example projects and test suites, then we can't be sure it's working correctly.

Yes, we're breaking out some example linker scripts for various targets into an SDK, to be used with the compiler and linker. It won't be required to use them, but they will serve as starting points if you want to hack on them or roll your own. I'd like to know which mappers and ROM sizes we need to be targeting out of the gate, and which ones should be left as an exercise.

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: C compiler requests

Post by DRW » Sun Mar 14, 2021 3:29 pm

johnwbyrd wrote:
Sun Mar 14, 2021 3:09 pm
Well, as you know, C has a flat memory model. So there are multiple ways to have C support banked memory systems, and although we have opinions on that, we're soliciting opinions on how you prefer to have banked memory represented in C.
But isn't this still completely mapper-independent? Is there a way that is better for UNROM and another way that is better for MMC3?

But in general, I think the following mappers might be the most popular: NROM, CNROM, UNROM, MMC1, MMC3.

johnwbyrd wrote:
Sun Mar 14, 2021 3:09 pm
Which variable are you referring to, which should be in zero page already?
I'm talking about the pointer variable. If you want to do LDA (pointer), Y, then the pointer has to be in zeropage. So, whenever I do value = pointer[index];, cc65 always first copies the contents of my pointer variable into its own ptr1 variable and then uses ptr1 for the data access. But since I put my pointer already into zeropage, this copy process wouldn't be necessary.
My game "City Trouble": www.denny-r-walter.de/city.htm

johnwbyrd
Posts: 11
Joined: Sat Mar 13, 2021 11:13 pm

Re: C compiler requests

Post by johnwbyrd » Sun Mar 14, 2021 3:44 pm

DRW wrote:
Sun Mar 14, 2021 3:29 pm
johnwbyrd wrote:
Sun Mar 14, 2021 3:09 pm
Well, as you know, C has a flat memory model. So there are multiple ways to have C support banked memory systems, and although we have opinions on that, we're soliciting opinions on how you prefer to have banked memory represented in C.
But isn't this still completely mapper-independent? Is there a way that is better for UNROM and another way that is better for MMC3?
I think that, at least, compiled programs would need to know the memory layouts of each of those mappers, and the linker would want to try to put code into the read-only areas.

It looks to me that some emulators prefer iNES and some prefer NES 2.0. Would you prefer the linker to emit one of these two formats, or another format that I am not aware of? Are there other flat binary formats that we need to be aware of for NES?
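
For reference, the two formats share the same 16-byte header start, so a tool can emit either one from the same inputs. The sketch below fills in the basic iNES fields and optionally sets the NES 2.0 identifier bits. The function names are hypothetical, and it ignores the later header bytes that NES 2.0 gives meaning to (they are left at zero).

```c
#include <stdint.h>
#include <string.h>

/* Fill a 16-byte iNES header. prg_16k and chr_8k are bank counts
 * (16 KiB PRG units, 8 KiB CHR units); a non-zero vertical_mirroring
 * sets bit 0 of flags 6. Unused bytes are zeroed. */
void make_ines_header(uint8_t hdr[16], uint8_t prg_16k, uint8_t chr_8k,
                      uint8_t mapper, int vertical_mirroring)
{
    memset(hdr, 0, 16);
    hdr[0] = 'N';
    hdr[1] = 'E';
    hdr[2] = 'S';
    hdr[3] = 0x1A;
    hdr[4] = prg_16k;
    hdr[5] = chr_8k;
    /* Low nibble of the mapper number goes in the high nibble of flags 6. */
    hdr[6] = (uint8_t)(((mapper & 0x0F) << 4) | (vertical_mirroring ? 1 : 0));
    /* High nibble of the mapper number goes in the high nibble of flags 7. */
    hdr[7] = (uint8_t)(mapper & 0xF0);
}

/* Mark an existing header as NES 2.0: bits 2-3 of flags 7 must equal 10b. */
void mark_nes20(uint8_t hdr[16])
{
    hdr[7] = (uint8_t)((hdr[7] & ~0x0C) | 0x08);
}
```

For example, a 128 KiB UNROM image (mapper 2) with CHR RAM and vertical mirroring would be `make_ines_header(hdr, 8, 0, 2, 1)`.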
I'm talking about the pointer variable. If you want to do LDA (pointer), Y, then the pointer has to be in zeropage. So, whenever I do value = pointer[index];, cc65 always first copies the contents of my pointer variable into its own ptr1 variable and then uses ptr1 for the data access. But since I put my pointer already into zeropage, this copy process wouldn't be necessary.
So there's an extra copy interspersed where value is copied to another location in zero page, before value gets hold of it? That's not immediately clear to me from the example as given...

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: C compiler requests

Post by DRW » Sun Mar 14, 2021 4:26 pm

johnwbyrd wrote:
Sun Mar 14, 2021 3:44 pm
I think that, at least, compiled programs would need to know the memory layouts of each of those mappers, and the linker would want to try to put code into the read-only areas.
That's what configuration files are for. I.e. the stuff that cc65 is doing right now without having to know shit about any NES mapper.

Are you seriously considering hardcoding this into your compiler itself? How would you decide which of the code goes into which bank?

johnwbyrd wrote:
Sun Mar 14, 2021 3:44 pm
It looks to me that some emulators prefer iNES and some prefer NES 2.0. Would you prefer the linker to emit one of these two formats, or another format that I am not aware of?
Configuration file.

johnwbyrd wrote:
Sun Mar 14, 2021 3:44 pm
So there's an extra copy interspersed where value is copied to another location in zero page, before value gets hold of it? That's not immediately clear to me from the example as given...
"value" is not copied to another location in zeropage. "pointer" is copied to "ptr1", so that cc65 can do LDA (ptr1), Y. Because indirect access requires a zeropage pointer, and there's no guarantee that "pointer" is in zeropage.
However, the optimizer could check whether "pointer" is in zeropage and therefore omit the copy process and work with "pointer" directly: LDA (pointer), Y.
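
A minimal sketch of the setup being described, assuming the pointer is placed in zeropage with cc65's `#pragma bss-name` and that the linker configuration defines a `ZEROPAGE` segment. The `read_indexed` function is a hypothetical name. The pragmas are guarded by `__CC65__` so the same file still builds with other compilers, where placement is simply ignored.

```c
/* cc65-specific placement: put the following BSS variable in the segment
 * named "ZEROPAGE" (assumed to exist in the linker configuration). */
#ifdef __CC65__
#pragma bss-name(push, "ZEROPAGE")
#endif
unsigned char *pointer;   /* intended to live in zeropage under cc65 */
#ifdef __CC65__
#pragma bss-name(pop)
#endif

/* The access in question. With the pointer already in zeropage, the ideal
 * 6502 output would be: LDY index / LDA (pointer),Y with no copy to ptr1. */
unsigned char read_indexed(unsigned char index)
{
    return pointer[index];
}
```

This only shows where the variable is placed; whether a given compiler takes advantage of that placement is exactly the optimization being asked for.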
My game "City Trouble": www.denny-r-walter.de/city.htm

tepples
Posts: 22335
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: C compiler requests

Post by tepples » Sun Mar 14, 2021 5:02 pm

DRW wrote:
Sun Mar 14, 2021 3:29 pm
johnwbyrd wrote:
Sun Mar 14, 2021 3:09 pm
Well, as you know, C has a flat memory model. So there are multiple ways to have C support banked memory systems, and although we have opinions on that, we're soliciting opinions on how you prefer to have banked memory represented in C.
But isn't this still completely mapper-independent? Is there a way that is better for UNROM and another way that is better for MMC3?
It probably doesn't differ much between UNROM and [the most common configuration of] MMC1 because both provide a single 16 KiB window. Nor does it differ much among VRC4, Namco 108/MIMIC-1, and MMC3, because all three provide two 8 KiB windows. But the difference between 1 window and 2 windows is huge, as it makes it possible to read program code through one and big constant data through the other.
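
The one-window versus two-window difference can be sketched as address translation from CPU space to PRG ROM offsets. This assumes the usual UNROM arrangement (switchable $8000-$BFFF, last bank fixed at $C000-$FFFF) and MMC3 in PRG mode 0 (switchable $8000 and $A000, last two banks fixed). The function names are hypothetical, and addresses below $8000 are not handled.

```c
#include <stdint.h>

/* One-window layout (UNROM-style): $8000-$BFFF is a switchable 16 KiB bank,
 * $C000-$FFFF is fixed to the last 16 KiB bank. */
uint32_t unrom_prg_offset(uint16_t cpu_addr, uint8_t bank, uint8_t bank_count)
{
    uint8_t b = (cpu_addr < 0xC000) ? bank : (uint8_t)(bank_count - 1);
    return (uint32_t)b * 0x4000u + (cpu_addr & 0x3FFFu);
}

/* Two-window layout (MMC3-style, PRG mode 0): $8000 and $A000 are
 * independently switchable 8 KiB banks; $C000 and $E000 are fixed to the
 * last two 8 KiB banks. */
uint32_t mmc3_prg_offset(uint16_t cpu_addr, uint8_t bank_8000,
                         uint8_t bank_a000, uint8_t bank_count)
{
    uint8_t b;

    switch (cpu_addr & 0xE000u) {
    case 0x8000: b = bank_8000; break;
    case 0xA000: b = bank_a000; break;
    case 0xC000: b = (uint8_t)(bank_count - 2); break;
    default:     b = (uint8_t)(bank_count - 1); break;  /* $E000-$FFFF */
    }
    return (uint32_t)b * 0x2000u + (cpu_addr & 0x1FFFu);
}
```

In the second function, code running from the $8000 window can read data through the $A000 window without either bank displacing the other, which is the convenience the single-window layout lacks.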
DRW wrote:
Sun Mar 14, 2021 4:26 pm
That's what configuration files are for. I.e. the stuff that cc65 is doing right now without having to know shit about any NES mapper.
Consider a scenario in which you have made your configuration file for UNROM or MMC1, and linking fails because your configuration file has too few switchable windows. Your configuration file has one window, whereas the compiled code assumes two. This is the situation that I presume johnwbyrd is trying to avoid at the requirements phase, when changes are much easier to make than they would be come the beta testing phase.

johnwbyrd
Posts: 11
Joined: Sat Mar 13, 2021 11:13 pm

Re: C compiler requests

Post by johnwbyrd » Sun Mar 14, 2021 5:28 pm

DRW wrote:
Sun Mar 14, 2021 4:26 pm
johnwbyrd wrote:
Sun Mar 14, 2021 3:44 pm
I think that, at least, compiled programs would need to know the memory layouts of each of those mappers, and the linker would want to try to put code into the read-only areas.
That's what configuration files are for. I.e. the stuff that cc65 is doing right now without having to know shit about any NES mapper.
Would you be referring to nes.cfg in cc65? It looks to me that cc65 has some fairly specific opinions as to memory layouts and where RAM vs ROM should go. I was more wondering whether those assumptions were sufficient. My gut tells me that cc65's choices are unnecessarily restrictive.
Are you seriously considering hardcoding this into your compiler itself?
Don't know where you got that idea from.
How would you decide which of the code goes into which bank?
Ultimately, it would be up to whatever linker script you decide to use, to assign sections to memory ranges. We'll probably supply you with a few example linker scripts to get you started.
"value" is not copied to another location in zeropage. "pointer" is copied to "ptr1", so that cc65 can do LDA (ptr1), Y. Because indirect access requires a zeropage pointer, and there's no guarantee that "pointer" is in zeropage.
However, the optimizer could check whether "pointer" is in zeropage and therefore omit the copy process and work with "pointer" directly: LDA (pointer), Y.
How are you currently informing cc65 that your pointer is in zero page?

johnwbyrd
Posts: 11
Joined: Sat Mar 13, 2021 11:13 pm

Re: C compiler requests

Post by johnwbyrd » Sun Mar 14, 2021 5:32 pm

It probably doesn't differ much between UNROM and [the most common configuration of] MMC1 because both provide a single 16 KiB window. Nor does it differ much among VRC4, Namco 108/MIMIC-1, and MMC3, because all three provide two 8 KiB windows. But the difference between 1 window and 2 windows is huge, as it makes it possible to read program code through one and big constant data through the other.
What do you think the correct solution is there? Should we provide multiple linker examples for NES, showing how to target both of those? Is one in more common use than the other? Does anyone manufacture carts anymore using either standard?

johnwbyrd
Posts: 11
Joined: Sat Mar 13, 2021 11:13 pm

Re: C compiler requests

Post by johnwbyrd » Sun Mar 14, 2021 5:35 pm

A related question: other than what I see at https://wiki.nesdev.com/w/index.php/Sample_RAM_map , I don't see any formal specifications for how a NES compiler ought to use zero page. Is there some NES standard I am unaware of, for dedicating a range of zero page locations for the C compiler's use?

lidnariq
Posts: 10463
Joined: Sun Apr 13, 2008 11:12 am
Location: Seattle

Re: C compiler requests

Post by lidnariq » Sun Mar 14, 2021 5:59 pm

johnwbyrd wrote:
Sun Mar 14, 2021 5:32 pm
What do you think the correct solution is there? Should we provide multiple linker examples for NES, showing how to target both of those? Is one in more common use than the other? Does anyone manufacture carts anymore using either standard?
UNROM carts are quite common and easily made even now. (And indeed continue to be).

MMC3 was a major option during the NES's commercial life - about 1/4 of US games used it - although modern homebrew often avoids it due to the cost premium.

For targeting modern homebrew, I'd recommend supporting UNROM and A/B/GNROM style bankswitching.
johnwbyrd wrote:
Sun Mar 14, 2021 5:35 pm
I don't see any formal specifications for how a NES compiler ought to use zero page. Is there some NES standard I am unaware of, for dedicating a range of zero page locations for the C compiler's use?
Not really. I've seen people talk about reserving the bottommost bytes for function-local storage or parameter passing. The only hard constraints come from the 6502 core (zero page, stack) and 2A03 augmentations (sprite DMA, DPCM DMA).
