Spec for HLL targeting NES

strat
Posts: 364
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

Post by strat » Thu Aug 02, 2012 11:04 am

Edit: Back to working on this, made some real progress, spec attached below. Some of the post below is outdated.

Before posting this, I went back and reread this thread (among others) to see if there were any ideas I overlooked.

http://nesdev.com/bbs/viewtopic.php?t=7976

The OP's ideas were pretty close to my own: the syntax is going to be very simple, almost BASIC-like, but it does adapt things from C:

-Pointers are a separate type and must be declared. They use Indirect-Y addressing.
-Structs are very much in play. An array of structs would use Direct Indexing.
-Local variables will be simulated, but without any stack manipulation through Indirect addressing.

The plan is to generate asm roughly close to what would be hand-written.

Some issues that might crop up (or, things from C that didn't make it):
-No 24-bit or 32-bit vars: These will be implemented when the compiler targets 16-bit CPUs. What do you call a 24-bit variable when 'int' is reserved for 32 bits and 'long' for 64 bits? For now, the carry flag is available if more than 16 bits are needed.

-1D arrays only: Readily maps to processor addressing modes. On x86 and ARM, arrays with two or more dimensions would be trivial with their multiply instructions, but Noism will not be targeting those platforms. You'll have to use a pointer table to simulate 2D arrays.
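
A rough sketch of the pointer-table idea, written in C since Noism syntax isn't shown here. The names map_data, map_rows, map_get, and map_set and the 4x8 size are made up for illustration; this is only one way such a table could be laid out, not what the Noism compiler emits:

```c
#include <assert.h>
#include <stdint.h>

enum { MAP_ROWS = 4, MAP_COLS = 8 };

/* All rows live back to back in one flat 1D array. */
static uint8_t map_data[MAP_ROWS * MAP_COLS];

/* Pointer table: one entry per row, pointing at that row's first byte. */
static uint8_t *const map_rows[MAP_ROWS] = {
    &map_data[0 * MAP_COLS],
    &map_data[1 * MAP_COLS],
    &map_data[2 * MAP_COLS],
    &map_data[3 * MAP_COLS],
};

/* Read cell (row, col): fetch the row pointer, then index by column.
   No multiply is needed at run time. */
uint8_t map_get(uint8_t row, uint8_t col)
{
    assert(row < MAP_ROWS && col < MAP_COLS);
    return map_rows[row][col];
}

/* Write cell (row, col) through the same table. */
void map_set(uint8_t row, uint8_t col, uint8_t value)
{
    assert(row < MAP_ROWS && col < MAP_COLS);
    map_rows[row][col] = value;
}
```

The row lookup replaces the multiply, and the column index is then a plain indexed access off a pointer, which is presumably the kind of access Indirect-Y is meant for.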

One change to the spec: I'm not crazy about retaining any feature of C pointer syntax, so
ptr pointer_var = &var

might become

ptr pointer_var = [var]

Also, I forgot negation (~).

Reasons to use Noism:
-Scoping removes the need to juggle temporary variables
-Create a portable code base for NES and GB (others to be added later)
-Syntax maps closely to a handful of asm instructions

Right now the compiler does syntax error checking. Once I get the compiler functional, the plan is to release it with at least one demo that will run on both NES and GB. Looking forward to feedback; hopefully someone else will use it.
Attachments
noism_spec.txt
(17.52 KiB) Downloaded 131 times
Last edited by strat on Sat Nov 30, 2013 8:23 pm, edited 2 times in total.

Shiru
Posts: 1161
Joined: Sat Jan 23, 2010 11:41 pm

Post by Shiru » Thu Aug 02, 2012 11:25 am

I have a note on the whole idea. HLL means High Level Language, i.e. one very abstracted from the low level, the hardware. So there is a dilemma: either you aim for efficient resulting code, which brings some hardware limitations back to the abstraction level (like 1D arrays and the lack of 32-bit math) and so complicates use of the language, or you aim to simplify programming a lot by hiding all these details, which leads to not very efficient code.

Another thing: my personal opinion is that a new language whose syntax is far from a popular one (BASIC, C, Java) is doomed to be used only by its author and maybe a few other people. People can't apply their previous experience to it, they can't reuse what they learn from it elsewhere later, and it is difficult to get help with an unpopular new thing.

strat
Posts: 364
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

Post by strat » Thu Aug 02, 2012 12:11 pm

This is no doubt an experiment. But if nothing else, it's pretty fun to work on a really simple compiler. The first point I agree with: efficient code is the aim here. One way a multidimensional array might be simulated is to restrict the higher dimensions to a power of 2. Then again, that will result in a crazy amount of bit shifts. The compiler could also create and load the pointer table automatically; I don't like the idea of surprise code, so that might be enabled with a preprocessor option.
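
To make the power-of-2 idea concrete, here is a small C sketch (not Noism code). It assumes the row width is the dimension fixed to a power of 2; the 32x30 grid and the name grid_index are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical grid: the width is fixed at 2^5 = 32 so that "y * width"
   can be replaced by a left shift. */
enum { GRID_WIDTH_LOG2 = 5, GRID_WIDTH = 1 << GRID_WIDTH_LOG2, GRID_HEIGHT = 30 };

/* Flat index of cell (x, y) in a 1D array of GRID_WIDTH * GRID_HEIGHT bytes. */
uint16_t grid_index(uint8_t x, uint8_t y)
{
    assert(x < GRID_WIDTH && y < GRID_HEIGHT);
    /* x is always below GRID_WIDTH, so OR-ing it in equals adding it. */
    return (uint16_t)(((uint16_t)y << GRID_WIDTH_LOG2) | x);
}
```

The 6502 shifts only one bit per instruction, and the index here no longer fits in 8 bits, so each of those five shift steps has to be carried across two bytes. That is where the "crazy amount of bit shifts" comes from, and why the pointer table can be the cheaper option.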

I may just go ahead and include 32-bit math. But 24-bit vars are also a necessity.
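
For what it's worth, a 24-bit add is just the usual carry chain run over three bytes. A minimal C sketch, where the u24 struct and the helper names are made up for illustration and say nothing about how Noism would actually store such a type:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 24-bit value stored as three bytes, low byte first. */
typedef struct { uint8_t b[3]; } u24;

/* Build a u24 from the low 24 bits of a 32-bit value. */
u24 u24_from(uint32_t v)
{
    u24 r;
    assert(v <= 0xFFFFFFul);
    r.b[0] = (uint8_t)(v & 0xFF);
    r.b[1] = (uint8_t)((v >> 8) & 0xFF);
    r.b[2] = (uint8_t)((v >> 16) & 0xFF);
    return r;
}

/* Widen a u24 back to 32 bits for inspection. */
uint32_t u24_to_u32(u24 v)
{
    return (uint32_t)v.b[0] | ((uint32_t)v.b[1] << 8) | ((uint32_t)v.b[2] << 16);
}

/* Add byte by byte, passing the carry upward, much like a chain of
   add-with-carry instructions. Returns the carry out of the top byte. */
uint8_t u24_add(u24 *dst, u24 a, u24 b)
{
    uint16_t sum;
    uint8_t carry = 0;
    int i;
    for (i = 0; i < 3; ++i) {
        sum = (uint16_t)(a.b[i] + b.b[i] + carry);
        dst->b[i] = (uint8_t)(sum & 0xFF);
        carry = (uint8_t)(sum >> 8);
    }
    return carry;
}
```

Going from 24 to 32 bits only changes the loop count, which is why adding both widths is not much extra work on the code generation side.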

On the second point, I see where you're coming from, but I don't really agree. Most programming languages demand experienced programmers change their habits a bit. Lua had the gall to break the tradition of indexing arrays from zero, and those Blizzard guys love it. This language will never be popular anyway unless it gets new people into deving on old systems.

Also, please ignore the embarrassing self-contradiction in section XIV. of the spec.

Nioreh
Posts: 116
Joined: Sun Jan 22, 2012 11:46 am
Location: Stockholm, Sweden

Post by Nioreh » Thu Aug 02, 2012 12:13 pm

What's wrong with C style syntax?

Dwedit
Posts: 4333
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit » Thu Aug 02, 2012 12:27 pm

Nothing is really wrong with C-style syntax itself, but C itself has some really backwards features in it.

The need for forward declarations is the single most annoying part of C and C++. I write a function, then need to copy-paste the first line somewhere else just so I can call it in code that happens to be before the function. That is absolutely ridiculous.

Also, C doesn't have a good way for a function to return multiple values. You can return a struct, but that mainly leads to the compiler throwing it on the stack instead of returning it in several registers.

What else is wrong with C and C++? Assignments in if expressions. Infinite while loops because you accidentally put a semicolon before the open brace. The postincrement operator having undefined meaning when there is more than one use of that variable. The operator precedence is wrong: bitwise arithmetic has lower priority than expected. OR should be like addition, and AND and bit shifts should be like multiplication, but they are all low priority instead. Leading zeroes magically make your numbers octal. Tons of annoying legacy crap.
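
Two of these pitfalls are easy to show in a few lines of C. The function names below are made up for illustration. The first function spells out with explicit parentheses how C parses the unparenthesized test `flags & 1 == 0`, which avoids writing the bug itself:

```c
#include <assert.h>

/* How C parses "flags & 1 == 0": == binds tighter than &,
   so the test becomes flags & (1 == 0), i.e. flags & 0. */
int low_bit_clear_as_parsed(unsigned flags)
{
    return (int)(flags & (1 == 0));
}

/* What was meant: mask first, then compare. */
int low_bit_clear_intended(unsigned flags)
{
    return (flags & 1u) == 0;
}

/* A leading zero makes the literal octal: 010 is eight, not ten. */
int leading_zero_value(void)
{
    return 010;
}

/* Assertions documenting the surprising results. */
void pitfalls_selfcheck(void)
{
    assert(low_bit_clear_as_parsed(2u) == 0);  /* 2 is even, yet the test fails */
    assert(low_bit_clear_intended(2u) == 1);
    assert(leading_zero_value() == 8);
}
```
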
Last edited by Dwedit on Thu Aug 02, 2012 12:35 pm, edited 1 time in total.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!

strat
Posts: 364
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

Post by strat » Thu Aug 02, 2012 12:33 pm

Mostly because using pointers without '*' and '&' looks cleaner, and I'm trying to reduce usage of the shift-key.

Bregalad
Posts: 7890
Joined: Fri Nov 12, 2004 2:49 pm
Location: Chexbres, VD, Switzerland

Post by Bregalad » Thu Aug 02, 2012 12:42 pm

Dwedit wrote: The need for forward declarations is the single most annoying part of C and C++. I write a function, then need to copy-paste the first line somewhere else just so I can call it in code that happens to be before the function. That is absolutely ridiculous.
I agree with you on this one!
Dwedit wrote: Also, C doesn't have a good way for a function to return multiple values. You can return a struct, but that mainly leads to the compiler throwing it on the stack instead of returning it in several registers.
To return two values, return a long and bit pack both values in the result.
For more than two values, have a pointer in the argument list that points to where you'd like the function to write its results.
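
A minimal C sketch of the bit-packing trick, assuming two 16-bit results packed into one 32-bit return value. The divmod16 example and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Return quotient and remainder together: quotient in the high 16 bits,
   remainder in the low 16 bits. */
uint32_t divmod16(uint16_t n, uint16_t d)
{
    assert(d != 0);
    return ((uint32_t)(n / d) << 16) | (uint32_t)(n % d);
}

/* Extract the high half (the quotient in this example). */
uint16_t unpack_hi(uint32_t packed)
{
    return (uint16_t)(packed >> 16);
}

/* Extract the low half (the remainder in this example). */
uint16_t unpack_lo(uint32_t packed)
{
    return (uint16_t)(packed & 0xFFFFu);
}
```

The caller pays for the unpacking, and the trick only works while both values fit in half of the return type.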
Life is complex: it has both real and imaginary components.

Nioreh
Posts: 116
Joined: Sun Jan 22, 2012 11:46 am
Location: Stockholm, Sweden

Post by Nioreh » Thu Aug 02, 2012 12:45 pm

Yeah, things like that can be cumbersome in C. But if I created a new language like this, I would probably keep the syntax as close to C as I could. It will mean the language is easy to pick up for anyone who has done a little programming. Noism doesn't even use return values as far as I can see? Sort of like only allowing for void functions in a C-like world.

I do like this project, btw. Since I think assembly is very complicated to do large logic stuff in, this will probably make things easier.

I have been using CC65 a lot, and while it does what I want most of the time, there are some things that really make me want to try something else.

If noism will make it simpler than CC65 to handle bank switching, it can be really useful. I don't have much experience working with bank switching, but in CC65 you basically have to keep your code small enough to fit in one 16K bank, and use the other one for pure data.

If noism also generates more efficient code than CC65, that is also a great thing.

Keep up the good work.

Bregalad
Posts: 7890
Joined: Fri Nov 12, 2004 2:49 pm
Location: Chexbres, VD, Switzerland

Post by Bregalad » Thu Aug 02, 2012 1:01 pm

Nioreh wrote: I have been using CC65 a lot, and while it does what I want most of the time, there are some things that really make me want to try something else.
Could you tell us more?
I had this idea of porting sdcc for the 6502 not long ago, but it will be a large project and I'm not sure I can handle it.

tepples
Posts: 22019
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Thu Aug 02, 2012 1:18 pm

strat wrote: Some issues that might crop up (or, things from C that didn't make it):
-No 24-bit or 32-bit vars: These will be implemented when the compiler targets 16-bit CPUs. What do you call a 24-bit variable when 'int' is reserved for 32 bits and 'long' for 64 bits?
Most C platforms I know of with 16-bit int have 32-bit long. You could use the <stdint.h> names int16_t, uint16_t, int32_t, and uint32_t for variables that stay 16-bit or 32-bit regardless of platform. C doesn't define a 24-bit integer type, but int24_t and uint24_t would be least astonishing to programmers, and they can typedef it to something more convenient (like the s32, u32, s16, and u16 commonly seen in GBA code).
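
A short sketch of the typedef approach, assuming a C99 compiler that provides <stdint.h>. The check_widths function and the BITS macro are only there for illustration. Standard C has no 24-bit type, so int24_t and uint24_t are left out here; the compiler would have to supply them:

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Short aliases in the style commonly seen in GBA code. */
typedef int8_t   s8;
typedef uint8_t  u8;
typedef int16_t  s16;
typedef uint16_t u16;
typedef int32_t  s32;
typedef uint32_t u32;

/* Width of a type in bits. */
#define BITS(type) (sizeof(type) * CHAR_BIT)

/* These widths hold regardless of how wide int and long are on the target. */
void check_widths(void)
{
    assert(BITS(s8) == 8 && BITS(u8) == 8);
    assert(BITS(s16) == 16 && BITS(u16) == 16);
    assert(BITS(s32) == 32 && BITS(u32) == 32);
}
```
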
strat wrote: Reasons to use Noism:
[...]
-Create a portable code base for NES and GB (others to be added later)
Yay! No more DRY violations that are characteristic of ports to some platforms. And if you provide a standard C back end, the result becomes more easily portable to Windows, Mac OS X, desktop Linux, iOS, and Android.
Dwedit wrote: What else is wrong with C and C++? Assignments in if expressions. Infinite while loops because you accidentally put a semicolon before the open brace.
I believe several compilers have warnings against that. For example, one would get two warnings for code like this:

while (pointer = getNext()) ;
Both of which could be suppressed by making the intent clearer:

while ((pointer = getNext()) != NULL) { }
Dwedit wrote: The postincrement operator having undefined meaning when there is more than one use of that variable.
Then why not put your increments on another line?
Bregalad wrote:To return two values, return a long and bit pack both values in the result.
Not if the values you want to return won't fit in a long. For example, a pointer typically takes up a whole long (sizeof(intptr_t) >= sizeof(long)). Passing a pointer to a struct is far more common in code that I've read, even for two results.
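
A sketch of the pointer-to-struct pattern in C, for a case where the two results (a pointer and an offset) would not fit in one packed integer. The find_result type and find_byte function are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Two results that cannot be bit-packed into a single long. */
typedef struct {
    const unsigned char *where;  /* pointer to the first match */
    size_t index;                /* offset of that match in the buffer */
} find_result;

/* Search buf for wanted. On success, fill *out and return 1;
   otherwise leave *out untouched and return 0. */
int find_byte(const unsigned char *buf, size_t len,
              unsigned char wanted, find_result *out)
{
    size_t i;
    assert(buf != NULL && out != NULL);
    for (i = 0; i < len; ++i) {
        if (buf[i] == wanted) {
            out->where = &buf[i];
            out->index = i;
            return 1;
        }
    }
    return 0;
}
```

The plain return value stays free for a success flag, and the caller decides where the struct lives.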

thefox
Posts: 3141
Joined: Mon Jan 03, 2005 10:36 am
Location: Tampere, Finland

Post by thefox » Thu Aug 02, 2012 1:42 pm

Nioreh wrote: If noism will make it simpler than CC65 to handle bank switching, it can be really useful. I don't have much experience working with bank switching, but in CC65 you basically have to keep your code small enough to fit in one 16K bank, and use the other one for pure data.
It should be fine to have code in switchable banks in CC65, just make sure the library routines (like stack manipulation) are in the fixed bank. This can be achieved by naming the fixed bank/segment "CODE". Of course you have to manually make sure the correct functions are mapped in the non-fixed bank whenever calling them. :)

strat
Posts: 364
Joined: Mon Apr 07, 2008 6:08 pm
Location: Missouri

Post by strat » Thu Aug 02, 2012 9:34 pm

tepples wrote: Most C platforms I know of with 16-bit int have 32-bit long. You could use the <stdint.h> names int16_t, uint16_t, int32_t, and uint32_t for variables that stay 16-bit or 32-bit regardless of platform.
Thinking it over real quick, maybe it's best to adopt the GBA syntax as default: s8, ... s64. The 's' is supposed to stand for 'signed' but it might as well be 'storage', since CPUs afaik don't distinguish between signed and unsigned, only the 'printf' function.
tepples wrote: Yay! No more DRY violations that are characteristic of ports to some platforms. And if you provide a standard C back end, the result becomes more easily portable to Windows, Mac OS X, desktop Linux, iOS, and Android.
Hmmm... having read your XNA article, my best interpretation of this idea is that Noism compiles into C code. That's not really a bad idea. Then you'd have one code base that will create a real NES game and a retro-style game for high-end systems. Too bad we didn't have this discussion while Mega Man 9 was being developed.

tepples
Posts: 22019
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples » Thu Aug 02, 2012 9:53 pm

strat wrote: Thinking it over real quick, maybe it's best to adopt the GBA [integer type names] as default: s8, ... s64. The 's' is supposed to stand for 'signed' but it might as well be 'storage', since CPUs afaik don't distinguish between signed and unsigned, only the 'printf' function.
An 8*8=16 bit multiply, or 16*16=32, or 32*32=64 sure does. So do the operators /, <, and >.
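
A small C sketch of that point, with made-up function names. The byte 0xFF is 255 when read as unsigned and -1 when read as signed, and both the widening multiply and the comparison give different answers depending on which reading is used:

```c
#include <assert.h>
#include <stdint.h>

/* 8x8 -> 16 widening multiply on unsigned operands. */
uint16_t mul8_unsigned(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * b);
}

/* 8x8 -> 16 widening multiply on signed operands. */
int16_t mul8_signed(int8_t a, int8_t b)
{
    return (int16_t)((int16_t)a * b);
}

/* The < operator on unsigned and on signed bytes. */
int less_unsigned(uint8_t a, uint8_t b) { return a < b; }
int less_signed(int8_t a, int8_t b)     { return a < b; }

/* Same low byte 0xFE in both products, but a different high byte. */
void signedness_demo(void)
{
    assert(mul8_unsigned(0xFF, 2) == 0x01FE);        /* 255 * 2 = 510 */
    assert((uint16_t)mul8_signed(-1, 2) == 0xFFFE);  /* -1 * 2 = -2   */
    assert(less_unsigned(0xFF, 1) == 0);             /* 255 < 1 is false */
    assert(less_signed(-1, 1) == 1);                 /* -1 < 1 is true   */
}
```

Addition, subtraction, and a same-width multiply give the same bits either way, which is probably where the "CPUs don't distinguish" impression comes from.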

qbradq
Posts: 951
Joined: Wed Oct 15, 2008 11:50 am

Post by qbradq » Mon Oct 15, 2012 6:43 am

I'm glad to see my previous work here is still being read :D I'll be interested to see what you come up with! I did get mine producing assembly code, but did not pursue it much beyond that.

What I discovered in my toying around (and never really reported back on) was that the HLL I had designed simply could not produce machine code that was as efficient as (or even close to) the code I would write by hand. This was due to the fact that the HLL and the machine were engineered for different patterns.

So, if your goal is to produce machine code that is as efficient or very close to the assembly you would write by hand, you need to identify the patterns you are using while writing assembly and then base the requirements of the HLL on those patterns.

If you want a common code base for multiple platforms then your best bet is to use a small set of basic patterns to base your HLL on, then translate those into machine instructions for the target platform that may not necessarily be very efficient.

I think trying to achieve both is not terribly productive on these early microprocessor architectures. These things (the 65xx and Z80 series MCs) were specifically engineered to be programmed in their machine language. Other architectures (like the Intel 80 series and later Motorola 68K series) were designed with HLLs in mind, and efficiently implement some of these HLL patterns in hardware.

3gengames
Formerly 65024U
Posts: 2276
Joined: Sat Mar 27, 2010 12:57 pm

Post by 3gengames » Mon Oct 15, 2012 7:19 am

Bregalad wrote:
Dwedit wrote: The need for forward declarations is the single most annoying part of C and C++. I write a function, then need to copy-paste the first line somewhere else just so I can call it in code that happens to be before the function. That is absolutely ridiculous.
I agree with you on this one!
Me too! I hate that crap, it's so unneeded. I also hate how there's no real way to include data for the binary to use at runtime, you have to load it from a file and stick it in an array or something. I hate how modern languages work in general honestly. C isn't too bad, but still, could be much better.
