How careful are you about code size?

bleubleu
Posts: 108
Joined: Wed Apr 04, 2018 7:29 pm
Location: Montreal, Canada

Post by bleubleu »

Hi.

The NES is obviously a very limited system, but the one resource I pay the least attention to is code size. I think I will reach 6K of code soon, and I basically just have a guy running around in a room.

Do you guys care at all about code size, or do you just add extra pages of ROM whenever you run out?

What do you allow yourself to use macros for (16-bit math, etc.)? Do you often unroll loops? Any best practices I should be aware of?

Thanks.

-Mat
Dwedit
Posts: 4922
Joined: Fri Nov 19, 2004 7:35 pm

Post by Dwedit »

When I did homebrew for the TI-83, code size was paramount, especially when you had 25K of user RAM for all your programs.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

When to add pages

With very few exceptions, sizes of PRG ROM and CHR ROM on NES are powers of two. When you add more pages of ROM, you always double it, and if you double it too many times, you could exceed how much you had planned to pay per cartridge for replication. A few of these thresholds are especially painful.

For PRG ROM size:
  • 32K to more than 32K, as you now have to include PRG bank switching hardware on the PCB and decide what goes in the fixed bank and what goes elsewhere.
  • 64K to more than 64K, as you're now ineligible for the main category of the NESdev Compo (and thus exposure on the Action 53 anthologies to build an audience for your future solo cartridge releases).
  • 256K to more than 256K, as many popular discrete mappers top out at that much, as does the MMC1 if CHR ROM is used.
  • 512K to more than 512K, as you hit the limit of the PowerPak and many ASIC mappers.
For sizes of other things:
  • 8K CHR ROM to more than 8K, as you now have to include CHR bank switching hardware on the PCB and figure out what will be displayed alongside what.
  • Roughly 6K of DPCM samples to bigger than 6K, as you're now putting serious pressure on your fixed bank.
  • 16K of enemy movement code to more than 16K, as you now have to split that across several banks and give each enemy type not only a movement routine entry point but also a movement routine bank.
  • 16K of music code and data to more than 16K, as you now have to move a lot of music code into the fixed bank so that it can access sequence data in multiple banks.
  • 2K of RAM to more than 2K, as you need to include WRAM and decoding hardware on the PCB.
  • 32 bits of state preserved from one play session of your campaign to the next to more than 32, as you need to switch from an 8-character password to battery-backed RAM or self-flashability in order to save players' sanity.
  • 8K of WRAM to more than 8K, as only a few well-known mappers are known to support that: MMC1 with CHR RAM, MMC5, and FME-7.
Super Mario Bros. only just fits in 32K of PRG ROM. As ShaneM discovered, there are a few parts of the program where tricky code-golf optimizations were made, and a few parts that were left unoptimized. Nintendo's engineers optimized the code for size just enough for it to come in under 32K.

Time-space tradeoffs

Moderate unrolling (factors of 4 to 16 or so) is also beneficial in time-critical code, such as video memory update routines. If you're trying to fit into 16K PRG ROM for NROM-128, you probably don't have that much stuff to push to video memory each frame, so you can get away with a less unrolled update loop than something that pushes the limits of video memory bandwidth the way Battletoads does.

Subroutine call overhead on 6502 is 12 cycles: 6 for JSR and 6 for RTS. If a routine is called only a few times per 29780-cycle frame, this overhead may not amount to much.
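
As a rough illustration of that tradeoff, here is a minimal sketch of a 4x unrolled copy to the PPU data port, assuming ca65 syntax. The buffer name vram_buf, the 32-byte length, and the routine name are hypothetical, and the sketch assumes the PPU address has already been set and that it runs during vblank.

```asm
PPUDATA = $2007

.bss
vram_buf: .res 32           ; hypothetical 32-byte update buffer

.code
; Copies 32 bytes from vram_buf to PPUDATA, 4 bytes per loop pass.
.proc copy_32_unrolled
        ldx #0
loop:
        .repeat 4, I        ; the assembler pastes the body 4 times
        lda vram_buf + I, x
        sta PPUDATA
        .endrepeat
        inx
        inx
        inx
        inx
        cpx #32
        bne loop
        rts
.endproc
```

Ignoring page-crossing penalties, a one-byte-per-pass loop (lda, sta, inx, cpx, bne) costs 15 cycles per byte, while this version costs 45 cycles per 4 bytes, or a little over 11 per byte, at the price of a longer loop body.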
Kasumi
Posts: 1293
Joined: Wed Apr 02, 2008 2:09 pm

Post by Kasumi »

It's also the resource I pay the least attention to. Making my code smaller affects my players' perception of the game less than making my code faster (dropped frames vs... an extra second of download on 56K?). (Before the pedants chime in: I deal in ROMs, so cartridge costs aren't a factor.)

I do turn a lot of jumps into branches to save bytes (which can only tie or make the code slower). There's almost always a flag you can branch on instead of a jmp. I do end up caring a little bit about data size, but I just find that sort of compression fun.
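
A minimal sketch of the jump-to-branch trick, assuming ca65 syntax; the variable names facing and speed and the routine name are made up for illustration.

```asm
.zeropage
facing: .res 1              ; hypothetical: 0 = right, nonzero = left
speed:  .res 1

.code
.proc set_speed
        lda facing
        bne left
        lda #2              ; A = 2, so the Z flag is clear
        sta speed           ; STA does not change any flags
        bne done            ; always taken: 2 bytes instead of 3 for "jmp done"
left:
        lda #$FE
        sta speed
done:
        rts
.endproc
```

Both a taken branch and jmp take 3 cycles, and the branch takes 4 if it crosses a page, which is why it can only tie or lose on speed.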

I use macros to unroll loops, and in place of subroutines that are used often (to avoid the jsr/rts speed hit). I don't use 'em much for 16-bit math, since that usually makes optimizations harder to see. (Say you have a 16-bit add macro with a clc baked in, but you know the carry will already be clear in some places where you use it. Or you know it will be set, and could just adjust the constant. Or whatever.)

If you posted some code, it'd be easier to give tips based on what you're actually doing. My most general tip is that the carry flag is super useful for all kinds of optimizations. Relatively few instructions change it, so you can rely on it not changing (and branch based on a value it has had) for a while. You can also set things up so that you branch on carry set to a subtraction (so no need to sec), while on carry clear you fall through to an addition (so no need to clc). It costs two cycles to set or clear and, again, it rarely changes. So you can use it as a return value for a subroutine, and do a lot of other stuff before you actually use the value.
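
A small sketch of both carry ideas, assuming ca65 syntax. The names check_y, nudge_object, obj_y, and scratch are hypothetical, as are the constants.

```asm
.zeropage
obj_y:   .res 1
scratch: .res 1

.code
; Hypothetical helper. In: A = Y coordinate.
; Out: carry clear if A < 240, carry set otherwise. A is preserved.
.proc check_y
        cmp #240            ; CMP sets carry when A >= 240
        rts                 ; RTS leaves the flags untouched
.endproc

.proc nudge_object
        lda obj_y
        jsr check_y
        ldx #0              ; LDX and STX do not change carry,
        stx scratch         ; so the "return value" survives this work
        bcs pull_back
        adc #4              ; carry is known clear here: no CLC needed
        sta obj_y
        rts
pull_back:
        sbc #16             ; carry is known set here: no SEC needed
        sta obj_y
        rts
.endproc
```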

You can also check out this thread: http://atariage.com/forums/topic/71120- ... ler-hacks/
And this one: http://www.atariage.com/forums/topic/11 ... -by-seven/
And this wiki article: https://wiki.nesdev.com/w/index.php/Syn ... structions (Reverse subtract is nice.)
As well as here: http://codebase64.org/doku.php?id=base:6502_6510_maths
For some cute examples of code to get your brain turning.

tl;dr: Worry the most about getting the game done, honestly. Use whatever resources are at your disposal for that, because none of the rest matters if no one will ever play it.
kuja killer
Posts: 130
Joined: Mon May 25, 2009 2:20 pm

Post by kuja killer »

Me personally, I always try to use the least amount of code possible for anything in Megaman Odyssey. I like to say that I'm the master of "optimizing routines", because I've been doing it for so many years.
The game rarely has any lag frames during gameplay, and I've tested certain areas and situations with FCEUX movie recordings and the lag counter countless times.
The only exception is the Pyro Man level, because the SNES/Genesis-style water IRQ is called about 30 times a frame.

But I'm always obsessed with finding or creating as many little shortcuts as possible, even though I know it won't ever matter to anyone in the world.

My game's size is currently 512 KB of graphics and 512 KB of code. I have more than 50% free space on graphics, and probably about 3 or 4 "2000 byte banks" still available in code space before I must upgrade to 1 MB.

I use the MMC5 mapper, so the limit is 1 MB on both. never had to go past 512 kb yet.
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

I normally consider PRG-ROM to be a cheap resource, so I'll often resort to using unrolled loops and look-up tables if that results in performance improvements. However, that doesn't mean I don't care about code size, because even though 512KB of PRG-ROM (a "common" limit for programs that are not meant to be part of compilations) is a lot to fill, the NES can still only see 32KB at a time, and I really don't want to be switching banks back and forth during critical parts of my game engine.

Like other coders here, I do a lot of small optimizations, turning jumps into branches, removing redundant SECs and CLCs, and so on. I always try to draw attention to these optimizations using comments, so that I know I must be careful when editing code around them. Another thing I do is try to use as few variables as possible in any given logic block, to avoid excessive loading and storing. I also spend some time optimizing loops, avoiding special handling of the first or last iterations (which could require repeated code) and finding the optimal ending conditions (e.g. counting down instead of up to avoid a comparison), even if that means tweaking the inner logic a bit.
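
The counting-down idea can be sketched like this, assuming ca65 syntax; buffer and its 16-byte size are placeholders.

```asm
.bss
buffer: .res 16             ; hypothetical 16-byte buffer

.code
.proc clear_buffer
        lda #0
        ldx #15             ; start at the last index and count down
loop:
        sta buffer, x
        dex
        bpl loop            ; DEX sets N when X wraps from 0 to $FF, so no CPX is needed
        rts
.endproc
```

Because the loop ends on the N flag, this form only works when the starting index is $7F or less; larger counts need a different ending condition.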
NovaSquirrel
Posts: 483
Joined: Fri Feb 27, 2009 2:35 pm
Location: Fort Wayne, Indiana

Post by NovaSquirrel »

tokumaru wrote:I really don't want to be switching banks back and forth during critical parts of my game engine.
This is really the main driver behind trying to keep code size down for me. If I can keep all of the code and data that's relevant to each other inside the same bank, that means that everything is a lot simpler. MMC1 also makes you want to avoid bank switching when you can because of how many cycles it ends up using.
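
For a sense of that cost, here is a rough sketch of an MMC1 PRG bank switch, assuming ca65 syntax. The routine name is hypothetical, and the sketch assumes nothing else writes to the mapper partway through the sequence.

```asm
MMC1_PRG = $E000            ; MMC1 PRG bank register ($E000-$FFFF)

.code
; In: A = PRG bank number. The MMC1 takes one bit per write,
; least significant bit first, and commits on the fifth write.
.proc mmc1_set_prg_bank
        sta MMC1_PRG
        lsr a
        sta MMC1_PRG
        lsr a
        sta MMC1_PRG
        lsr a
        sta MMC1_PRG
        lsr a
        sta MMC1_PRG
        rts
.endproc
```

That is five stores and four shifts, 28 cycles before the 12 cycles of jsr/rts, where a discrete mapper such as UNROM needs a single store.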

On top of doing small optimizations, I tend to use subroutines quite heavily. It doesn't matter if the code is very specific - if there's a block of code I use multiple times and cycles aren't very important, I make it a subroutine.

Another thing I do with subroutines a lot is to give them a "default" input, that can be overridden. For example, I have a "Display Enemy" which just takes a starting tile number, which is the most common case. "Display Enemy" just prepares an input for and drops into executing "Display Enemy Custom" which takes four arbitrary tile numbers, and I only have to store one copy of the pretty lengthy subroutine. Super Mario Bros does something similar, giving some routines multiple places to enter them that feed something else with different data depending on which entry point you used.
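
A sketch of how such a default entry point might fall through into the general routine, assuming ca65 syntax. The names display_enemy, display_enemy_custom, and tiles are stand-ins, and the drawing code itself is omitted.

```asm
.zeropage
tiles: .res 4               ; hypothetical: the four tile numbers to draw

.code
; Common case. In: A = first tile number; uses four consecutive tiles.
.proc display_enemy
        tax
        stx tiles+0
        inx
        stx tiles+1
        inx
        stx tiles+2
        inx
        stx tiles+3
        ; no RTS or JMP here: execution falls through into the routine below
.endproc

; General case. In: tiles+0..tiles+3 already filled in by the caller.
.proc display_enemy_custom
        ; ... the lengthy drawing code would go here, reading tiles+0..tiles+3 ...
        rts
.endproc
```

The fall-through only works while the two routines stay adjacent in the same segment, so it is worth a comment at the seam.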
Bregalad
Posts: 8055
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Post by Bregalad »

Personally I am very careful about code size, and I know I disagreed with tokumaru on that quite a few times :)
Optimizing for speed only makes sense if you come close to the limit where you use 100% of the CPU, while optimizing for size makes sense whenever you are close to a limit of 2^n of PRG-ROM and don't want to go up to the next 2^n size.

Limiting code size also makes the most sense if you want to stay within 32KB and avoid bankswitching altogether. (Yes, there are ways to get more without bankswitching, but they're non-canonical.) As soon as you implement any bankswitching, I don't think it makes much difference whether you use, say, 64KB or 128KB, but each mapper has its canonical limits.
Garth
Posts: 246
Joined: Wed Nov 30, 2016 4:45 pm
Location: Southern California

Post by Garth »

bleubleu wrote:What do you allow yourself to use macros for (16bit math, etc.) ?
In most cases, macros should assemble exactly the same thing you would have written out by hand, adding no extra length of executable code nor execution time, only hiding the ugly internal details so you don't have to look at them every time you use them. Since they can make your code more concise and readable, you might even find optimization possibilities that weren't otherwise obvious. I also get fewer bugs when I make heavy use of macros, and the ones I do get tend to be easier to find and fix. Call me the macro junkie. Maybe that should have been my forum name.

Kasumi has a valid point about a few situations like whether or not a CLC is needed before a 16-bit add in a macro; but you can also have for example two different macros that do the same thing except that one has the leading CLC and the other does not, and give them the same names except that the one less used might have a trailing _ or something like that. That's not very common though, and of course just because you have a macro there doesn't mean you can't still do it the non-macro way if you want to.
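
A sketch of such a macro pair, written here in ca65 syntax as an assumption (it is not the assembler used for the structure macros below); the names add16 and add16_ are illustrative only.

```asm
; dest = dest + src, for callers that know carry is already clear.
.macro add16_ dest, src
        lda dest
        adc src
        sta dest
        lda dest+1
        adc src+1
        sta dest+1
.endmacro

; Same addition with the leading CLC included: the usual, safe form.
.macro add16 dest, src
        clc
        add16_ dest, src
.endmacro
```

A call such as "add16 player_x, speed" (hypothetical variables) assembles to the same instructions as the hand-written sequence, and the trailing-underscore variant simply drops the first one.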

I do 6502 nestable program flow control structures in macros too. See http://wilsonminesco.com/StructureMacros/ . One of the simplest examples might be

Code: Select all

        CMP  #14
        IF_EQ            ; clear enough that it really needs no comments
           <actions>
           <actions>
           <actions>
        END_IF
and the IF_EQ assembles a BNE down to the END_IF, exactly as you would write by hand, but it doesn't need the label. The END_IF is only used by the assembler, and it does not lay down any code. Again, it can be nested too, meaning a second IF_EQ...END_IF pair could be inside the first one, and another one inside of that one, etc., and the assembler will make each branch go to the right place. There are lots of different forms of these, even in the IFs group, like IF_BIT ACIA_STAT_REG, 3, IS_SET, and of course other structures besides IFs, like BEGIN...WHILE...REPEAT, FOR...NEXT (including a 16-bit one), CASE, etc.
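
For comparison, the hand-written equivalent might look like this in ca65 syntax; the label skip and the variable counter are made up, and the inc stands in for the actions.

```asm
.zeropage
counter: .res 1

.code
        cmp #14
        bne skip            ; IF_EQ: branch past the body when A is not 14
        inc counter         ; stands in for <actions>
skip:                       ; END_IF: emits no code, only marks the branch target
```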
http://WilsonMinesCo.com/ lots of 6502 resources
Oziphantom
Posts: 1565
Joined: Tue Feb 07, 2017 2:03 am

Post by Oziphantom »

Code size is something to be careful about; it does sneak up on you.
At the start I just write code, and I don't care about how big or slow it is. I just get it to do "the thing". Once I see the thing and I play with it, I can get it to the point where I'm convinced that it will "stay". Then I will make it "sensible". Anything that can be looped, tabled, etc. gets looped and tabled. I then save size/speed optimisations for when they become critical and I can see how it has to work in the mostly complete code.

As for "this clc is not needed", "this jump can be a bne", etc., I leave it to the tass optimizer to find those. It will find all of them in a second; it does love to show off.

In order to avoid the 3 weeks of optimisation at the end of the project, which is risky as you don't have time to test everything, I use BDD6502, and I take a week or so here and there to do an optimisation and code-cleaning pass, to break it up a bit.

Macros are nice, but they are mermaids: they sing a sweet song and send you to your doom if you are not careful. If you make them "safe" it's mostly OK, but you really have to plan them properly and understand how they work. I did have a lot of macros, but I found they tend to make the code less readable and maintainable after a while: ADCB_W, ADCBX_W, IFBLT, IFBLTE, BAGTE, etc. So I've evolved it into a syntax-sugar + optimizer system, whose output then goes through the tass optimizer as well. After using it every day for the last 4 months, it's to the point where I can't even be bothered to write a small test case the old way. It still has a lot of things it doesn't do that it needs to do, though. Sigh...
kuja killer
Posts: 130
Joined: Mon May 25, 2009 2:20 pm

Post by kuja killer »

just wanted to show this if i may, sometimes i'll have insanely long sections of nothing but JSR's only ..to do many different things in terms of enemy/boss movement patterns, like this :P

Image

without all these JSR's, it would have been like 20x longer.
fun discussion topic :)
psycopathicteen
Posts: 3140
Joined: Wed May 19, 2010 6:12 pm

Post by psycopathicteen »

How much bank granularity is there with existing mappers? If I start with a 32kB bank, things will go smoothly, but if I pass that limit and have to rely on bank switching to get around it, trying to fit existing code in separate 16kB banks would be really frustrating.
tepples
Posts: 22705
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Post by tepples »

The 3 most common layouts:

32K (e.g. GNROM, AOROM, BNROM, Color Dreams, MMC1)
$8000-$FFFF: 32K switchable window

16K (e.g. UNROM, MMC1)
$8000-$BFFF: 16K switchable window
$C000-$FFFF: fixed to last 16K

2x8K (e.g. MMC3)
$8000-$9FFF: 8K switchable window
$A000-$BFFF: 8K switchable window
$C000-$FFFF: fixed to last 16K

And specialized layouts:

2x8K reconfigured for DPCM switching (e.g. MMC3, VRC4)
$8000-$9FFF: fixed to second-to-last 8K
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K

16K+8K (e.g. VRC6)
$8000-$BFFF: 16K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K

3x8K (e.g. FME-7, RAMBO-1)
$8000-$9FFF: 8K switchable window
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K

One reason for having a fixed bank is that on NES, the program, data, and DPCM sample bank are all $00 (in 65816 terms). So the mapper needs to subdivide the space so that a subroutine in one bank can access data in another bank.
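
As an illustration of the 16K layout, a bank switch routine for a UNROM-style board might look like the sketch below, assuming ca65 syntax and a seven-bank switchable area. The names banktable, current_bank, and switch_bank_y are placeholders, and the code is assumed to be linked into the fixed bank at $C000-$FFFF so that it stays visible while the lower window changes.

```asm
.zeropage
current_bank: .res 1        ; hypothetical: remembers which bank is mapped in

.code                       ; assumed to be placed in the fixed bank
banktable:
        .byte 0, 1, 2, 3, 4, 5, 6

; In: Y = number of the bank to map at $8000-$BFFF.
.proc switch_bank_y
        sty current_bank
        lda banktable, y    ; read the bank number out of ROM...
        sta banktable, y    ; ...and write it back to the same address, so the
        rts                 ; value on the bus matches what the ROM outputs
.endproc
```

Writing over a ROM byte that already holds the same value is the usual way to sidestep bus conflicts on discrete-logic mappers.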
Garth
Posts: 246
Joined: Wed Nov 30, 2016 4:45 pm
Location: Southern California

Post by Garth »

Oziphantom wrote:Macros are nice, but they are mermaids: they sing a sweet song and send you to your doom if you are not careful. If you make them "safe" it's mostly OK, but you really have to plan them properly and understand how they work. I did have a lot of macros, but I found they tend to make the code less readable and maintainable after a while: ADCB_W, ADCBX_W, IFBLT, IFBLTE, BAGTE, etc.
Take a different approach. Instead of using cryptic names, make it really clear what they're doing, and use the parameters to make like a sentence. If your ADCB_W means "Do a double-precision (16-bit) add-with-carry of B and W," you could change the macro name to something like _16bit_ADC, and make the line say for example,

Code: Select all

        _16bit_ADC   B, _and, W    ; B=B+W
(Unfortunately the assembler requires separating parameters with a comma, which is why there's a comma after the _and.) The "_and" (with the underscore or other character to keep the assembler from confusing it with the mnemonic) is an equate that does not actually get used by the macro. It's only there to make things more readable to humans. The comment clarifies where the answer goes. So this would assemble the same as

Code: Select all

        CLC
        LDA   B
        ADC   W
        STA   B
        LDA   B+1
        ADC   W+1
        STA   B+1
The same macro can be used to add different variables which you specify in the parameters, rather than being confined to B and W. Conditional assembly in the macro definition can do optimizations if necessary. Some assemblers let you say in essence, "If there's a fourth parameter, do the following;" so you could use the same macro to add more than just two numbers, and you could invoke it something like this:

Code: Select all

        _16bit_ADC   B, W, _and, offset3    ; B=B+W+offset3
If your IFBLT means "if: branch if less than," and only assembles a BMI, it's not really clarifying or shortening anything. How about something like this instead, where a portion is skipped if the N flag is set:

Code: Select all

        IF_POSITIVE    ; Negative result above causes it to skip the following lines.
           <do_stuff>
           <do_stuff>
           <do_stuff>
        END_IF
or to branch back to the beginning of a loop as long as the result is negative:

Code: Select all

        BEGIN           ; (Or name it "DO" if you like)
           <do_stuff>
           <do_stuff>
           <do_stuff>
        UNTIL_POSITIVE
Then you don't even need a label (although you can still use one if you want to).
bleubleu
Posts: 108
Joined: Wed Apr 04, 2018 7:29 pm
Location: Montreal, Canada

Post by bleubleu »

Thanks for all the answers. The IF_XXX macros are very interesting. I also find beq/bmi/bpl are often hard to follow.

I should have realized that such an open question would send the discussion in many directions. I'll try to be more specific.

Where do you draw the line between inlining something (macros) and making it a subroutine? For example, what about a 16-bit addition (which is ~20 cycles or so)? Do you accept paying 12 cycles (jsr/rts) for it just for the sake of reducing code size, or do you use a macro? Where is the cutoff?

-Mat