How careful are you about code size?
Moderator: Moderators
How careful are you about code size?
Hi.
The NES is obviously a very limited system, but the one resource i pay the least attention to is the code size. I think i am reaching 6K of code soon and I basically just have a guy running around in a room.
Do you guys care at all about code size, do you you just add extra pages of ROM whenever you run out?
What do you allow yourself to use macros for (16bit math, etc.) ? Do you often unroll loops? Any best practices I should be aware of?
Thanks.
-Mat
The NES is obviously a very limited system, but the one resource i pay the least attention to is the code size. I think i am reaching 6K of code soon and I basically just have a guy running around in a room.
Do you guys care at all about code size, do you you just add extra pages of ROM whenever you run out?
What do you allow yourself to use macros for (16bit math, etc.) ? Do you often unroll loops? Any best practices I should be aware of?
Thanks.
-Mat
Re: How careful are you about code size?
When I did homebrew for the TI83, code size was paramount and extremely important. Especially when you had 25K of user ram for all your programs.
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!
Re: How careful are you about code size?
When to add pages
With very few exceptions, sizes of PRG ROM and CHR ROM on NES are powers of two. When you add more pages of ROM, you always double it, and if you double it too many times, you could exceed how much you had planned to pay per cartridge for replication. A few of these thresholds are especially painful.
For PRG ROM size:
Time-space tradeoffs
Moderate unrolling (factors of 4 to 16 or so) is also beneficial in time-critical code, such as video memory update routines. If you're trying to fit into 16K PRG ROM for NROM-128, you probably don't have that much stuff to push to video memory each frame, so you can get away with a less unrolled update loop than something that pushes the limits of video memory bandwidth the way Battletoads does.
Subroutine call overhead on 6502 is 12 cycles: 6 for JSR and 6 for RTS. If a routine is called only a few times per 29780-cycle frame, this overhead may not amount to much.
With very few exceptions, sizes of PRG ROM and CHR ROM on NES are powers of two. When you add more pages of ROM, you always double it, and if you double it too many times, you could exceed how much you had planned to pay per cartridge for replication. A few of these thresholds are especially painful.
For PRG ROM size:
- 32K to more than 32K, as you now have to include PRG bank switching hardware on the PCB and decide what goes in the fixed bank and what goes elsewhere.
- 64K to more than 64K, as you're now ineligible for the main category of the NESdev Compo (and thus exposure on the Action 53 anthologies to build an audience for your future solo cartridge releases).
- 256K to more than 256K, as many popular discrete mappers top out at that much, as does the MMC1 if CHR ROM is used.
- 512K to more than 512K, as you hit the limit of the PowerPak and many ASIC mappers.
- 8K CHR ROM to more than 8K, as you now have to include CHR bank switching hardware on the PCB and figure out what will be displayed alongside what.
- Roughly 6K of DPCM samples to bigger than 6K, as you're now putting serious pressure on your fixed bank.
- 16K of enemy movement code to more than 16K, as you now have to split that across several banks and give each enemy type not only a movement routine entry point but also a movement routine bank.
- 16K of music code and data to more than 16K, as you now have to move a lot of music code into the fixed bank so that it can access sequence data in multiple banks.
- 2K of RAM to more than 2K, as you need to include WRAM and decoding hardware on the PCB.
- 32 bits of state preserved from one play session of your campaign to the next to more than 32, as you need to switch from an 8-character password to battery-backed RAM or self-flashability in order to save players' sanity.
- 8K of WRAM to more than 8K, as only a few well-known mappers are known to support that: MMC1 with CHR RAM, MMC5, and FME-7.
Time-space tradeoffs
Moderate unrolling (factors of 4 to 16 or so) is also beneficial in time-critical code, such as video memory update routines. If you're trying to fit into 16K PRG ROM for NROM-128, you probably don't have that much stuff to push to video memory each frame, so you can get away with a less unrolled update loop than something that pushes the limits of video memory bandwidth the way Battletoads does.
Subroutine call overhead on 6502 is 12 cycles: 6 for JSR and 6 for RTS. If a routine is called only a few times per 29780-cycle frame, this overhead may not amount to much.
Re: How careful are you about code size?
It's also the resource I pay the least attention too. Making my code smaller affects my players' perception of the game less than making my code faster (dropped frames vs... an extra second of download on 56K?). (Before the pedantic, I deal in ROMs, making cartridge costs aren't a factor.)
I do turn a lot of jumps into branches (which can only tie or make code slower) for bytes. (There's usually always a flag you can branch on instead of a jmp.) I do end up caring a little bit about data size, but I just find that sort of compression fun.
I use macros to unroll loops, and in place of subroutines that are used often (to avoid the jsr/rts speed hit.) I don't use 'em too much for 16bit math since usually it makes optimizations harder to see. (If you have a 16bit add macro with a clc baked in, but you know the carry will be clear in some places where you place it. Or you know it will be set, and can just fix the constant. Or whatever.)
If you posted some code, it'd be easier to give tips based on what you're actually doing. My most general tip is the carry flag is super useful for all kinds of optimizations. Relatively few instructions change it so you can rely on it not changing (and branch based on a value it has had) for a while. As well you can set things up so you branch on carry set to a subtraction (so no need to sec) and on carry clear it wouldn't branch and would add (so no need to clc.) It's two cycles to set or clear and again... doesn't really change. So you can use it as a return value for a subroutine, and do a lot of other stuff before you actually use the value.
You can also check out this thread: http://atariage.com/forums/topic/71120- ... ler-hacks/
And this one: http://www.atariage.com/forums/topic/11 ... -by-seven/
And this wiki article: https://wiki.nesdev.com/w/index.php/Syn ... structions (Reverse subtract is nice.)
As well as here: http://codebase64.org/doku.php?id=base:6502_6510_maths
For some cute examples of code to get your brain turning.
tl;dr: Worry the most about getting the game done, honestly. Use whatever resources are at your disposal for that, because none of the rest matters if no one will ever play it.
I do turn a lot of jumps into branches (which can only tie or make code slower) for bytes. (There's usually always a flag you can branch on instead of a jmp.) I do end up caring a little bit about data size, but I just find that sort of compression fun.
I use macros to unroll loops, and in place of subroutines that are used often (to avoid the jsr/rts speed hit.) I don't use 'em too much for 16bit math since usually it makes optimizations harder to see. (If you have a 16bit add macro with a clc baked in, but you know the carry will be clear in some places where you place it. Or you know it will be set, and can just fix the constant. Or whatever.)
If you posted some code, it'd be easier to give tips based on what you're actually doing. My most general tip is the carry flag is super useful for all kinds of optimizations. Relatively few instructions change it so you can rely on it not changing (and branch based on a value it has had) for a while. As well you can set things up so you branch on carry set to a subtraction (so no need to sec) and on carry clear it wouldn't branch and would add (so no need to clc.) It's two cycles to set or clear and again... doesn't really change. So you can use it as a return value for a subroutine, and do a lot of other stuff before you actually use the value.
You can also check out this thread: http://atariage.com/forums/topic/71120- ... ler-hacks/
And this one: http://www.atariage.com/forums/topic/11 ... -by-seven/
And this wiki article: https://wiki.nesdev.com/w/index.php/Syn ... structions (Reverse subtract is nice.)
As well as here: http://codebase64.org/doku.php?id=base:6502_6510_maths
For some cute examples of code to get your brain turning.
tl;dr: Worry the most about getting the game done, honestly. Use whatever resources are at your disposal for that, because none of the rest matters if no one will ever play it.
-
- Posts: 130
- Joined: Mon May 25, 2009 2:20 pm
Re: How careful are you about code size?
me personally, i always try to use the "least" amount of code as possible for anything on megaman odyssey. I like to say that im the master of "optimzing routines" - cause i've been doing it for so many years.
The game rarely ever has any lag frames "during" gameplay, and i've tested certain areas and situations with fceux movie-recordings and the lag counter, countless times.
except for Pyro Man level only, cause the SNES-Genesis style water IRQ is called like 30 times a frame
But im always so obsessed with finding or creating as many little shortcuts as possible even though i know it wont ever matter to a person in the world.
My game's size is currently 512 KB graphics, and 512 KB coding ...i have more than 50% free space on graphics, and probably about 3 or 4 "2000 byte banks" still avaiable on coding space, before i must upgrade to 1 MB
I use the MMC5 mapper, so the limit is 1 MB on both. never had to go past 512 kb yet.
The game rarely ever has any lag frames "during" gameplay, and i've tested certain areas and situations with fceux movie-recordings and the lag counter, countless times.
except for Pyro Man level only, cause the SNES-Genesis style water IRQ is called like 30 times a frame
But im always so obsessed with finding or creating as many little shortcuts as possible even though i know it wont ever matter to a person in the world.
My game's size is currently 512 KB graphics, and 512 KB coding ...i have more than 50% free space on graphics, and probably about 3 or 4 "2000 byte banks" still avaiable on coding space, before i must upgrade to 1 MB
I use the MMC5 mapper, so the limit is 1 MB on both. never had to go past 512 kb yet.
Re: How careful are you about code size?
I normally consider PRG-ROM to be a cheap resource, so I'll often resort to using unrolled loops and look-up tables if that results in performance improvements. However, that doesn't mean I don't care about code size, because even though 512KB of PRG-ROM (a "common" limit for programs that are not meant to be part of compilations) is a lot to fill, the NES can still only see 32KB at a time, and I really don't want to be switching banks back and forth during critical parts of my game engine.
Like other coders here, I do a lot of small optimizations, turning jumps into branches, removing redundant SECs and CLCs, and so on. I always try to draw attention to these optimizations using comments, so that I know I must be careful when editing code around them. Another thing I do is try to use as few variables as possible in any given logic block, to avoid excessive loading and storing. I also spend some time optimizing loops, avoiding special handling of the first or last iterations (which could require repeated code) and finding the optimal ending conditions (e.g. counting down instead of up to avoid a comparison), even if that means tweaking the inner logic a bit.
Like other coders here, I do a lot of small optimizations, turning jumps into branches, removing redundant SECs and CLCs, and so on. I always try to draw attention to these optimizations using comments, so that I know I must be careful when editing code around them. Another thing I do is try to use as few variables as possible in any given logic block, to avoid excessive loading and storing. I also spend some time optimizing loops, avoiding special handling of the first or last iterations (which could require repeated code) and finding the optimal ending conditions (e.g. counting down instead of up to avoid a comparison), even if that means tweaking the inner logic a bit.
- NovaSquirrel
- Posts: 483
- Joined: Fri Feb 27, 2009 2:35 pm
- Location: Fort Wayne, Indiana
- Contact:
Re: How careful are you about code size?
This is really the main driver behind trying to keep code size down, myself. If I can keep all of the code and data that's relevant to each other inside the same bank, that means that everything is a lot simpler. MMC1 also makes you want to avoid bank switching when you can because of how many cycles it ends up using.tokumaru wrote:I really don't want to be switching banks back and forth during critical parts of my game engine.
On top of doing small optimizations, I tend to use subroutines quite heavily. It doesn't matter if the code is very specific - if there's a block of code I use multiple times and cycles aren't very important, I make it a subroutine.
Another thing I do with subroutines a lot is to give them a "default" input, that can be overridden. For example, I have a "Display Enemy" which just takes a starting tile number, which is the most common case. "Display Enemy" just prepares an input for and drops into executing "Display Enemy Custom" which takes four arbitrary tile numbers, and I only have to store one copy of the pretty lengthy subroutine. Super Mario Bros does something similar, giving some routines multiple places to enter them that feed something else with different data depending on which entry point you used.
Re: How careful are you about code size?
Personally I am very careful about code size, and I know I disagreed with tokumaru on that quite a few times
Optimizing for speed only makes sense if you come close to the limit where you use 100% of the CPU, while optimizing for size makes sense whenever you are close to a limit of 2^n of PRG-ROM and don't want to go for the upper 2^n size.
Limiting code size also makes the most sense if you want to limit to 32KB and avoid bankswitching altogether. (Yes, there is way to get more without bankswitching but they're non-cannonical). As soon as you'll implement any bankswitching I don't think it makes much difference whether you use, say, 64KB or 128KB, but each mapper has it's cannonical limits.
Optimizing for speed only makes sense if you come close to the limit where you use 100% of the CPU, while optimizing for size makes sense whenever you are close to a limit of 2^n of PRG-ROM and don't want to go for the upper 2^n size.
Limiting code size also makes the most sense if you want to limit to 32KB and avoid bankswitching altogether. (Yes, there is way to get more without bankswitching but they're non-cannonical). As soon as you'll implement any bankswitching I don't think it makes much difference whether you use, say, 64KB or 128KB, but each mapper has it's cannonical limits.
Re: How careful are you about code size?
In most cases, macros should assemble exactly the same thing you would have written out by hand, adding no extra length of executable code nor execution time, only hiding the ugly internal details so you don't have to look at them every time you use them. Since they can make your code more concise and readable, you might even find optimization possibilities that weren't otherwise obvious. I also get fewer bugs when I make heavy use of macros, and the ones I do get tend to be easier to find and fix. Call me the macro junkie. Maybe that should have been my forum name.bleubleu wrote:What do you allow yourself to use macros for (16bit math, etc.) ?
Kasumi has a valid point about a few situations like whether or not a CLC is needed before a 16-bit add in a macro; but you can also have for example two different macros that do the same thing except that one has the leading CLC and the other does not, and give them the same names except that the one less used might have a trailing _ or something like that. That's not very common though, and of course just because you have a macro there doesn't mean you can't still do it the non-macro way if you want to.
I do 6502 nestable program flow control structures in macros too. See http://wilsonminesco.com/StructureMacros/ . One of the simplest examples might be
Code: Select all
CMP #14
IF_EQ ; clear enough that it really needs no comments
<actions>
<actions>
<actions>
END_IF
http://WilsonMinesCo.com/ lots of 6502 resources
-
- Posts: 1565
- Joined: Tue Feb 07, 2017 2:03 am
Re: How careful are you about code size?
Code size is something to be careful about, it does sneak up on.
At the start I just write code, I don't care about how big or slow it is. I just get it to do "the thing". Once I see the thing and I play with it, I can get it to the point I'm convinced that it will "stay". Then I will make it "sensible". Any thing that can be looped, tabled etc get looped and tabled. I then save size/speed optimisations for when it becomes critical and I can see how it has to work in the mostly complete code.
For, this "clc is not needed", "this jump can be a bne etc", I leave it to the tass optimizer to find those, it will find all of them in a second, it does love to show off.
In order to stop the 3 weeks of optimisation at the end of the project that is risky as you don't have time to test everything, I use BDD6502 and I take a week or so here and there to do a optimise and code cleaning pass, to break it up a bit.
Macros are nice, but they are mermaids.. they sing a sweet song and send you to your doom if you are not careful. If you make the "safe" its mostly ok. But you have to really plan to properly and understand how they work etc. I did have a lot of macros but I found they tend to make the code less readable and maintainable after a while. ADCB_W, ADCBX_W, IFBLT, IFBLTE, BAGTE etc. So I've evolved it to a syntax sugar + optimizer system which then gets the tass optimizer as well. After using it for the last 4 months everyday its too the point I can't even be bothered to write a small test case the old way.. Still has a lot of things it doesn't do that it needs to do though sigh...
At the start I just write code, I don't care about how big or slow it is. I just get it to do "the thing". Once I see the thing and I play with it, I can get it to the point I'm convinced that it will "stay". Then I will make it "sensible". Any thing that can be looped, tabled etc get looped and tabled. I then save size/speed optimisations for when it becomes critical and I can see how it has to work in the mostly complete code.
For, this "clc is not needed", "this jump can be a bne etc", I leave it to the tass optimizer to find those, it will find all of them in a second, it does love to show off.
In order to stop the 3 weeks of optimisation at the end of the project that is risky as you don't have time to test everything, I use BDD6502 and I take a week or so here and there to do a optimise and code cleaning pass, to break it up a bit.
Macros are nice, but they are mermaids.. they sing a sweet song and send you to your doom if you are not careful. If you make the "safe" its mostly ok. But you have to really plan to properly and understand how they work etc. I did have a lot of macros but I found they tend to make the code less readable and maintainable after a while. ADCB_W, ADCBX_W, IFBLT, IFBLTE, BAGTE etc. So I've evolved it to a syntax sugar + optimizer system which then gets the tass optimizer as well. After using it for the last 4 months everyday its too the point I can't even be bothered to write a small test case the old way.. Still has a lot of things it doesn't do that it needs to do though sigh...
-
- Posts: 130
- Joined: Mon May 25, 2009 2:20 pm
Re: How careful are you about code size?
just wanted to show this if i may, sometimes i'll have insanely long sections of nothing but JSR's only ..to do many different things in terms of enemy/boss movement patterns, like this
without all these JSR's, it would have been like 20x longer.
fun discussion topic
without all these JSR's, it would have been like 20x longer.
fun discussion topic
-
- Posts: 3140
- Joined: Wed May 19, 2010 6:12 pm
Re: How careful are you about code size?
How many bank granularity is there with existing mappers? If I start with a 32kB bank, things will go smoothly, but if I pass that limit and have to rely on bank switching to get around it, trying to fit existing code in seperate 16kB banks would be really frustrating.
Re: How careful are you about code size?
The 3 most common layouts:
32K (e.g. GNROM, AOROM, BNROM, Color Dreams, MMC1)
$8000-$FFFF: 32K switchable window
16K (e.g. UNROM, MMC1)
$8000-$BFFF: 16K switchable window
$C000-$FFFF: fixed to last 16K
2x8K (e.g. MMC3)
$8000-$9FFF: 8K switchable window
$A000-$BFFF: 8K switchable window
$C000-$FFFF: fixed to last 16K
And specialized layouts:
2x8K reconfigured for DPCM switching (e.g. MMC3, VRC4)
$8000-$9FFF: fixed to last 8K
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K
16K+8K (e.g. VRC6)
$8000-$9FFF: fixed to last 8K
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K
3x8K (e.g. FME-7, RAMBO-1)
$8000-$9FFF: 8K switchable window
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K
One reason for having a fixed bank is that on NES, the program, data, and DPCM sample bank are all $00 (in 65816 terms). So the mapper needs to subdivide the space so that a subroutine in one bank can access data in another bank.
32K (e.g. GNROM, AOROM, BNROM, Color Dreams, MMC1)
$8000-$FFFF: 32K switchable window
16K (e.g. UNROM, MMC1)
$8000-$BFFF: 16K switchable window
$C000-$FFFF: fixed to last 16K
2x8K (e.g. MMC3)
$8000-$9FFF: 8K switchable window
$A000-$BFFF: 8K switchable window
$C000-$FFFF: fixed to last 16K
And specialized layouts:
2x8K reconfigured for DPCM switching (e.g. MMC3, VRC4)
$8000-$9FFF: fixed to last 8K
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K
16K+8K (e.g. VRC6)
$8000-$9FFF: fixed to last 8K
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K
3x8K (e.g. FME-7, RAMBO-1)
$8000-$9FFF: 8K switchable window
$A000-$BFFF: 8K switchable window
$C000-$DFFF: 8K switchable window
$E000-$FFFF: fixed to last 8K
One reason for having a fixed bank is that on NES, the program, data, and DPCM sample bank are all $00 (in 65816 terms). So the mapper needs to subdivide the space so that a subroutine in one bank can access data in another bank.
Re: How careful are you about code size?
Take a different approach. Instead of using cryptic names, make it really clear what they're doing, and use the parameters to make like a sentence. If your ADCB_W means "Do a double-precision (16-bit) add-with-carry of B and W," you could change the macro name to something like _16bit_ADC, and make the line say for example,Oziphantom wrote:Macros are nice, but they are mermaids.. they sing a sweet song and send you to your doom if you are not careful. If you make the "safe" its mostly ok. But you have to really plan to properly and understand how they work etc. I did have a lot of macros but I found they tend to make the code less readable and maintainable after a while. ADCB_W, ADCBX_W, IFBLT, IFBLTE, BAGTE etc.
Code: Select all
_16bit_ADC B, _and, W ; B=B+W
Code: Select all
CLC
LDA B
ADC W
STA B
LDA B+1
ADC W+1
STA B+1
Code: Select all
_16bit_ADC B, W, _and, offset3 ; B=B+W+offset3
Code: Select all
IF_POSITIVE ; Negative result above causes it to skip the following lines.
<do_stuff>
<do_stuff>
<do_stuff>
END_IF
Code: Select all
BEGIN ; (Or name it "DO" if you like)
<do_stuff>
<do_stuff>
<do_stuff>
UNTIL_POSITIVE
http://WilsonMinesCo.com/ lots of 6502 resources
Re: How careful are you about code size?
Thanks for all the answers. The IF_XXX macros are very interesting. I also find beq/bmi/bpl are often hard to follow.
I should have realized that such an opened question would send the discussion in many direction. Ill try to be more specific.
Where do you draw the line for inlining something (macros) vs. making it a subroutine. For example, what about a 16-bit addition (which is ~20 cycles or so). Do you accept to pay 12 cycles (jsr/rts) for it just for the sake of reducing code size or do you use a macro? Where is the cutoff?
-Mat
I should have realized that such an opened question would send the discussion in many direction. Ill try to be more specific.
Where do you draw the line for inlining something (macros) vs. making it a subroutine. For example, what about a 16-bit addition (which is ~20 cycles or so). Do you accept to pay 12 cycles (jsr/rts) for it just for the sake of reducing code size or do you use a macro? Where is the cutoff?
-Mat