GBA ASM
- Drew Sebastino
- Formerly Espozo
- Posts: 3496
- Joined: Mon Sep 15, 2014 4:35 pm
- Location: Richmond, Virginia
GBA ASM
Does anyone know of any good documents or tutorials for programming the GBA in ASM? I looked on Google, but all I found was one not so helpful tutorial, which didn't even appear to be finished, on a website called patater.com. I also found a website called gbadev.org, but it only really has documentation on using C. The only tool I found was a GBA assembler called GoldRoad, and I haven't even found a list of all the different registers, so I can't really do anything.
And out of curiosity, how many GBA games were even programmed in ASM? I imagine most were written in C, as most people probably thought it was easier to write in C and hardly anything still used ASM, so they didn't feel like learning it.
Re: GBA ASM
Apparently the GBA uses an ARM7 as its CPU, from the popular ARM family, so it shouldn't be so hard to find tools for it. As with any other console though, the CPU is the least of your worries... it's much more important to understand how graphics, sound and input work. Knowing the assembly language is useless if you don't know how to communicate with those other systems.
As for C vs. ASM, I think that the commitment to the platforms died after the 16-bit era. CPUs became fast enough to run games written in C, so there was no point in using assembly languages few programmers knew.
A lot of Game Boy Color games show that game companies didn't know how to work with 8/16-bit platforms anymore, because the graphics in several games didn't have any of the magic tricks from the NES days, and the gameplay was really stiff, almost like something you'd expect from Chinese pirates. That was probably because of a total lack of commitment to the platform, which resulted in frustrated programmers trying to squeeze something out of ancient hardware.
Re: GBA ASM
Some companies were still committed enough to write their audio mixers and soft 3D texture mappers in asm, freeing up CPU time for C game logic.
- Drew Sebastino
Re: GBA ASM
The problem is that I don't know any of the registers or anything, and I've looked at some code written in C, but it looks like some sort of alien language. It doesn't even look like you directly load or store registers or anything. It's really sad, but I don't even know what this code I found does, other than that it just turns the screen red. (In the assembler I'm using, "r" oddly represents the accumulator instead of "a".)
About the "hardware is fast enough to not worry about ASM" ideology, it is, of course, still always going to be more efficient if it is written in ASM no matter what. I imagine that a game with as many characters, explosions, and bullets as you could just about put on screen would slow down the GBA a bit if it were written in C, unlike if it were written in ASM. Although it is probably a hard comparison, about how much faster is ASM than C, assuming that they are both being used efficiently? One and a half times as fast? Twice as fast?
Code: Select all
.text
main:
mov r0, #0x4000000
mov r1, #0x400
add r1, r1, #3
str r1, [r0]
mov r0, #0x6000000
mov r1, #0xFF
mov r2, #0x9600
loop1:
strh r1, [r0], #2
subs r2, r2, #1
bne loop1
infin:
b infin
Re: GBA ASM
Espozo wrote: The problem is that I don't know any of the registers or anything, and I've looked at some code written in C, but it looks like some sort of alien language. It doesn't even look like you directly load or store registers or anything.
C is a high-level language, so you don't deal with registers and stuff directly.
Espozo wrote: About the "hardware is fast enough to not worry about ASM" ideology, it is, of course, still always going to be more efficient if it is written in ASM no matter what.
I completely agree, but I'm a hobbyist who codes games for fun, while the guys that program the games you buy at the store are paid to make them. Doing things at the lowest level (assembly) requires more specialized people, more time, and, consequently, more money. Companies, which have the goal of making money, not making the best possible games, do not see any benefit in spending more money.
Re: GBA ASM
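Code: Select all
mov r0, #0x4000000      ; r0 = 0x04000000, the display control register at the start of I/O space
mov r1, #0x400          ; bit 10: enable background 2
add r1, r1, #3          ; plus video mode 3 (240x160, 15-bit bitmap), giving 0x403
str r1, [r0]            ; write 0x403 to the display control register
mov r0, #0x6000000      ; r0 = 0x06000000, the start of VRAM
mov r1, #0xFF           ; r1 = 0x00FF, a 15-bit color that is almost pure red
mov r2, #0x9600         ; r2 = 38400 = 240 * 160 pixels
loop1:
strh r1, [r0], #2       ; store one 16-bit pixel, then advance r0 by 2 bytes
subs r2, r2, #1         ; count down and set the flags
bne loop1               ; repeat until r2 reaches zero
infin:
b infin                 ; hang forever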
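For anyone more comfortable reading C, here is a rough host-side sketch of what that assembly does. It is only a model under stated assumptions: `fake_dispcnt` and `fake_vram` are hypothetical stand-ins for the display control register and VRAM, so it runs on a PC rather than touching real GBA hardware, and `fill_screen` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCREEN_W 240
#define SCREEN_H 160

/* Hypothetical stand-ins for hardware: on a real GBA these would be the
   display control register at 0x04000000 and VRAM at 0x06000000. */
uint16_t fake_dispcnt;
uint16_t fake_vram[SCREEN_W * SCREEN_H];

/* Mirrors the assembly: select mode 3 with BG2 enabled, then write the
   same 16-bit color to every pixel. */
void fill_screen(uint16_t color)
{
    size_t count = 0x9600;          /* same loop count as r2 */
    uint16_t *dst = fake_vram;      /* plays the role of r0 */

    /* 0x9600 is exactly one halfword per pixel of a 240x160 screen. */
    assert(count == (size_t)SCREEN_W * SCREEN_H);

    fake_dispcnt = 0x400 + 3;       /* mov r1, #0x400 / add r1, r1, #3 */
    while (count != 0) {            /* subs r2, r2, #1 / bne loop1 */
        *dst++ = color;             /* strh r1, [r0], #2 */
        count--;
    }
}
```

Calling `fill_screen(0x00FF)` reproduces the red screen in the model. The value is built in two steps in the assembly because 0x403 does not fit ARM's immediate encoding in a single `mov`.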
Espozo wrote: About the "hardware is fast enough to not worry about ASM" ideology, it is, of course, still always going to be more efficient if it is written in ASM no matter what.
By that point, time efficiency for programmers had become more important than runtime efficiency. If C helped your company get the game out the door a year earlier, then you'd use C.
Espozo wrote: Although It is probably a hard comparison, about how much faster is ASM than C assuming that they are both being used efficiently? One and half times as fast? Twice as fast?
It depends on what brand of compiler you use. I'm told the official GBA SDK came with GCC, and over the course of the GBA's life, GCC became somewhat better at generating code for ARM7TDMI. Some companies instead sprang for proprietary $6000/seat compilers from ARM, Green Hills, etc.
Re: GBA ASM
tepples wrote: Some companies were still committed enough to write their audio mixers and soft 3D texture mappers in asm, freeing up CPU time for C game logic.
This was common in the mid-'90s too: use asm for very tight routines (usually anything that deals with some sort of rendering) and make everything else in C.
Espozo wrote: (in the assembler I'm using, "r" oddly represents the accumulator instead of "a")
Something makes me think that you don't understand how ARM processors work...
- Drew Sebastino
Re: GBA ASM
Sik wrote: Something makes me think that you don't understand how ARM processors work...
You're completely right. To be honest with you, I'd never even heard of ARM before the GBA.
Oh, and tepples, thank you for figuring out the code! (In about 5 minutes, no less.) One thing that really threw me off is how the registers are called using "#" instead of "$". Is this the case with all ARM processors, and is there some sort of universal ARM handbook or something that would help me? I don't know why, but I assumed that the GBA used custom parts like the SNES. Speaking of the SNES, why are the GBA and SNES often compared? The GBA seems WAY more powerful (4 8bpp BG layers and 5x sprite overdraw FTW!), and not only that, it doesn't even appear to have been made even remotely the same way as the SNES. Actually, now that I think about it, they are the only two systems I know of that use window layers and have BG mosaic capabilities.
- rainwarrior
- Posts: 8732
- Joined: Sun Jan 22, 2012 12:03 pm
- Location: Canada
Re: GBA ASM
The bullet points here should give you a quick rundown of what ARM is from a programming perspective: https://en.wikipedia.org/wiki/ARM_archi ... uction_set
There's no "accumulator", you can use arithmetic instructions on any register. The downside, though, is that you can't use arithmetic instructions directly with memory.
Re: GBA ASM
Espozo wrote: One thing that really threw me off is how the registers are called using "#" instead of "$". Is this with all ARM processors and is there some sort of universal ARM handbook or something that would help me?
Each processor has its own unique syntax.
Espozo wrote: I don't know why, but I assumed that the GBA used custom parts like the SNES.
It does, just not the CPU =P (but one thing to take note of: the ROM is using a 16-bit bus, so using thumb mode will be faster most of the time, I think only a portion of RAM has a 32-bit bus... consider learning how to jump to thumb mode and write most of your code with it if you want to have more performance)
Espozo wrote: Speaking of the SNES, why are the GBA and SNES often compared? The GBA seems WAY more powerful (4 8bpp BG layers and 5x sprite overdraw FTW!), and not only that, It doesn't even appear to have been made even remotely the same way as the SNES.
GBA is like a more powerful counterpart to the SNES, but if you look at it, you'll notice that it's for the most part a similar feature set to the SNES, just with better numbers. The one part that's really off would be the bitmap modes.
Espozo wrote: Actually, now that I think about it, they are both the only systems I know that use window layers and have BG mosaic capabilities.
Mega Drive has a window layer, as does the original Game Boy... It's way more common than you think (although probably not implemented in the same way).
- Drew Sebastino
Re: GBA ASM
rainwarrior wrote: The bullet points here should give you a quick rundown of what ARM is from a programming perspective: https://en.wikipedia.org/wiki/ARM_archi ... uction_set
So this is basically what I got from each bullet point:
ARM processors use loading and storing. (I thought every processor used loading and storing and if they didn't, how could you even do anything?)
Early ARM processors didn't support 8bit and 16bit instructions?
I really have no clue as to what this is saying, but I guess that it just means that you can put 16 and 32 bit instructions in the stack?
Isn't this bullet point the exact same thing as 2?
This means that there's no weird 21 megahertz (I think that's the number) speed that's really only about 3.5 megahertz like on the Super Nintendo, because everything needs to have been gone over a couple of times?
How does this make any sense? It is saying that there are conditional instructions instead of conditional branching? Would it be like add if equal instead of branch if equal? (beq)
Kind of like the last bullet point, except that it says that you don't actually need to do anything after you did a comparison? What would even be the point?
The next one says that you can easily multiply and divide 32bit instructions by 2 without much performance loss?
This just means it's easy to access different registers? I don't see how LDA $xxxx could be made any easier or harder.
What is a leaf instruction?
I don't know what the last one is saying, but it appears to be talking about bank switching.
rainwarrior wrote: There's no "accumulator", you can use arithmetic instructions on any register.
So that's why it's ld"R" instead of ld"A". Oh, and I wonder, is there still x and y?
rainwarrior wrote: The downside, though, is that you can't use arithmetic instructions directly with memory.
Could you give me an example as to what you mean?
Sik wrote: Mega Drive has a window layer, as does the original Game Boy... It's way more common than you think (although probably not implemented in the same way).
I guess window layers are a lot more common than I thought... Why is it that I don't ever see any kind of special effects like the keyhole in Super Mario World or the portal on level 6 of R-Type 3? Is it because there's only 1 window layer instead of 2 like the SNES? Now that I think about it, I guess that the water in the Sonic The Hedgehog games is a window layer? I don't think I've ever seen any fancy tricks that look like a window layer on the GB, except maybe the slime climb level in Donkey Kong Land 2.
Sik wrote: the ROM is using a 16-bit bus, so using thumb mode will be faster most of the time, I think only a portion of RAM has a 32-bit bus... consider learning how to jump to thumb mode and write most of your code with it if you want to have more performance
Either I'm crazy, or you said that I should learn C because it's faster on the GBA than ASM.
Re: GBA ASM
Espozo wrote: ARM processors use loading and storing. (I thought every processor used loading and storing and if they didn't, how could you even do anything?)
In RISC processors, the only instructions that support a memory operand are load and store. The rest work entirely on registers and immediate values.
Espozo wrote: Early ARM processors didn't support 8bit and 16bit instructions?
Thumb and MIPS16 were introduced because ARM and MIPS felt that their architectures' code density on binary-size-constrained systems couldn't compete with 8- and 16-bit MCUs.
Espozo wrote: I really have no clue as to what this is saying but I guess that it just means that you can put 16 and 32 bit instructions in the stack?
Programs mostly use 16-bit instructions, even though these can't efficiently do all the things 32-bit instructions can, because it takes two reads to get a 32-bit instruction out of ROM or main RAM. But there's a fast RAM called IWRAM, normally used for the stack, where you can copy 32-bit code such as an audio mixer or a texture mapper.
Espozo wrote: Isn't this bullet point the exact same thing as 2?
What "2"? Could you please use the quote markup if you're going to be referring to a particular point of an article? A Wikipedia article may change between the day you write your post and the day when someone else reads your post, possibly months or years later.
Espozo wrote: This means that there's no weird 21 megahertz (I think that's the number) speed that's really only about 3.5 megahertz like the on the Super Nintendo because everything needs to have been gone over a couple times?
There's the 16.78 MHz master clock. And as on the Super NES, different memory regions have different speeds: 3 cycles to read or write main RAM and 2 or 4 cycles to read ROM.
Espozo wrote: It is saying that there is conditional instructions instead of conditional branching? Would it be like add if equal instead of branch if equal? (beq)
Yes. In 32-bit ARM instructions (but not Thumb), any instruction can have the equivalent of branch conditions on it.
Espozo wrote: The next one says that you can easily multiply and divide 32bit instructions by 2 without much performance loss?
Yes. In 32-bit ARM instructions, you can multiply or divide one operand of each instruction by a power of 2 (that is, shift it) as part of the same clock cycle.
Espozo wrote: This just means it's easy to access different registers? I don't see how LDA $xxxx could be made any easier or harder.
I think "Has powerful indexed addressing modes." refers to being able to add two different registers, one with a shift amount, to form an address.
Espozo wrote: What is a leaf instruction?
Where do you get "leaf instruction"? The Wikipedia article as of today refers to a "leaf function", which is a subroutine or function that does not call any other subroutines or functions.
Espozo wrote: I don't know what the last one is saying but It appears to be talking about bank switching.
Nope, it's talking about having a separate set of registers to be used during interrupt handlers, so you don't have to waste time pushing the values of all registers to memory at the start of an interrupt handler and pulling them back.
Espozo wrote: Oh, and I wonder, is there still x and y?
There are 14 numbered registers that can be used for data (like 6502 A) or address (like 6502 X and Y). The other two are reserved for the stack pointer (like 6502 S) and the program counter (like 6502 PC).
rainwarrior wrote: The downside, though, is that you can't use arithmetic instructions directly with memory.
Espozo wrote: could you give me an example as to what you mean?
You can't ADC with an address as an operand; you instead have to load the value from that address into a register.
I wonder if part of the problem is that forum.gbadev.org has made itself invisible to external search engines to deter spambots from finding it, registering, and posting.
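To make the points above about conditional execution, the shifted operand and indexed addressing a bit more concrete, here is a small C model of each. It is only a sketch: the function names are made up for illustration, and the ARM instructions in the comments are typical forms rather than output from any particular assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Models "add r0, r1, r2, lsl #shift": the second operand is shifted
   before the addition, as part of the same instruction. */
uint32_t add_shifted(uint32_t r1, uint32_t r2, unsigned shift)
{
    assert(shift < 32);             /* keeps the C shift well defined */
    return r1 + (r2 << shift);
}

/* Models "cmp a, b" followed by "addeq r0, r0, r1": the add only
   takes effect when the comparison found the values equal. */
uint32_t add_if_equal(uint32_t r0, uint32_t r1, uint32_t a, uint32_t b)
{
    return (a == b) ? r0 + r1 : r0;
}

/* Models "ldr r0, [r1, r2, lsl #2]": a base register plus an index
   register scaled by 4 selects one 32-bit word from a table. */
uint32_t load_indexed(const uint32_t *table, uint32_t index)
{
    return table[index];
}
```

For example, `add_shifted(10, 3, 2)` computes 10 + 3 * 4 in what would be one ARM instruction, and `add_if_equal` shows why a short "if" body needs no branch at all in ARM mode.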
- Drew Sebastino
Re: GBA ASM
tepples wrote: What "2"? Could you please use the quote markup if you're going to be referring to a particular point of an article? A Wikipedia article may change between the day you write your post and the day when someone else reads your post, possibly months or years later.
I meant the second bullet point I commented on, but you're right in that I really should have quoted it.
tepples wrote: I wonder if part of the problem is that forum.gbadev.org has made itself invisible to external search engines to deter spambots from finding it, registering, and posting.
I really wasn't aware that that website even had a forum. But by the look of the website, it looks like it's been long forgotten by most people, in that it doesn't seem to have been updated in quite some time. (The "hardware" tab on the home page just loaded perpetually when I clicked it.)
- rainwarrior
Re: GBA ASM
rainwarrior wrote: The downside, though, is that you can't use arithmetic instructions directly with memory.
Espozo wrote: could you give me an example as to what you mean?
This is to do with load/store architecture. On something like the 6502 where there are few registers, it wouldn't have been sensible to require all your operands to be in a register. For that reason, a lot of the time the operand is memory. On an ARM, you have to load it into a register first. When trying to optimize ARM code, it can be helpful to choose an algorithm that minimizes loads and stores, e.g. try to do a chain of operations before writing a variable back to RAM, instead of writing it back after each step.
Code: Select all
; A contains value to be added to
ADC $0341
; A contains result
Code: Select all
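; r0 contains value to be added to
LDR r3, =0x0341   ; the address has to go into a register first
LDRB r1, [r3]     ; then the byte at that address is loaded
ADC r2, r1, r0
; r2 contains result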
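The advice about chaining operations before writing back can be sketched in C as well. This is an illustration under assumptions, not measured GBA code: `update_each_step` and `update_chained` are made-up names, and `volatile` is used only to force a load and a store on every line so the difference is visible.

```c
#include <assert.h>
#include <stdint.h>

/* Writes the variable back after every step. On a load/store machine
   each line is a load, an operation and a store. volatile stops the
   compiler from merging the accesses in this illustration. */
void update_each_step(volatile uint32_t *score)
{
    *score = *score + 10;
    *score = *score * 2;
    *score = *score - 1;
}

/* Loads once, chains the operations in a local variable (which can
   live in a register), and stores once at the end. */
void update_chained(uint32_t *score)
{
    uint32_t s;

    assert(score != 0);
    s = *score;                     /* one load */
    s += 10;
    s *= 2;
    s -= 1;
    *score = s;                     /* one store */
}
```

Both functions compute the same result; the second simply touches memory twice instead of six times, which matters most when the variable lives in slow RAM.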
On some platforms (like 6502), the C compilers available may not have very strong optimizers, in which case the performance difference might be magnified greatly.
The other thing to take into consideration is that usually only a small amount of the code in the game needs much optimization. Often the bulk of CPU time is spent in many repetitions of a few specific tasks, and for the rest of the code it won't make much of a difference whether it's optimized. Spend your coding time wisely, and try not to waste a lot of effort optimizing things that aren't a performance problem. The most important part of optimization is measuring the time your code takes. This lets you know what should be optimized, and it also lets you know whether your optimization attempt was a success.
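As a minimal sketch of "measure first", the following C code times a routine on a PC with the standard `clock()` function. The workload `sum_to` and the helper `time_workload` are hypothetical names invented for this example; on the GBA itself you would more likely read one of the hardware timers, which is not shown here.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical workload standing in for whatever routine is suspected
   of being slow. */
unsigned long sum_to(unsigned long n)
{
    unsigned long total = 0;
    unsigned long i;

    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Runs the workload `runs` times and returns the CPU time used, in
   seconds, so two versions of a routine can be compared. */
double time_workload(unsigned long n, int runs, unsigned long *last_result)
{
    clock_t start, end;
    int r;

    assert(runs > 0 && last_result != 0);

    start = clock();
    for (r = 0; r < runs; r++) {
        *last_result = sum_to(n);
    }
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Timing many runs instead of one smooths out the coarse resolution of `clock()`. Measure before and after a change, and keep the change only if the number actually improves.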
Re: GBA ASM
First of all, you seemed to know only 6502 and 65c816 assembly at the time of your first post. You should be aware that every family of processors uses completely different mechanisms for its registers, instruction set and instruction notation. Think of every family of processors as having its own assembly language.
The ARM family is extremely popular, unfortunately not thanks to the GBA but thanks to phones and tablets (which I despise, but that's another story). Thus it should be relatively easy to find ARM assembly language tutorials. You should learn ARM assembly from scratch even if you know 6502/65c816, because it's a completely different language. ARM is also one of the most complex assembly languages I know of, but also one of the most elegant. Don't worry, it's still much simpler to learn than pretty much any high-level language, though very difficult to master (I haven't mastered it).
THUMB is yet another assembly language, but it's similar to ARM with features taken out. It was designed for better code density. ARM is extremely bad in this respect, as all instructions are 32-bit; with THUMB all instructions are 16-bit instead, but you need more instructions to do the same task.
Something that is especially complex with ARM is the optional setting of the status flags (-s suffix), the conditional execution of instructions (-eq, etc. suffixes) and the stack management, along with the pseudo-instructions. The number of instructions itself is quite small, but because of the pseudo-instructions and the suffixes, you might think there are way more instructions than there actually are when looking at example code.
There are also many different addressing modes.
In addition to pseudo-instructions, there are also shorthand notations for instructions. For example, add r1, r2 is shorthand for add r1, r1, r2, LSL #0.
For instance, "beq" looks exactly like the 6502 instruction, but there is no "beq" instruction on ARM; it's a "b" instruction with an "-eq" suffix. You can also do "addeq" or "moveq", for instance. With THUMB you will probably feel more at home because the instructions are more 6502-like: the "-s" suffix is implied in all instructions and only branches are conditional. However, we are still limited to a load-store architecture, we can't have an arbitrary constant in the code, and it's still only a compressed form of ARM assembly, which is the more worthwhile one to know for coding the really fast parts in assembly.
If you are serious about coding a large GBA game, you don't want to do it all in assembly. You should also learn a mid/high-level language (it doesn't have to be C).
Unlike the 6502 and its derivatives, the ARM processor is very "C language friendly". Depending on the task, today's optimizers can be really good for normal code that doesn't do anything fancy, so there is no significant loss in either speed or code size when coding in C (or another mid-level language). However, for specific tasks, such as bit shifts or rotations, or additions with overflow checking in large loops, the C language can be terrible.
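To illustrate why rotations and overflow-checked additions are awkward in C, here is a sketch of how each is usually spelled in portable C. The function names are made up for this example. In ARM assembly a rotate is available as an operand shift (ROR), and an add with the -s suffix sets an overflow flag that a conditional instruction can test directly; whether a given compiler turns the C below into those instructions is not guaranteed.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate right by n bits. C has no rotate operator, so it is built
   from two shifts and an OR. */
uint32_t rotr32(uint32_t x, unsigned n)
{
    assert(n < 32);
    if (n == 0) {
        return x;                   /* avoids a shift by 32, undefined in C */
    }
    return (x >> n) | (x << (32 - n));
}

/* Signed add with overflow check. Returns 1 on overflow and leaves
   *sum untouched, otherwise stores the sum and returns 0. C gives no
   access to the CPU's overflow flag, so the test is done by hand. */
int add_overflows(int32_t a, int32_t b, int32_t *sum)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t raw = ua + ub;         /* unsigned wraparound is well defined */

    /* Overflow happens when both inputs share a sign and the result
       has the opposite sign. */
    if (((~(ua ^ ub)) & (ua ^ raw)) >> 31) {
        return 1;
    }
    *sum = a + b;                   /* safe: known not to overflow */
    return 0;
}
```

Inside a tight loop, the extra shifts, masks and compares in these helpers are the kind of overhead that hand-written assembly can avoid.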