GBA ASM

Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

GBA ASM

Post by Drew Sebastino »

Does anyone know of any good documents or tutorials for programming the GBA in ASM? I looked on Google, but all I found was one not-so-helpful tutorial on a website called patater.com that didn't even appear to be finished. I also found a website called gbadev.org, but it only really has documentation on using C. The only tool I found was a GBA assembler called GoldRoad. I haven't even found a list of all the different registers, so I can't really do anything.

And out of curiosity, how many GBA games were even programmed in ASM? I imagine most were written in C, since most people probably found C easier and hardly anything still used ASM by then, so they didn't feel like learning it.
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: GBA ASM

Post by tokumaru »

Apparently the GBA uses an ARM7 as its CPU, from the popular ARM family, so it shouldn't be so hard to find tools for it. As with any other console though, the CPU is the least of your worries... it's much more important to understand how graphics, sound and input work. Knowing the assembly language is useless if you don't know how to communicate with those other systems.

As for C vs. ASM, I think that the commitment to the platforms died after the 16-bit era. CPUs became fast enough to run games written in C, so there was no point in using assembly languages few programmers knew.

A lot of Game Boy Color games show that game companies didn't know how to work with 8/16-bit platforms anymore, because the graphics in several games didn't have any of the magic tricks from the NES days, and the gameplay was really stiff, almost like something you'd expect from Chinese pirates. That was probably because of a total lack of commitment to the platform, which resulted in frustrated programmers trying to squeeze something out of ancient hardware.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: GBA ASM

Post by tepples »

Some companies were still committed enough to write their audio mixers and soft 3D texture mappers in asm, freeing up CPU time for C game logic.
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: GBA ASM

Post by Drew Sebastino »

The problem is that I don't know any of the registers or anything, and I've looked at some code written in C, but it looks like some sort of alien language. It doesn't even look like you directly load or store registers or anything. It's really sad, but I don't even know what this code is that I found that just turns the screen red. (in the assembler I'm using, "r" oddly represents the accumulator instead of "a")

Code:

.text
main:
	mov r0, #0x4000000
	mov r1, #0x400
	add r1, r1, #3
	str r1, [r0]

	mov r0, #0x6000000
	mov r1, #0xFF
	mov r2, #0x9600
loop1:
	strh r1, [r0], #2
	subs r2, r2, #1
	bne loop1

infin:
	b infin
About the "hardware is fast enough to not worry about ASM" ideology, it is, of course, still always going to be more efficient if it is written in ASM no matter what. I imagine that a game with as many characters, explosions, and bullets as you could just about put on screen would slow down the GBA a bit if it were written in C, unlike if it were written in ASM. Although it is probably a hard comparison, about how much faster is ASM than C, assuming that they are both being used efficiently? One and a half times as fast? Twice as fast?
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: GBA ASM

Post by tokumaru »

Espozo wrote:The problem is that I don't know any of the registers or anything, and I've looked at some code written in C, but it looks like some sort of alien language. It doesn't even look like you directly load or store registers or anything.
C is a high-level language, so you don't deal with registers and stuff directly.
About the "hardware is fast enough to not worry about ASM" ideology, it is, of course, still always going to be more efficient if it is written in ASM no matter what.
I completely agree, but I'm a hobbyist who codes games for fun, while the guys who program the games you buy at the store are paid to make them. Doing things at the lowest level (assembly) requires more specialized people, more time, and, consequently, more money. Companies, which have the goal of making money, not making the best possible games, do not see any benefit in spending more money.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: GBA ASM

Post by tepples »

Code:

mov r0, #0x4000000
$04000000 is the base address of a large block of MMIO (memory-mapped input/output) ports. Consider it as a rough counterpart to $2000 and $4000 on NES or $002100 and $004200 on Super NES.

Code:

mov r1, #0x400
add r1, r1, #3
ARM uses fixed-length instructions. This means constants sometimes have to be built a byte at a time. This forms the value $00000403 in r1.
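
As a rough illustration of why $403 needs two instructions: an ARM data-processing immediate is an 8-bit value rotated right by an even number of bits, so not every 32-bit constant fits in a single mov. The C sketch below runs on a PC and only models that encoding rule; the function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (n in 0..31). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    assert(n < 32);
    return n == 0 ? v : (v << n) | (v >> (32 - n));
}

/* True if `value` can be written as an 8-bit constant rotated right
   by an even amount (0, 2, ..., 30), the form ARM immediates take. */
static bool arm_imm_encodable(uint32_t value)
{
    for (unsigned rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by `rot`; the result must fit in 8 bits. */
        if (rotl32(value, rot) <= 0xFFu)
            return true;
    }
    return false;
}
```

Under this rule 0x4000000, 0x400, 0xFF and 0x9600 each fit in one instruction, while 0x403 does not, which matches the mov/add pair in the listing.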

Code:

str r1, [r0]
This writes $00000403 to the PPU control port at $04000000. According to GBATEK, this sets the display mode to 3 (240x160 hi-color bitmap), disables forced blanking, and enables BG2.
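
For comparison, here is a minimal C sketch of how the value $0403 breaks down into fields. The bit positions follow GBATEK's description of the display control register; the macro and function names are invented for this example and are not taken from any SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; bit positions as documented in GBATEK. */
#define DISPCNT_MODE_MASK   0x0007u  /* bits 0-2: background mode */
#define DISPCNT_FORCE_BLANK 0x0080u  /* bit 7: forced blanking    */
#define DISPCNT_BG2_ENABLE  0x0400u  /* bit 10: show BG2          */

/* Build a display control value from its parts. */
static uint16_t make_dispcnt(unsigned mode, int enable_bg2, int force_blank)
{
    assert(mode <= 5);  /* the GBA defines modes 0 through 5 */
    uint16_t value = (uint16_t)(mode & DISPCNT_MODE_MASK);
    if (enable_bg2)
        value |= DISPCNT_BG2_ENABLE;
    if (force_blank)
        value |= DISPCNT_FORCE_BLANK;
    return value;
}
```

Here make_dispcnt(3, 1, 0) gives 0x0403, the same value the assembly builds in r1. On real hardware the result would be stored through a volatile pointer to $04000000; that store is left out so the sketch can run on a PC.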

Code:

mov r0, #0x6000000
$06000000 is the start of video memory. Unlike the NES, the GBA PPU allows writing to video memory at any time.

Code:

mov r1, #0xFF
$00FF is %0_00000_00111_11111, which is a color value.
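
To make the bit groups concrete, this small C sketch splits a 15-bit GBA color into its three 5-bit channels (red in the low bits, then green, then blue). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    unsigned r, g, b;   /* each channel ranges over 0..31 */
} Rgb5;

/* Split a 15-bit color: bits 0-4 red, 5-9 green, 10-14 blue. */
static Rgb5 unpack_color(uint16_t c)
{
    Rgb5 out;
    out.r = c & 0x1Fu;
    out.g = (c >> 5) & 0x1Fu;
    out.b = (c >> 10) & 0x1Fu;
    return out;
}

/* Combine three 5-bit channels into one 15-bit color. */
static uint16_t pack_color(unsigned r, unsigned g, unsigned b)
{
    assert(r < 32 && g < 32 && b < 32);
    return (uint16_t)(r | (g << 5) | (b << 10));
}
```

Unpacking $00FF gives red 31, green 7, blue 0, so the screen comes out slightly orange. A pure red would be pack_color(31, 0, 0), which is $001F.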

Code:

mov r2, #0x9600
$9600 is 240 times 160, the number of pixels on the screen.

Code:

strh r1, [r0], #2
This stores the value in R1 at the address in R0 and advances R0 by 2 bytes. Because R0 points at video memory, this has the effect of plotting one pixel.

Code:

subs r2, r2, #1
In 6502 this'd be dec r2.

Code:

bne loop1
Keep looping until the number of pixels left to fill becomes 0.
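
Put together, the three loop instructions behave roughly like the C sketch below. It writes into an ordinary array instead of real video memory so it can run on a PC; fake_vram and fill_screen are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCREEN_W 240
#define SCREEN_H 160

/* Host-side stand-in for mode 3 video memory at 0x06000000. */
static uint16_t fake_vram[SCREEN_W * SCREEN_H];

/* Fill every pixel with `color`; returns the number of bytes written. */
static size_t fill_screen(uint16_t *vram, uint16_t color)
{
    assert(vram != NULL);
    uint16_t *dst = vram;                      /* r0: write pointer  */
    uint32_t remaining = SCREEN_W * SCREEN_H;  /* r2: 0x9600 pixels  */
    do {
        *dst++ = color;   /* strh r1, [r0], #2 */
        remaining--;      /* subs r2, r2, #1   */
    } while (remaining != 0);                  /* bne loop1          */
    return (size_t)(dst - vram) * sizeof *vram;
}
```

The counter counts 16-bit pixels, so the loop runs 38400 times and touches 76800 bytes in total.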

Code:

infin:
  b infin
In 6502 this'd be forever: jmp forever
Espozo wrote:About the "hardware is fast enough to not worry about ASM" ideology, it is, of course, still always going to be more efficient if it is written in ASM no matter what.
By that point, time efficiency for programmers had become more important than runtime efficiency. If C helped your company get the game out the door a year earlier, then you'd use C.
Although it is probably a hard comparison, about how much faster is ASM than C, assuming that they are both being used efficiently? One and a half times as fast? Twice as fast?
It depends on what brand of compiler you use. I'm told the official GBA SDK came with GCC, and over the course of the GBA's life, GCC became somewhat better at generating code for ARM7TDMI. Some companies instead sprang for proprietary $6000/seat compilers from ARM, Green Hills, etc.
Sik
Posts: 1589
Joined: Thu Aug 12, 2010 3:43 am

Re: GBA ASM

Post by Sik »

tepples wrote:Some companies were still committed enough to write their audio mixers and soft 3D texture mappers in asm, freeing up CPU time for C game logic.
This was common in the mid-'90s too, use asm for very tight routines (usually anything that deals with some sort of rendering) and make everything else in C.
Espozo wrote:(in the assembler I'm using, "r" oddly represents the accumulator instead of "a")
Something makes me think that you don't understand how ARM processors work...
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: GBA ASM

Post by Drew Sebastino »

Sik wrote:Something makes me think that you don't understand how ARM processors work...
You're completely right. To be honest with you, I've never even heard of ARM before the GBA.

Oh, and tepples, thank you for figuring out the code! (In about 5 minutes, no less :shock: ) One thing that really threw me off is how the registers are called using "#" instead of "$". Is this with all ARM processors and is there some sort of universal ARM handbook or something that would help me? I don't know why, but I assumed that the GBA used custom parts like the SNES. Speaking of the SNES, why are the GBA and SNES often compared? The GBA seems WAY more powerful (4 8bpp BG layers and 5x sprite overdraw FTW!), and not only that, It doesn't even appear to have been made even remotely the same way as the SNES. Actually, now that I think about it, they are both the only systems I know that use window layers and have BG mosaic capabilities.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: GBA ASM

Post by rainwarrior »

The bullet points here should give you a quick rundown of what ARM is from a programming perspective: https://en.wikipedia.org/wiki/ARM_archi ... uction_set

There's no "accumulator", you can use arithmetic instructions on any register. The downside, though, is that you can't use arithmetic instructions directly with memory.
Sik
Posts: 1589
Joined: Thu Aug 12, 2010 3:43 am

Re: GBA ASM

Post by Sik »

Espozo wrote:One thing that really threw me off is how the registers are called using "#" instead of "$". Is this with all ARM processors and is there some sort of universal ARM handbook or something that would help me?
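Each processor family has its own assembly syntax. In ARM syntax, "#" marks an immediate (constant) value, much like on the 6502; the registers themselves are named r0 through r15.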
Espozo wrote:I don't know why, but I assumed that the GBA used custom parts like the SNES.
It does, just not the CPU =P (but one thing to take note of: the ROM is using a 16-bit bus, so using thumb mode will be faster most of the time, I think only a portion of RAM has a 32-bit bus... consider learning how to jump to thumb mode and write most of your code with it if you want to have more performance)
Espozo wrote:Speaking of the SNES, why are the GBA and SNES often compared? The GBA seems WAY more powerful (4 8bpp BG layers and 5x sprite overdraw FTW!), and not only that, It doesn't even appear to have been made even remotely the same way as the SNES.
GBA is like a more powerful counterpart to the SNES, but if you look at it, you'll notice that it's for the most part a similar feature set to the SNES, just with better numbers. The one part that's really off would be the bitmap modes.
Espozo wrote:Actually, now that I think about it, they are both the only systems I know that use window layers and have BG mosaic capabilities.
Mega Drive has a window layer, as does the original Game Boy... It's way more common than you think (although probably not implemented in the same way).
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: GBA ASM

Post by Drew Sebastino »

rainwarrior wrote:The bullet points here should give you a quick rundown of what ARM is from a programming perspective: https://en.wikipedia.org/wiki/ARM_archi ... uction_set
So this is basically what I got from each bullet point:
ARM processors use loading and storing. (I thought every processor used loading and storing and if they didn't, how could you even do anything?)

Early ARM processors didn't support 8bit and 16bit instructions?

I really have no clue as to what this is saying but I guess that it just means that you can put 16 and 32 bit instructions in the stack?

Isn't this bullet point the exact same thing as 2?

This means that there's no weird 21 megahertz (I think that's the number) speed that's really only about 3.5 megahertz like on the Super Nintendo because everything needs to have been gone over a couple times?

How does this make any sense? It is saying that there is conditional instructions instead of conditional branching? Would it be like add if equal instead of branch if equal? (beq)

Kind of like the last bullet point except that it says that you don't actually need to do anything after you did a comparison? what would even be the point?

The next one says that you can easily multiply and divide 32bit instructions by 2 without much performance loss?

This just means it's easy to access different registers? I don't see how LDA $xxxx could be made any easier or harder.

What is a leaf instruction?

I don't know what the last one is saying but It appears to be talking about bank switching.
rainwarrior wrote:There's no "accumulator", you can use arithmetic instructions on any register.
So that's why it's ld"R" instead of ld"A". Oh, and I wonder, is there still x and y?
rainwarrior wrote:The downside, though, is that you can't use arithmetic instructions directly with memory.
could you give me an example as to what you mean?
Sik wrote:Mega Drive has a window layer, as does the original Game Boy... It's way more common than you think (although probably not implemented in the same way).
I guess Window layers are a lot more common than I thought... :lol: Why is it that I don't ever see any kind of special effects like the keyhole in Super Mario World or the portal on level 6 of r-type 3? Is it because there's only 1 window layer instead of 2 like the SNES? Now that I think about it, I guess that the water in the Sonic The Hedgehog games are a window layer? I don't think I've ever seen any fancy tricks that look like a window layer on the GB except maybe the slime climb level on Donkey Kong Land 2.
Sik wrote:the ROM is using a 16-bit bus, so using thumb mode will be faster most of the time, I think only a portion of RAM has a 32-bit bus... consider learning how to jump to thumb mode and write most of your code with it if you want to have more performance
Either I'm crazy, or you said that I should learn C because it's faster on the GBA than ASM.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: GBA ASM

Post by tepples »

Espozo wrote:ARM processors use loading and storing. (I thought every processor used loading and storing and if they didn't, how could you even do anything?)
In RISC processors, the only instructions that support a memory operand are load and store. The rest work entirely on registers and immediate values.
Early ARM processors didn't support 8bit and 16bit instructions?
Thumb and MIPS16 were introduced because ARM and MIPS felt that their architectures' code density on binary-size-constrained systems couldn't compete with 8- and 16-bit MCUs.
I really have no clue as to what this is saying but I guess that it just means that you can put 16 and 32 bit instructions in the stack?
Programs use 16-bit instructions, which can't efficiently do all the things 32-bit instructions can, because it takes two reads to get a 32-bit instruction out of ROM or main RAM. But there's a fast RAM called IWRAM, normally used for the stack, where you can copy 32-bit code such as an audio mixer or a texture mapper.
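
The "two reads" point is just a division of instruction width by bus width. A tiny C sketch, with a made-up function name and with wait states deliberately ignored, so it shows only the count of bus reads and not real cycle timings:

```c
#include <assert.h>

/* Number of bus reads needed to fetch one instruction of
   `insn_bits` over a bus that is `bus_bits` wide. */
static unsigned bus_reads_per_fetch(unsigned insn_bits, unsigned bus_bits)
{
    assert(insn_bits == 16 || insn_bits == 32);
    assert(bus_bits == 16 || bus_bits == 32);
    return (insn_bits + bus_bits - 1) / bus_bits;  /* round up */
}
```

A 32-bit ARM instruction over the 16-bit ROM bus takes 2 reads, a 16-bit Thumb instruction takes 1, and ARM code copied to 32-bit-wide IWRAM is back to 1 read per instruction.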
Isn't this bullet point the exact same thing as 2?
What "2"? Could you please use the quote markup if you're going to be referring to a particular point of an article? A Wikipedia article may change between the day you write your post and the day when someone else reads your post, possibly months or years later.
This means that there's no weird 21 megahertz (I think that's the number) speed that's really only about 3.5 megahertz like on the Super Nintendo because everything needs to have been gone over a couple times?
There's the 16.78 MHz master clock. And as on the Super NES, different memory regions have different speeds: 3 cycles to read or write main RAM and 2 or 4 cycles to read ROM.
It is saying that there is conditional instructions instead of conditional branching? Would it be like add if equal instead of branch if equal? (beq)
Yes. In 32-bit ARM instructions (but not Thumb), any instruction can have the equivalent of branch conditions on it.
The next one says that you can easily multiply and divide 32bit instructions by 2 without much performance loss?
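Yes. In 32-bit ARM instructions, you can shift one operand of a data-processing instruction left or right, multiplying or dividing it by a power of 2, as part of the same instruction and (for a constant shift amount) the same clock cycle.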
This just means it's easy to access different registers? I don't see how LDA $xxxx could be made any easier or harder.
I think "Has powerful indexed addressing modes." refers to being able to add two different registers, one with a shift amount, to form an address.
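
In C terms, the shifted operand and the indexed addressing mode both come down to "one register plus another register shifted by a constant". The sketch below models that arithmetic on a PC; the function names are invented, and the ARM instructions in the comments are the forms being imitated.

```c
#include <assert.h>
#include <stdint.h>

/* Models: add rd, rn, rm, lsl #shift */
static uint32_t add_shifted(uint32_t rn, uint32_t rm, unsigned shift)
{
    assert(shift < 32);
    return rn + (rm << shift);
}

/* Models: add r0, r1, r1, lsl #2   (x + 4x = 5x in one instruction) */
static uint32_t times5(uint32_t x)
{
    return add_shifted(x, x, 2);
}

/* Address formed by: ldr r0, [base, index, lsl #2]
   i.e. element `index` of a table of 32-bit words. */
static uint32_t word_element_address(uint32_t base, uint32_t index)
{
    return add_shifted(base, index, 2);
}
```

For example, times5(7) is 35, and element 5 of a word table at 0x03000000 sits at 0x03000014.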
What is a leaf instruction?
Where do you get "leaf instruction"? The Wikipedia article as of today refers to a "leaf function", which is a subroutine or function that does not call any other subroutines or functions.
I don't know what the last one is saying but It appears to be talking about bank switching.
Nope, it's talking about having a separate set of registers to be used during interrupt handlers, so you don't have to waste time pushing the values of all registers to memory at the start of an interrupt handler and pulling them back.
Oh, and I wonder, is there still x and y?
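There are 16 numbered registers, r0 through r15. Fourteen of them (r0 through r12, plus r14 when it isn't holding a subroutine return address) can be used for data (like 6502 A) or addresses (like 6502 X and Y). The other two are reserved for the stack pointer (r13, like 6502 S) and the program counter (r15, like 6502 PC).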
rainwarrior wrote:The downside, though, is that you can't use arithmetic instructions directly with memory.
could you give me an example as to what you mean?
You can't ADC with an address as an operand; you instead have to load the value from that address into a register first.

I wonder if part of the problem is that forum.gbadev.org has made itself invisible to external search engines to deter spambots from finding it, registering, and posting.
Drew Sebastino
Formerly Espozo
Posts: 3496
Joined: Mon Sep 15, 2014 4:35 pm
Location: Richmond, Virginia

Re: GBA ASM

Post by Drew Sebastino »

tepples wrote:
Isn't this bullet point the exact same thing as 2?
What "2"? Could you please use the quote markup if you're going to be referring to a particular point of an article? A Wikipedia article may change between the day you write your post and the day when someone else reads your post, possibly months or years later.
I meant the second bullet point I commented on, but you're right in that I really should have quoted it.
tepples wrote:I wonder if part of the problem is that forum.gbadev.org has made itself invisible to external search engines to deter spambots from finding it, registering, and posting.
I really wasn't aware that that website even had a forum. :oops: But by the look of the website, It looks like it's been long forgotten by most people in that it doesn't seem to have been updated in quite some time. (The "hardware" tab on the home page just loaded perpetually when I clicked it.)
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Re: GBA ASM

Post by rainwarrior »

Espozo wrote:
rainwarrior wrote:The downside, though, is that you can't use arithmetic instructions directly with memory.
could you give me an example as to what you mean?
This is to do with load/store architecture. On something like the 6502 where there are few registers, it wouldn't have been sensible to require all your operands to be in a register. For that reason, a lot of the time the operand is memory. On an ARM, you have to load it into a register first. When trying to optimize ARM code, it can be helpful to choose an algorithm that minimizes loads and stores, e.g. try to do a chain of operations before writing a variable back to RAM, instead of writing it back after each step.

Code:

; A contains value to be added to
ADC $0341
; A contains result
VS

Code:

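; r0 contains value to be added to
LDR r1, =0x0341   ; put the address in a register first
LDR r1, [r1]      ; then load the value stored at that address
ADC r2, r1, r0    ; register-to-register add with carry
; r2 contains result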
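
The advice about doing a chain of operations before writing back can be sketched in C as well. This is only an illustration with invented names: total_in_ram is marked volatile so that every access really goes to memory, which is roughly what a variable kept in RAM costs on a load/store machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stands in for a game variable that lives in RAM. */
static volatile int32_t total_in_ram;

/* Loads, adds and stores the total on every iteration. */
static int32_t sum_store_each_step(const int16_t *v, size_t n)
{
    assert(v != NULL || n == 0);
    total_in_ram = 0;
    for (size_t i = 0; i < n; i++)
        total_in_ram = total_in_ram + v[i];
    return total_in_ram;
}

/* Keeps the running total in a local (a register, in practice)
   and stores it back to RAM once at the end. */
static int32_t sum_store_once(const int16_t *v, size_t n)
{
    assert(v != NULL || n == 0);
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    total_in_ram = acc;
    return total_in_ram;
}
```

Both functions return the same result; the second one simply makes one store to the total instead of one per element.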
As for whether assembly is faster than C, well, yes it is. As a human who knows more about what your code does than the compiler, you can produce better optimized code than that compiler. A human optimizer takes vastly more time to do its work than a compiler does, though, so in that respect it is slower to write.

On some platforms (like 6502), the C compilers available may not have very strong optimizers, in which case the performance difference might be magnified greatly.

The other thing to take into consideration is that usually only a small amount of the code in the game needs much optimization. Often the bulk of CPU time is spent in many repetitions of a few specific tasks, and optimizing the rest of the code won't make much of a difference. Spend your coding time wisely, and try not to waste a lot of effort optimizing things that aren't a performance problem. The most important part of optimization is measuring the time your code takes. This lets you know what should be optimized, and it also lets you know whether your optimization attempt was a success.
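
As a minimal sketch of "measure first", here is one way to time a routine on a PC with the standard C clock() function. On the GBA itself you would more likely read a hardware timer or count scanlines; that part is not shown, and busy_work is only a placeholder workload invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Placeholder workload standing in for the routine being measured. */
static volatile unsigned sink;

static void busy_work(void)
{
    for (unsigned i = 0; i < 100000u; i++)
        sink += i;
}

/* Average CPU seconds per call of `fn` over `repeats` calls. */
static double average_seconds(void (*fn)(void), unsigned repeats)
{
    assert(fn != NULL && repeats > 0);
    clock_t start = clock();
    for (unsigned i = 0; i < repeats; i++)
        fn();
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC / repeats;
}
```

Calling average_seconds(busy_work, 100) before and after a change gives two numbers to compare, which is the whole point: the measurement decides whether the optimization was worth it.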
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Re: GBA ASM

Post by Bregalad »

First of all, you seem to know only 6502 and 65c816 assembly at the time of your first post. You should be aware that every family of processors uses completely different mechanisms for its registers, instruction set and instruction notation. Think of every family of processors as having its own assembly language.

The ARM family is extremely popular, unfortunately not thanks to the GBA but thanks to telephones and tablets (which I despise, but that's another story). Thus it should be relatively easy to find ARM assembly language tutorials. You should learn ARM assembly from scratch even if you know 6502/65c816, because it's a completely different language. ARM is also one of the most complex assembly languages I know of, but also one of the most elegant. Don't worry, it's still much simpler to learn than pretty much any high-level language, though very difficult to master (I haven't mastered it).

THUMB is yet another different assembly language, but it's similar to ARM with features taken off. It was designed to be more compact for code density (ARM is extremely bad in this respect, as all instructions are 32-bit; with THUMB all instructions are 16-bit instead, but you need to use more instructions to do the same task).

Something which is especially complex with ARM is the conditional status flag setting (-s suffix), the conditional execution of the instructions (-eq, etc. suffix) and the stack management, along with the pseudo-instructions. The number of instructions themselves is quite small, but because of the pseudo-instructions and the suffixes, you might think there are way more instructions than there actually are when seeing some example code.
There are also many different addressing modes.

In addition to pseudo instructions, there is also the shorthand notations for instructions. For example add r1, r2 will be a shorthand for add r1, r1, r2, LSL #0.

For instance, "beq" looks like it's the exact same as the 6502, but there is no "beq" instruction on ARM, it's a "b" instruction with "-eq" suffix. You can also do "addeq" or "moveq" for instance. With THUMB you will probably feel more at home because the instructions are more 6502-like, as the "-s" suffix is implied in all instructions and only branches are conditional. However, we are still limited to a load-store architecture, we can't have an arbitrary constant in the code, and it's still only a compression of ARM assembly, which is more interesting to know for coding the really fast parts in assembly.
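
To show what the two suffixes do, here is a small C model of "subs" (subtract and set flags) and "addeq" (add only if the Z flag is set). It is a sketch of the flag behaviour for these two instructions only, with invented names, and not an emulator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool n;  /* result is negative            */
    bool z;  /* result is zero                */
    bool c;  /* carry: set when NO borrow     */
    bool v;  /* signed overflow               */
} Flags;

/* Models: subs rd, rn, rm  (rd = rn - rm, flags updated) */
static uint32_t subs(uint32_t rn, uint32_t rm, Flags *f)
{
    assert(f != NULL);
    uint32_t rd = rn - rm;
    f->n = (rd >> 31) != 0;
    f->z = rd == 0;
    f->c = rn >= rm;
    f->v = (((rn ^ rm) & (rn ^ rd)) >> 31) != 0;
    return rd;
}

/* Models: addeq rd, rn, rm  (rd changes only when Z is set) */
static uint32_t addeq(uint32_t rd, uint32_t rn, uint32_t rm, const Flags *f)
{
    assert(f != NULL);
    return f->z ? rn + rm : rd;
}
```

After subs(1, 1, &f) the Z flag is set, so a following addeq takes effect; after subs(2, 1, &f) it is clear, so the addeq leaves its destination alone, with no branch involved.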

If you are serious about coding a large-sized GBA game you don't want to do it all in assembly. You should also learn a mid/high level language (it doesn't have to be C).
Unlike the 6502 and its derivatives, the ARM processor is very "C language friendly". Depending on the task you're doing, today's optimizers can be really good for normal code that doesn't do anything fancy, and therefore there is no significant loss in either speed or code size when coding in C (or another mid-level language). However, for specific tasks, such as bit shifts or rotations, or additions with overflow checking in large loops, the C language can be terrible.
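
As an illustration of that last point, these are two operations that take one or two ARM instructions (a "ror" operand, or "adds" followed by a conditional instruction on the V flag) but have to be spelled out by hand in portable C. The function names are made up for this sketch, and whether a given compiler turns them back into the short instruction sequence is not guaranteed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate right; C has no rotate operator, so it is built from two shifts. */
static uint32_t rotr32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v >> n) | (v << (32 - n)) : v;
}

/* Signed add that reports overflow instead of invoking undefined
   behaviour. Returns false and leaves *out untouched on overflow. */
static bool add_i32_checked(int32_t a, int32_t b, int32_t *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT32_MAX - b) || (b < 0 && a < INT32_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

The overflow test has to be done before the addition with explicit comparisons, whereas in assembly the V flag comes for free with the add itself.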