A newbie's code evaluation

zanto
Posts: 57
Joined: Sun Mar 07, 2021 11:15 pm
Location: Rio de Janeiro, Brazil

A newbie's code evaluation

Post by zanto » Mon Mar 08, 2021 12:18 am

Hello! This week I started studying NES homebrew development. I've been reading the nesdev wiki, watching a bunch of tutorials on Youtube and reading random sites with tutorials, 6502 references and all that jazz. So far, I have made a very simple NES demo with a ship that can be moved using the d-pad and a scrolling background.

I've attached to this post what I've worked on so far. I wonder if it'd be possible for a more experienced developer to look at my source code and point out improvements for it. I know worrying about optimizations and things like that is probably not necessary since I've just started learning, but at least I'd like to know more and avoid developing bad habits, especially when it comes to drawing sprites and processing input, which are super basic things for any game. :) After reading about how limited the NES hardware is, I got kinda paranoid!

I have no idea if this is something I can ask other people to do, so I apologize if this ends up sounding rude.

A few problems I've run into and had to work around:
- I couldn't use a branch instruction because the label it pointed to was too far away.
- I couldn't come up with a generic way to draw 16x16 sprites. I have a routine to draw a player sprite, but if I have to do something like that for every 16x16 sprite, I'll end up using a lot of memory just for that.

Also, I only program as a hobby, so I apologize if my code is too messy...
Attachments
Secondgame.zip
(20.11 KiB) Downloaded 35 times

calima
Posts: 1329
Joined: Tue Oct 06, 2015 10:16 am

Re: A newbie's code evaluation

Post by calima » Mon Mar 08, 2021 12:45 am

1. ca65 has macros that automatically use a branch or a comparison-jump depending on the distance. "jne" instead of "bne", etc.
2. You don't have to use all caps everywhere.
3. You're doing everything manually. If that's intentional, then fine, but I would recommend using neslib.

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: A newbie's code evaluation

Post by DRW » Mon Mar 08, 2021 2:54 am

If he's learning the language, I wouldn't suggest using cc65's macros since they're not part of the language. It's better to learn and understand what is done in this case:

If a conditional jump is not possible because the label is too far away, you need to combine the opposite condition with a regular JMP.

Pseudo code:

Code:

// If this doesn't work because the label is too far away:
if a = b then
    goto label // BEQ label
end if
...

// use this:
if a ≠ b then
    goto nonEqual // BNE nonEqual
end if
goto label // JMP label
nonEqual:
...

Likewise, I wouldn't suggest using neslib. If you learn NES programming, you should understand what you're doing, so you shouldn't use a library that hides the mundane stuff from you.


For drawing any 16 x 16 sprite, you can use the same function that you have and a global pointer variable in which you first store the address of your current sprite.

I.e. instead of using

Code:

LDA SpriteData+1
STA SPRITE_MEM, Y
INY

you use

Code:

LDA (CurrentSprite), Y
INY
STA SPRITE_MEM, X
INX

This is called indirect addressing and is pretty much the equivalent of accessing arrays through a pointer in C:

Code:

char array[5] = { 2, 4, 6, 8, 10 };
char *pointer = array;
char index = 2;
char value;

value = array[2]; // LDA array + 2
value = array[index]; // LDA array, Y
value = pointer[index]; // LDA (pointer), Y

My game "City Trouble": www.denny-r-walter.de/city.htm

zanto
Posts: 57
Joined: Sun Mar 07, 2021 11:15 pm
Location: Rio de Janeiro, Brazil

Re: A newbie's code evaluation

Post by zanto » Mon Mar 08, 2021 1:18 pm

Thank you for the feedback! :D

calima wrote:
1. ca65 has macros that automatically use a branch or a comparison-jump depending on the distance. "jne" instead of "bne", etc.
2. You don't have to use all caps everywhere.
3. You're doing everything manually. If that's intentional, then fine, but I would recommend using neslib.
Yeah, for now I'm just trying to learn how 6502 assembly works, but it's great to know those things! Are those ca65 macros efficient in terms of bytes used and cycle count?
I'll also look up neslib. It sounds like it'll be helpful as I get more experienced.

------------------------------
DRW wrote:
If a conditional jump is not possible because the label is too far away, you need to combine the opposite condition with a regular JMP.
That's what I did when I ran into the problem with the branch instruction. I was just wondering if there was a more efficient way to do it.

------------------------------
DRW wrote:
For drawing any 16 x 16 sprite, you can use the same function that you have and a global pointer variable in which you first store the address of your current sprite.

Code:

LDA (CurrentSprite), Y
INY
STA SPRITE_MEM, X
INX

One problem with that suggestion is that I'm already using the X register to iterate through all the entities to draw on screen. It's also used in the sprite draw routine to get the x,y position of the entities:

Code:

LDA entities+Entity::ypos, X

Is there a way to make it work? I thought about pushing the value of X onto the stack, but that doesn't sound efficient...

Maybe I could also store the x,y in zero-page variables and load them in the sprite-drawing subroutine, but that also doesn't sound super efficient... In the zero page, I'd have these extra allocations and use A to load and store their values.
So, here's what I got so far

Code:

.segment "ZEROPAGE"
currentsprite:	.res 2
spritexpos:	.res 1
spriteypos:	.res 1

(...)
.segment "CODE"
(...)

LOAD_SPRITES:
	CPX #ENTITIES_SIZE
	BNE :+
	JMP DONE_LOADING_SPRITES
:
	LDA entities+Entity::type, X	
	CMP #EntityType::PlayerType
	BNE :+
	JMP LOAD_PLAYER_SPRITE
:
AFTER_SPRITE_LOAD:
	INX
	INX
	INX
	JMP LOAD_SPRITES
	
LOAD_PLAYER_SPRITE:
	LDA #<(SpriteData+$10)	; storing sprite data for player entity. It starts at byte $10 in SpriteData
	STA currentsprite
	LDA #>(SpriteData+$10)
	STA currentsprite+1
	JMP LOAD_16_16_SPRITE

LOAD_16_16_SPRITE:		; draw player sprite (all 4 tiles)
	;
	; top left
	LDY #$00
	LDA entities+Entity::ypos, X	; store y pos
	STA SPRITE_MEM, Y
	INY
	LDA (currentsprite), Y				; store tile id
	STA SPRITE_MEM, Y				
	INY
	LDA (currentsprite), Y				; store tile attr
	STA SPRITE_MEM, Y				
	INY
	LDA entities+Entity::xpos, X	; store x pos
	STA SPRITE_MEM, Y			
	INY
	; top right
	LDA entities+Entity::ypos, X	; store y pos
	STA SPRITE_MEM, Y
	INY
	LDA (currentsprite), Y				; store tile id
	STA SPRITE_MEM, Y				
	INY
	LDA (currentsprite), Y				; store tile attr
	STA SPRITE_MEM, Y				
	INY
	LDA entities+Entity::xpos, X	; store x pos
	CLC
	ADC #$08
	STA SPRITE_MEM, Y			
	INY
	; bottom left
	LDA entities+Entity::ypos, X	; store y pos
	CLC
	ADC #$08
	STA SPRITE_MEM, Y
	INY
	LDA (currentsprite), Y				; store tile id
	STA SPRITE_MEM, Y				
	INY
	LDA (currentsprite), Y				; store tile attr
	STA SPRITE_MEM, Y				
	INY
	LDA entities+Entity::xpos, X	; store x pos
	STA SPRITE_MEM, Y			
	INY
	; bottom right
	LDA entities+Entity::ypos, X	; store y pos
	CLC
	ADC #$08
	STA SPRITE_MEM, Y
	INY
	LDA (currentsprite), Y				; store tile id
	STA SPRITE_MEM, Y				
	INY
	LDA (currentsprite), Y				; store tile attr
	STA SPRITE_MEM, Y				
	INY
	LDA entities+Entity::xpos, X	; store x pos
	CLC
	ADC #$08
	STA SPRITE_MEM, Y			
	INY
	JMP AFTER_SPRITE_LOAD
	

DONE_LOADING_SPRITES:	
	CLI		
	LDA #%10001000	
	STA PPU_CTRL	
	LDA #%00011110	
	STA PPU_MASK	

------------------------------

Oh, another question: which IDE do you guys think is easiest to use? I've been using Notepad++ with 6502 syntax highlighting. I tried Nesicide, but I just ran into a bunch of problems and I couldn't figure out how to use it well.

unregistered
Posts: 1143
Joined: Thu Apr 23, 2009 11:21 pm
Location: cypress, texas

Re: A newbie's code evaluation

Post by unregistered » Mon Mar 08, 2021 3:56 pm

zanto wrote:
Mon Mar 08, 2021 1:18 pm
If a conditional jump is not possible because the label is too far away, you need to combine the opposite condition with a regular JMP.
That's what I did when I ran into the problem with the branch instruction. I was just wondering if there was a more efficient way to do it.
Hi zanto. :)

A more efficient way of doing it... well, you can chain branches.

Code:

$C200 lda $03
$C202 bne +toofar

...

$C284 +toofar

may work more efficiently like this:

Code:

$C200 lda $03
$C202 bne +closer

...

$C242 +closer bne +toofar

...

$C284 +toofar

Since the status flags only change relative to specific instructions, the zero flag not set after lda $03 will stay not set at +closer bc branches do NOT affect status flags. :) So, in the second code example you could rename +toofar; it’s not too far anymore when the first branch is chained through +closer.

(I’m using an asm6 fork... don’t know much about NESASM and ca65, but chaining branches should work there too.)

zanto wrote:
Mon Mar 08, 2021 1:18 pm
Oh, another question: which IDE do you guys think is easiest to use? I've been using Notepad++ with 6502 syntax highlighting. I tried Nesicide, but I just ran into a bunch of problems and I couldn't figure out how to use it well.
I’ve always used Programmer’s Notepad :D ; it has many syntax highlighting schemes, including Assembly... it is possible, though certainly not easy, to change any scheme (e.g. Assembly) into, as I did, asm6 highlighting using any desired colors. Programmer’s Notepad wasn’t recommended by others in a previous similar question, but you asked. :)

unregistered
Posts: 1143
Joined: Thu Apr 23, 2009 11:21 pm
Location: cypress, texas

Re: A newbie's code evaluation

Post by unregistered » Mon Mar 08, 2021 5:12 pm

unregistered wrote:
Mon Mar 08, 2021 3:56 pm
Since the status flags only change relative to specific instructions, the zero flag not set after lda $03 will stay not set at +closer bc branches do NOT affect status flags. :) So, in the second code example you could rename +toofar; it’s not too far anymore when the first branch is chained through +closer.
Also, knowing the limitations of branches can help you to arrange/rewrite your code so a simple branch (not a chain) can be used.

The assembler emits two bytes for a branch: F0 07 for beq + if the + label is 7 bytes past the end of the branch instruction. (F0 is the opcode for beq; 07 is its operand.)

The 6502 always reads that second byte, the 07, as a signed byte when it executes a branch. Therefore, the maximum distance allowed by a forward branch is 127 bytes, $7F, and the maximum distance allowed by a backward branch is -128, $80, both counted from the address right after the branch instruction.


If, say, you use a loop instead of unrolled code, the distance that forward branch has to cover may drop to 127 or less, and then you won’t get that “branch too long” error message when using a simple branch, as you started with. :)
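
For anyone who prefers to see the range rule as arithmetic, here is a rough sketch in C (the language already used for the pointer analogy earlier in the thread). It is not part of any assembler: branch_operand and branch_size are hypothetical names made up for this illustration, and the 5-byte figure assumes the usual workaround of an inverted 2-byte branch skipping over a 3-byte JMP.

```c
#include <stdint.h>

/* Hypothetical helper, not part of any assembler.
   Returns the operand byte (0..255) for a relative branch stored at
   branch_addr that targets target, or -1 if the target is out of range. */
int branch_operand(uint16_t branch_addr, uint16_t target)
{
    /* The CPU adds the offset to the address that follows the 2-byte branch. */
    int32_t distance = (int32_t)target - ((int32_t)branch_addr + 2);
    if (distance < -128 || distance > 127) {
        return -1;
    }
    return (uint8_t)distance;   /* two's complement encoding of the offset */
}

/* Bytes needed to reach target: 2 for a plain branch, otherwise 5 for the
   inverted branch (2 bytes) that skips over a 3-byte JMP. */
int branch_size(uint16_t branch_addr, uint16_t target)
{
    return branch_operand(branch_addr, target) >= 0 ? 2 : 5;
}
```

With a branch stored at $C202, branch_operand(0xC202, 0xC20B) gives the 07 from the beq example, while any target beyond $C283 or before $C184 is reported as out of range.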

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: A newbie's code evaluation

Post by DRW » Mon Mar 08, 2021 6:22 pm

zanto wrote:
Mon Mar 08, 2021 1:18 pm
I was just wondering if there was a more efficient way to do it.
Not that I'm aware of. BEQ etc. always take a one-byte operand, so they reach 127 bytes forward and 128 bytes backward since it's treated as a signed byte. Only JMP takes a two-byte address and therefore covers the entire address range.

zanto wrote:
Mon Mar 08, 2021 1:18 pm
One problem with that suggestion is that I'm already using the X register to iterate through all the entities to draw on screen.
Yeah, that's a general problem when you have more than two loops and you need a register for each of them.
In this case, you have to juggle with the register values.

You can either use additional variables as loop counters: LDX/LDY to read them, INX/INY to increment, STX/STY to write them back. Then you can use X/Y for another value.
Sure, it's a bit less efficient, but what do you want to do if you have only two registers, but three or more loop dimensions?

Or, as you said, you can put a value on the stack with TXA/TYA, PHA and then PLA, TAX/TAY. Again, surely less efficient than only using X and Y, but a necessity when you have two registers, but need more than two. TXA, PHA, PLA, TAX takes four bytes, the same as STX Variable, LDX Variable with a zero-page variable, and it is slower (11 cycles versus 6), but it doesn't need a dedicated buffer variable.
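
To put rough numbers on that trade-off, here is a small tally in C (the language used for the pointer analogy earlier in the thread). The byte and cycle figures are the documented NMOS 6502 values; the struct, array and function names are made up for this sketch, and the zero-page and absolute rows assume the buffer variable lives in the respective memory area.

```c
#include <stddef.h>

/* One instruction with its size and timing (illustrative type name). */
struct cost { const char *name; int bytes; int cycles; };

/* Save X on the stack and restore it again. */
const struct cost via_stack[] = {
    {"TXA", 1, 2}, {"PHA", 1, 3}, {"PLA", 1, 4}, {"TAX", 1, 2},
};
/* Save X in a zero-page variable and reload it. */
const struct cost via_zero_page[] = {
    {"STX zp", 2, 3}, {"LDX zp", 2, 3},
};
/* Same, but with the variable outside the zero page. */
const struct cost via_absolute[] = {
    {"STX abs", 3, 4}, {"LDX abs", 3, 4},
};

/* Sum of the instruction sizes in a sequence. */
int total_bytes(const struct cost *seq, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += seq[i].bytes;
    }
    return sum;
}

/* Sum of the cycle counts in a sequence. */
int total_cycles(const struct cost *seq, size_t count)
{
    int sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += seq[i].cycles;
    }
    return sum;
}
```

Under these figures the stack round trip comes to 4 bytes and 11 cycles, the zero-page variable to 4 bytes and 6 cycles, and an absolute variable to 6 bytes and 8 cycles.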

If you don't need a certain counter for an address offset, but simply as a counter, you don't even need to put it into X or Y. I.e. if you simply want a loop from 1 to 10, and then inside that loop, you need X and Y for different things, then you wouldn't even put the 1 to 10 counter into X or Y to begin with. You would simply do:

Code:

LDA #10
STA Counter
loop:
...
DEC Counter
BNE loop

In your specific case, I wouldn't sacrifice the X register for the outermost loop of iterating over the sprite entities. That outer loop is the one that changes its index values the fewest times, so you can use a variable here or use the stack to save the current entity index if possible. X and Y should be used for your inner loops.

There are several ways to do this. You need to experiment with them. If you can reduce the usage of counters and indices, that's fine. But if you can't, then variables and the stack are actually the intended way to do this.

For example, you won't always be able to use the same index for your current sprite entity array and for the index in the hardware sprites. After all, what if your game can draw arbitrary sprites in an arbitrary order on the screen? Then the offset for your 2x2 tiles entity will always be a value between 0 and 15, but the current index for the hardware sprites can be an index from 0 to 63.

Same when you decide to not store your sprites in the typical NES format (y, tile, attributes, x) since this might be a waste of memory: The x and y position is variable anyway, so why waste a byte for them in ROM that you never really need?
So, maybe your sprite is stored like this: Width, height, palette, tile1, tile2, ..., tileN.
In this case, you need the Y register as an index to your sprite entity array. And X as a totally independent index to the hardware sprites.
But you also need a counter based on the width: As long as it's not 0, you draw the sprites in a horizontal line. As soon as the counter is 0, you increment the vertical position, set the horizontal position back to its original value and then restart the counter from the width value.

For all these different things, you need individual ways:

The sprite entity index can be used by Y alone: Simply do LDY #0 at the start of the function, then use INY until the end.
The width counter is a counter that is only used via a variable: LDA Width, STA HorizontalCounter, DEC HorizontalCounter, BNE innerLoop, BEQ outerLoop.
The hardware sprite index is X combined with a variable:

Code:

LDX CurrentSpriteIndex

LDA CurrentYPosition
STA Sprites, X
INX

LDA (CurrentSpriteEntity), Y
INY
STA Sprites, X
INX

LDA CurrentAttributes
STA Sprites, X
INX

LDA CurrentXPosition
STA Sprites, X
INX

; Who knows what other code overwrites X between now and the next call of this function,
; so let's better save the index, unless you can 100 % rely on the fact
; that X will never be used for anything else during the entire sprite rendering.
STX CurrentSpriteIndex
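
The same idea can be sketched in C, in the spirit of the earlier pointer analogy. This is only an illustration under assumptions: the record layout (width, height, palette, then tiles row by row), the names draw_metasprite, oam_shadow and example_ship, and the 8-pixel tile spacing are all chosen for the example and are not taken from the attached demo.

```c
#include <stdint.h>

/* Shadow copy of the 64 hardware sprites, 4 bytes each
   (y, tile, attributes, x). */
uint8_t oam_shadow[256];

/* Example record: 2 tiles wide, 2 tiles high, palette 1, four tile numbers. */
const uint8_t example_ship[] = { 2, 2, 1, 0x10, 0x11, 0x20, 0x21 };

/* Writes one sprite record into oam_shadow starting at byte offset
   oam_index and returns the offset of the next free entry.
   oam_index plays the role of X, the tile pointer the role of (ptr),Y. */
uint8_t draw_metasprite(uint8_t oam_index, const uint8_t *sprite,
                        uint8_t x, uint8_t y)
{
    uint8_t width = sprite[0];
    uint8_t height = sprite[1];
    uint8_t attributes = sprite[2];   /* palette bits only in this sketch */
    const uint8_t *tile = &sprite[3];

    for (uint8_t row = 0; row < height; row++) {
        for (uint8_t col = 0; col < width; col++) {
            oam_shadow[oam_index++] = (uint8_t)(y + 8 * row); /* Y position */
            oam_shadow[oam_index++] = *tile++;                /* tile number */
            oam_shadow[oam_index++] = attributes;             /* attributes */
            oam_shadow[oam_index++] = (uint8_t)(x + 8 * col); /* X position */
        }
    }
    return oam_index;
}
```

Calling draw_metasprite(0, example_ship, 100, 50) fills the first four hardware sprite entries and returns 16, the offset where the next entity would continue.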

unregistered
Posts: 1143
Joined: Thu Apr 23, 2009 11:21 pm
Location: cypress, texas

Re: A newbie's code evaluation

Post by unregistered » Mon Mar 08, 2021 9:25 pm

DRW wrote:
Mon Mar 08, 2021 6:22 pm
zanto wrote:
Mon Mar 08, 2021 1:18 pm
I was just wondering if there was a more efficient way to do it.
Not that I'm aware of. BEQ etc. always take a one-byte operand, so they reach 127 bytes forward and 128 bytes backward since it's treated as a signed byte. Only JMP takes a two-byte address and therefore covers the entire address range.
DRW, did you miss my post?

DRW wrote:
Mon Mar 08, 2021 6:22 pm
Yeah, that's a general problem when you have more than two loops and you need a register for each of them.
In this case, you have to juggle with the register values.

You can either use additional variables as loop counters: LDX/LDY to read them, INX/INY to increment, STX/STY to write them back. Then you can use X/Y for another value.
Sure, it's a bit less efficient, but what do you want to do if you have only two registers, but three or more loop dimensions?

Or, as you said, you can put a value to the stack with TXA/TYA, PHA and then PLA, TAX/TAY. Again, surely less efficient than only using X and Y, but a necessity when you have two registers, but need more than two
Hmmm... actually there are more than two registers... the stack pointer is a register too. :)

If you make sure to save the stack pointer before the loop and restore the stack pointer afterwards, and your loop does not push or pull the stack, then feel free to also use that stack pointer register too. Access the stack pointer by:

Code:

txs ;copy X into the stack pointer register
tsx ;copy the stack pointer register into X

;txs and tsx are each 1 byte large and
   ;each take only 2 cycles

;CAUTION: if you fail to restore the stack pointer
;register to its old valid value, your running game
;WILL, most likely, crash, because the stack pointer
;register points to the current stack position (i.e.
;a jsr pushes two bytes holding the return address
;onto the stack where the stack pointer is pointing,
;and rts pulls two bytes back from that position).

edit: oh yes, and make sure interrupts are disabled. If vblank is the only interrupt, make sure your code using the stack pointer register does NOT run while vblank starts.

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: A newbie's code evaluation

Post by DRW » Tue Mar 09, 2021 2:09 am

unregistered wrote:
Mon Mar 08, 2021 9:25 pm
DRW wrote:
Mon Mar 08, 2021 6:22 pm
zanto wrote:
Mon Mar 08, 2021 1:18 pm
I was just wondering if there was a more efficient way to do it.
Not that I'm aware of. BEQ etc. always take a one-byte operand, so they reach 127 bytes forward and 128 bytes backward since it's treated as a signed byte. Only JMP takes a two-byte address and therefore covers the entire address range.
DRW, did you miss my post?
No, I didn't. But the question was directed to my post anyway. So I gave him my own answer.

Besides, I don't approve of your version:

Code:

$C200 lda $03
$C202 bne +closer

...

$C242 +closer bne +toofar

...

$C284 +toofar

Why should I clutter my code with branch instructions in the middle of completely unrelated code?
Now you have to think about the instruction before $C242 jumping to the instruction after $C242. This is unclean hack style.
This is cleaner:

Code:

lda $03
beq +conditionNotFulfilled
jmp +toofar
+conditionNotFulfilled

...

+toofar

unregistered wrote:
Mon Mar 08, 2021 9:25 pm
Hmmm... actually there are more than two registers... the stack pointer is a register too. :)
Yeah, exactly: Advise the beginner in the language to abuse the stack pointer, the one thing that takes care of your global program flow, to do something completely unintended that can screw up your program in spectacular ways if you aren't cautious. :roll:

This alone:
unregistered wrote:
Mon Mar 08, 2021 9:25 pm
edit: oh yes, and make sure interrupts are disabled. If vblank is the only interrupt, make sure your code using the stack pointer register does NOT run while vblank starts.
is the reason why something like that shouldn't even be considered in the first place. How do you ever want to make sure that your code will 100% guaranteed not enter vblank time?

So, yeah: Stack pointer as an additional register my ass.
There are three registers (A, X, Y) as well as the PHA/PLA feature for temporary storing and that's it. Apart from the initialization at the start of the program, don't ever touch the stack pointer!

unregistered
Posts: 1143
Joined: Thu Apr 23, 2009 11:21 pm
Location: cypress, texas

Re: A newbie's code evaluation

Post by unregistered » Tue Mar 09, 2021 10:26 am

DRW wrote:
Tue Mar 09, 2021 2:09 am
Why should I clutter my code with branch instructions in the middle of completely unrelated code?
Now you have to think about the instruction before $C242 jumping to the instruction after $C242. This is unclean hack style.
Well, the idea, for me, was this: if you are attempting to chain a bne, find another bne that’s already in the middle of your code and branches even lower, put a new label on that middle bne, and branch to that label. This has worked for me multiple times.

Yes, it would be kind of unwise to create a middle bne branch just for chaining. :)

edit: It’s possible to rearrange/edit code so that a middle beq can be turned into a bne, for chaining bne, but that can take some serious effort. Or, you might possibly make your initial bne a beq, and chain those beq. I’ve done this by allowing a zeropage flag to be on when it’s zero/off when it’s non-zero. There are many different ways to allow chaining of branches to work efficiently. :)

DRW wrote:
Tue Mar 09, 2021 2:09 am
There are three registers (A, X, Y) as well as the PHA/PLA feature for temporary storing and that's it. Apart from the initialization at the start of the program, don't ever touch the stack pointer!
Well, the stack pointer definitely sits in a register and using it correctly can create massive speed improvements. It has helped me multiple times too. :)

And vblank only runs after frame scanline 240... so, if, say, my stack pointer register code runs around frame scanline 120 and doesn’t take nearly 120 scanlines to complete, that’s good enough for me. :)
Last edited by unregistered on Tue Mar 09, 2021 11:20 am, edited 1 time in total.

stan423321
Posts: 43
Joined: Wed Sep 09, 2020 3:08 am

Re: A newbie's code evaluation

Post by stan423321 » Tue Mar 09, 2021 11:06 am

Generally I am a fan of unorthodox coding techniques, but this sounds like advocating for cooking pasta in sulfuric acid without explaining what's the point. You lose interrupt safety and subroutine options, and get... what exactly? STX/LDX over zero page are potentially slower than TSX/TXS, but they're so much more flexible that it almost doesn't matter. An NES game will spend most of its runtime with at least NMI on anyway, unless it's, like, terrible, so the safe interval is the sort of thing that can be slow if it wants to. This S misuse definitely doesn't sound like something a beginner should consider.

unregistered
Posts: 1143
Joined: Thu Apr 23, 2009 11:21 pm
Location: cypress, texas

Re: A newbie's code evaluation

Post by unregistered » Tue Mar 09, 2021 11:31 am

stan423321 wrote:
Tue Mar 09, 2021 11:06 am
Generally I am a fan of unorthodox coding techniques, but this sounds like advocating for cooking pasta in sulfuric acid without explaining what's the point. You lose interrupt safety and subroutine options, and get... what exactly? STX/LDX over zero page are potentially slower than TSX/TXS, but they're so much more flexible that it almost doesn't matter. An NES game will spend most of its runtime with at least NMI on anyway, unless it's, like, terrible, so the safe interval is the sort of thing that can be slow if it wants to. This S misuse definitely doesn't sound like something a beginner should consider.
In a previous post I mentioned that tsx/txs are each 1 byte large and each take only 2 cycles. Obviously stx/ldx on zero page are each 2 bytes large and each take 3 cycles. Saving that extra cycle on every pass of a loop that runs many times can make a genuine speed improvement. I’m not advocating S misuse, rather I was pointing out a register that is often overlooked and was trying to explain how to make use of it correctly. The op seems like he is very smart during his learning of programming NES. :)


edit: Oh, yes, you are correct, don’t jsr subroutines while using the stack pointer register. But after the loop is through, and the stack pointer has been restored, then jsr is possible again.

Loops that call subroutines should avoid txs.

zanto
Posts: 57
Joined: Sun Mar 07, 2021 11:15 pm
Location: Rio de Janeiro, Brazil

Re: A newbie's code evaluation

Post by zanto » Tue Mar 09, 2021 8:19 pm

Hi guys! Sorry it took me so long to answer. I was working on the next steps of my "game" (it's more like an experiment while I'm learning hahaha). Also real life won't let me have as much time for NES development as I wish I had :(
I updated my demo and it's now possible to create entities on the fly (ex: player shooting bullets). I also moved some code out of the NMI into the regular code structure.
Before I get to what I've done so far, some thoughts about the ideas you guys brought up! :)


----------------------

There was some discussion about the usage of stacks or not. I've been avoiding the stack simply because I read that it is expensive to use (I didn't even consider the issue with the vblank in this case, so thanks for pointing that out). I'm still afraid of doing anything that sounds too expensive, which maybe isn't the best way to approach things. I mean, if the stack exists, it's probably there because it's helpful. :P

And about the chaining branches VS adapting the code to use JMP, I opted for the latter, because it just seemed like a better way to do it (I think it makes the code cleaner and easier to read). I could change it if there's a more practical reason to do so (like chaining branches consumes less memory and clocks, which it doesn't seem like they do).

----------------------
DRW wrote:
For example, you won't always be able to use the same index for your current sprite entity array and for the index in the hardware sprites. After all, what if your game can draw arbitrary sprites in an arbitrary order on the screen? Then the offset for your 2x2 tiles entity will always be a value between 0 and 15, but the current index for the hardware sprites can be an index from 0 to 63.
This is one of the issues I faced when implementing a system that would allow me to create entities on the fly (and their respective sprites) without me having to worry about how they are being allocated in memory and stuff. In the end, I had to allocate a variable in the ZP that temporarily stores a value that was in X because I needed it in a loop to find a memory address to store a new sprite. I thought this would be a better solution than pushing to and pulling from the stack.

----------------------
DRW wrote:
Same when you decide to not store your sprites in the typical NES format (y, tile, attributes, x) since this might be a waste of memory: The x and y position is variable anyway, so why waste a byte for them in ROM that you never really need?
Oh, that's true! I've been storing all 4 bytes of sprite data on the ROM which is very wasteful considering that I don't need X and Y until the sprite is created on screen. Thanks for the insight! :D

----------------------

I'm attaching a new demo to this post. As I said before, I made a system that allows me to create new entities (e.g. enemies, bullets, animations etc) on the fly and creates a sprite for them and draws them on screen. It seems relatively convenient for me to use, since I won't have to do a bunch of stuff needed for these things manually every time I need something new to appear on screen. In the demo, you can move the player with the d-pad and create bullets with the A button. They don't move yet. There's also a "ball" on screen which is just another entity.

A few things it'd be nice to hear your thoughts about:
- The entity/sprite management system. This will be a big issue in the games that I'd like to develop on the NES, so having a grasp on what's the best approach to this kind of thing would be super helpful!
- Code that goes in the NMI routine versus the "normal" code section. As I said before, I moved some code that was in the NMI routine to the normal code block. Should I always make absolutely sure that the NMI routine is 100% optimized or are there situations when it's okay to have normal code there? Should the NMI routine in the source code of my demo be further optimized? If so, how?
- I've been using some JMP commands to skip functions that I don't want to be executed unless they are called from other functions. These functions are kinda like class methods. So you'll see things like "JMP SKIP_ENTITY_FUNCTIONS" or "JMP SKIP_SPRITE_FUNCTIONS". Is this the best way to do this kind of thing?
- My next step is to code a behavior system for each entity (e.g. bullets always go up and get destroyed when they touch an enemy or reach the top of the screen; an enemy that moves left and right and maybe shoots bullets, etc). Initially my idea is that each instance of an entity (i.e. each instance of the enemy entity, or each instance of a bullet) would have local variables associated with it (e.g. one variable for the state which defines the kind of behavior it'll do, one for HP, some kind of internal counter, etc). Each type of entity may require a different number of variables and if there are a lot of instances (e.g. 20), it may be too much information to store on the ZP, so I may end up having to use the rest of the RAM (the ZP also stores data that is not related to entities, of course). But if I used RAM, I still wouldn't be sure how to manage the memory that gets allocated when an entity is created and freed when an entity is destroyed, since I don't have anything that can manage memory for me. It'd be easier if entities used the same amount of memory, but since that's not the case, it's something I'll have to think about. Too bad assembly doesn't have linked lists implemented by default :P
So how do you usually code this kind of thing in your NES games? It'd be very insightful to hear what more experienced devs have to say about a behavior system! :)
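
One common way around the allocation problem, offered here as an assumption and not as what the attached demo does, is to give every entity the same fixed-size slot, sized for the largest type, and to mark unused slots with a reserved type value. The sketch below is in C (as in the pointer analogy earlier in the thread); MAX_ENTITIES, ENTITY_FREE, the field list and the function names are all made up for the illustration.

```c
#include <stdint.h>

#define MAX_ENTITIES 16
#define ENTITY_FREE  0   /* type 0 marks an unused slot (assumption) */

/* Every entity gets the same fields, even if a given type ignores some. */
struct entity {
    uint8_t type;    /* ENTITY_FREE, or a game-defined type such as a bullet */
    uint8_t x, y;
    uint8_t state;   /* which behavior the entity is currently running */
    uint8_t hp;
    uint8_t timer;   /* general-purpose counter */
};

struct entity entities[MAX_ENTITIES];

/* Claims the first free slot and returns its index, or -1 if all are used. */
int entity_spawn(uint8_t type, uint8_t x, uint8_t y)
{
    for (int i = 0; i < MAX_ENTITIES; i++) {
        if (entities[i].type == ENTITY_FREE) {
            /* Remaining fields (state, hp, timer) start at zero. */
            entities[i] = (struct entity){ .type = type, .x = x, .y = y };
            return i;
        }
    }
    return -1;
}

/* Frees a slot so that a later spawn can reuse it. */
void entity_destroy(int index)
{
    entities[index].type = ENTITY_FREE;
}
```

Nothing is ever allocated or freed at run time here; destroying an entity only resets its type byte, and the cost is some wasted bytes in slots whose type does not need every field.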

Oh, and one last question: most NES game source examples I've found are written in C. Are there any examples written in ASM? I couldn't find many of those...
Attachments
Secondgame v2.zip
(25.17 KiB) Downloaded 21 times

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: A newbie's code evaluation

Post by DRW » Wed Mar 10, 2021 1:25 am

unregistered wrote:
Tue Mar 09, 2021 10:26 am
Well, the idea, for me, was this: if you are attempting to chain a bne, find another bne that’s already in the middle of your code and branches even lower, put a new label on that middle bne, and branch to that label.
None of this changes in any way my statement that it's a dirty hack job. Now you have to remember that your BEQ in the middle of the code is important for some other, unrelated code. And if you ever change it, you have to change other code.

unregistered wrote:
Tue Mar 09, 2021 10:26 am
There are many different ways to allow chaining of branches to work efficiently. :)
Yeah, or you simply use one single JMP and you have clean, robust code.

unregistered wrote:
Tue Mar 09, 2021 10:26 am
Well, the stack pointer definitely sits in a register and using it correctly can create massive speed improvements. It has helped me multiple times too. :)
Really? Massive? If you're that much in need of optimization that you have to rely on this kind of suicidal code style, maybe your code has some other bottlenecks.

unregistered wrote:
Tue Mar 09, 2021 10:26 am
And vblank only runs after frame scanline 240... so, if, say, my stack pointer register code runs around frame scanline 120 and doesn’t take nearly 120 scanlines to complete, that’s good enough for me. :)
If you want to write functions that depend on knowing in advance at which scanline they will run, so that you can never really refactor your code without having to take massive side effects into consideration, then go ahead.

But my point stands: Those are dirty hack jobs that I myself wouldn't approve of.

unregistered wrote:
Tue Mar 09, 2021 11:31 am
I’m not advocating S misuse, rather I was pointing out a register that is often overlooked and was trying to explain how to make use of it correctly.
It isn't "overlooked". It is intended for something completely different than A, X and Y. Advocating the usage of this is exactly what you say: Misuse.

unregistered wrote:
Tue Mar 09, 2021 11:31 am
The op seems like he is very smart during his learning of programming NES. :)
There's a difference between smart programming and doing something just because the hardware allows it. If the usage of a certain feature in a function depends on whether the function is called at around scanline 100 or scanline 200, this is the opposite of smart. Smart is writing functions that are robust and free of side effects.
My game "City Trouble": www.denny-r-walter.de/city.htm

DRW
Posts: 2070
Joined: Sat Sep 07, 2013 2:59 pm

Re: A newbie's code evaluation

Post by DRW » Wed Mar 10, 2021 2:09 am

zanto wrote:
Tue Mar 09, 2021 8:19 pm
There was some discussion about the usage of stacks or not.
It's not about using stacks or not. It's about abusing the stack pointer register as an additional register that you can use like A, X and Y.

I.e. you can write stuff to the stack and pull stuff from it. What you shouldn't do is manipulate the stack pointer register yourself.
You initialize the stack pointer register once at the beginning. But from then on, you should never write any values to it again.

I.e. if register S has the number 25 in it, then you don't take the number 25, save it in a variable, and then fill S with your own temporary counter values, only to write the 25 back into S at the end.
You don't do this because the 25 is a number that tells the processor where the return address for the next JSR gets written and where the return address for the next RTS is located. If you use S as temporary storage and then call another function, your program flow is dead. If you use S as temporary storage and the NMI hits, your program flow is dead.

This is what you should never do: use TXS anywhere except at startup. Because that's just insane.

Of course you can still use the stack itself via PHA/PLA.
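
To make the point concrete, here is a toy C model of the 6502 stack. It is only a sketch: the `sim_*` helpers and both `call_with_*` functions are made up for illustration, and the "interrupt" is just a function call. It shows that PHA/PLA leave the return address alone, while borrowing S as a scratch register lets an interrupt overwrite the return address if the scratch value happens to point at it.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: page $01 of RAM plus the 8-bit S register. */
static uint8_t ram[0x200];
static uint8_t S;

static void push(uint8_t v) { ram[0x100 + S] = v; S--; }
static uint8_t pull(void)   { S++; return ram[0x100 + S]; }

/* JSR stores (return address - 1), high byte first. */
static void sim_jsr(uint16_t return_addr)
{
    uint16_t a = (uint16_t)(return_addr - 1);
    push((uint8_t)(a >> 8));
    push((uint8_t)(a & 0xFF));
}

/* RTS pulls the low byte, then the high byte, and adds 1. */
static uint16_t sim_rts(void)
{
    uint16_t lo = pull();
    uint16_t hi = pull();
    return (uint16_t)(((hi << 8) | lo) + 1);
}

/* An interrupt pushes PCH, PCL and the status byte; RTI pulls them again. */
static void sim_interrupt(uint16_t pc, uint8_t status)
{
    push((uint8_t)(pc >> 8));
    push((uint8_t)(pc & 0xFF));
    push(status);
    /* ...the handler body would run here... */
    (void)pull();
    (void)pull();
    (void)pull();
}

/* Normal stack use: PHA/PLA leave S balanced, so RTS finds its address. */
uint16_t call_with_balanced_stack(uint16_t return_addr)
{
    S = 0xFF;               /* LDX #$FF / TXS at reset */
    sim_jsr(return_addr);
    push(0x42);             /* PHA */
    (void)pull();           /* PLA */
    assert(S == 0xFD);      /* S is back where JSR left it */
    return sim_rts();
}

/* Misuse: S is saved, borrowed as scratch and restored, but an
   interrupt arrives while the scratch value is in S. */
uint16_t call_with_borrowed_s(uint16_t return_addr, uint8_t scratch)
{
    uint8_t saved;
    S = 0xFF;
    sim_jsr(return_addr);          /* return address sits at $01FF/$01FE */
    saved = S;                     /* TSX / STX saved */
    S = scratch;                   /* LDX scratch / TXS */
    sim_interrupt(0x9000, 0x24);   /* writes 3 bytes wherever S points */
    S = saved;                     /* LDX saved / TXS */
    return sim_rts();
}
```

With a scratch value of $80 the interrupt happens to write into an unused part of the stack page and the routine still returns correctly; with $FF it lands on the stored return address. Whether the code works depends on a value that has nothing to do with the stack, which is exactly the fragility described above.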

zanto wrote:
Tue Mar 09, 2021 8:19 pm
I could change it if there's a more practical reason to do so (like chaining branches consumes less memory and clocks, which it doesn't seem like they do).
It might be shorter, but sometimes you need to make a decision: Do you really need to save two or three cycles on a branch instruction so badly that it's worth making your code dirty?

zanto wrote:
Tue Mar 09, 2021 8:19 pm
The entity/sprite management system. This will be a big issue in the games that I'd like to develop on the NES, so having a grasp on what's the best approach to this kind of thing would be super helpful!
This is in fact something that might be a bit more complicated depending on how you want to show your sprites.
I know you think of a simple loop where you simply blast your data into the hardware sprites, but this might not suffice.

Think of the situation when you want to flip your sprites. It's not enough to flip the hardware sprites themselves. You also have to arrange the tiles of a multi-tile sprite from right to left.
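
A minimal sketch of that idea for a 16x16 sprite made of four 8x8 tiles. The struct mirrors the hardware OAM byte order (Y, tile, attributes, X) and bit 6 of the attribute byte is the horizontal flip flag; the function name, the tile ordering and the parameters are assumptions made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define OAM_FLIP_H 0x40u   /* attribute bit 6: flip the tile horizontally */

typedef struct { uint8_t y, tile, attr, x; } OamEntry;  /* OAM byte order */

/* Fills out[0..3] with the four hardware sprites of a 16x16 sprite.
   tiles[] is in reading order: top-left, top-right, bottom-left,
   bottom-right (an assumed layout). */
void build_metasprite_16x16(OamEntry out[4], const uint8_t tiles[4],
                            uint8_t x, uint8_t y, uint8_t palette, int flip_h)
{
    uint8_t i;
    assert(palette < 4);               /* sprite palettes are 0..3 */
    for (i = 0; i < 4; i++) {
        uint8_t col = (uint8_t)(i & 1u);   /* 0 = left, 1 = right */
        uint8_t row = (uint8_t)(i >> 1);   /* 0 = top, 1 = bottom */
        if (flip_h) {
            col ^= 1u;   /* mirrored: the left tile is drawn on the right */
        }
        out[i].y = (uint8_t)(y + row * 8);
        out[i].tile = tiles[i];
        out[i].attr = (uint8_t)(palette | (flip_h ? OAM_FLIP_H : 0u));
        out[i].x = (uint8_t)(x + col * 8);
    }
}
```

Each tile gets the flip bit and the columns swap places. Setting only the flip bit would mirror every 8x8 tile in place and leave the halves of the character on the wrong sides.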

Or think about this: You can store the palette values for four sprite tiles in one byte since each palette value only requires two bits. Hence, you need a mechanism that shifts the byte, masks off the lowest two bits and applies them to the sprite attributes of the current tile.
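
The shift-and-mask step could look roughly like this. It is a sketch, not code from anyone's game: both function names are invented, and putting tile 0 in the lowest two bits is an arbitrary choice. The palette number lives in bits 0-1 of a sprite's attribute byte.

```c
#include <assert.h>
#include <stdint.h>

/* Packs four 2-bit palette numbers into one byte; tile 0 goes into
   the lowest two bits. */
uint8_t pack_palettes(uint8_t p0, uint8_t p1, uint8_t p2, uint8_t p3)
{
    assert(p0 < 4 && p1 < 4 && p2 < 4 && p3 < 4);
    return (uint8_t)(p0 | (p1 << 2) | (p2 << 4) | (p3 << 6));
}

/* Writes the packed palettes into bits 0-1 of four attribute bytes,
   leaving the other attribute bits (flip, priority) untouched. */
void apply_palettes(uint8_t attrs[4], uint8_t packed)
{
    uint8_t i;
    for (i = 0; i < 4; i++) {
        attrs[i] = (uint8_t)((attrs[i] & 0xFCu) | (packed & 3u));
        packed >>= 2;   /* move the next tile's palette into the low bits */
    }
}
```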

The sprite rendering function might become quite long since it might require a lot of logical calculations. But this is not an NES-specific topic. This is just programming in general: How to prepare your sprite data before drawing it into the hardware sprites.

zanto wrote:
Tue Mar 09, 2021 8:19 pm
Codes that go in the NMI routine or the "normal" code section. As I said before I transferred some code that was in the NMI routine to the normal code block. Should I always make absolutely sure that the NMI routine is 100% optimized or are there situations when it's okay to have normal code there?
Logic in the game loop, graphic updates in NMI. But preparation of graphic updates still in the game loop.
This is the go-to article for this kind of topic:
https://wiki.nesdev.com/w/index.php/The_frame_and_NMIs
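
A rough C model of that split, assuming a shadow copy of the sprite memory in RAM and a "frame ready" flag. All names are hypothetical, and `memcpy` stands in for what would be an OAM DMA through $4014 on real hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OAM_SIZE 256

static uint8_t shadow_oam[OAM_SIZE];   /* RAM copy the game loop writes to */
static uint8_t ppu_oam[OAM_SIZE];      /* stand-in for the PPU's sprite memory */
static volatile uint8_t frame_ready;   /* set by the game loop, cleared in NMI */

/* Game loop side: run the logic and PREPARE the graphics data in RAM. */
void game_logic_step(uint8_t player_x, uint8_t player_y)
{
    /* The loop is expected to wait for the NMI before the next frame. */
    assert(frame_ready == 0);
    shadow_oam[0] = player_y;
    shadow_oam[1] = 0x01;   /* tile index (arbitrary for the sketch) */
    shadow_oam[2] = 0x00;   /* attributes */
    shadow_oam[3] = player_x;
    frame_ready = 1;        /* tell the NMI that the buffer is complete */
}

/* NMI side: only copy already prepared data, no game logic here. */
void nmi_handler(void)
{
    if (!frame_ready) {
        return;   /* logic not finished yet: keep showing the old frame */
    }
    memcpy(ppu_oam, shadow_oam, sizeof ppu_oam);
    frame_ready = 0;
}

/* Small accessor so the result can be inspected. */
uint8_t ppu_oam_byte(uint8_t index)
{
    return ppu_oam[index];
}
```

The flag keeps the NMI from uploading a half-written buffer when the game logic takes longer than one frame.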


Regarding the usage of variables:

For entities on screen, I have something like this:

Code: Select all

struct Characters
{
    byte Type[MaxNumberOfCharacters];
    byte X[MaxNumberOfCharacters];
    byte Y[MaxNumberOfCharacters];
    byte FacingDirection[MaxNumberOfCharacters];
    byte Value1[MaxNumberOfCharacters];
    byte Value2[MaxNumberOfCharacters];
    byte Value3[MaxNumberOfCharacters];
    /* ...further Value arrays as needed... */
} AllCharacters;
I have as many Value variables as the entity type with the most values needs.
And MaxNumberOfCharacters is of course the maximum number of entities that you can have on the screen at once.

If you load, for example, an opponent into slot number 3, you set the Type[3] variable to TypeOpponent, you set its X[3] and Y[3] position. And then you set everything that is opponent-specific to the Value variables.
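
As a sketch of that loading step: the value 8 for MaxNumberOfCharacters, the `TypeNone` marker for an empty slot and both function names are assumptions for this example, and only one Value array is shown.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t byte;

#define MaxNumberOfCharacters 8            /* assumed value */

enum { TypeNone = 0, TypeOpponent = 1 };   /* TypeNone = empty slot (assumed) */

struct Characters
{
    byte Type[MaxNumberOfCharacters];
    byte X[MaxNumberOfCharacters];
    byte Y[MaxNumberOfCharacters];
    byte Value1[MaxNumberOfCharacters];
};

static struct Characters AllCharacters;    /* zeroed: every slot is TypeNone */

/* Returns the first empty slot, or MaxNumberOfCharacters if all are used. */
byte FindFreeSlot(void)
{
    byte i;
    for (i = 0; i < MaxNumberOfCharacters; i++) {
        if (AllCharacters.Type[i] == TypeNone) {
            return i;
        }
    }
    return MaxNumberOfCharacters;
}

/* Loads an opponent into the given slot. */
void LoadOpponent(byte slot, byte x, byte y, byte energy)
{
    assert(slot < MaxNumberOfCharacters);
    AllCharacters.Type[slot] = TypeOpponent;
    AllCharacters.X[slot] = x;
    AllCharacters.Y[slot] = y;
    AllCharacters.Value1[slot] = energy;   /* opponent-specific meaning */
}
```

Keeping each property in its own array means a slot number works directly as an index, which maps well onto the 6502's indexed addressing with X or Y.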

I use macros to rename the generic variables accordingly:

Code: Select all

#define OppEnergy Value1
#define OppMovementPattern Value2
#define OppMovementPatternIndex Value3

// Weapons:
#define WpnHorizontalMovingDirection Value1
#define WpnVerticalMovingDirection Value2
#define WpnSpeed Value3
All of this can also be done in Assembly code.

Now you have the minimum necessary variables for entities on screen.
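
Here is a small, hedged example of how the aliases might be used when updating all entities. The type constants, the array size of 4 and the update rules (an opponent with zero energy is removed, a weapon moves right by its speed) are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t byte;

#define MaxNumberOfCharacters 4   /* assumed value */

enum { TypeNone, TypeOpponent, TypeWeapon };   /* assumed type constants */

static struct
{
    byte Type[MaxNumberOfCharacters];
    byte X[MaxNumberOfCharacters];
    byte Value1[MaxNumberOfCharacters];
    byte Value3[MaxNumberOfCharacters];
} AllCharacters;

/* The same generic arrays get a type-specific name. */
#define OppEnergy Value1
#define WpnSpeed  Value3

void UpdateCharacters(void)
{
    byte i;
    for (i = 0; i < MaxNumberOfCharacters; i++) {
        switch (AllCharacters.Type[i]) {
        case TypeNone:
            break;
        case TypeOpponent:
            /* Here Value1 means "energy". */
            if (AllCharacters.OppEnergy[i] == 0) {
                AllCharacters.Type[i] = TypeNone;   /* free the slot */
            }
            break;
        case TypeWeapon:
            /* Here Value3 means "speed". */
            AllCharacters.X[i] = (byte)(AllCharacters.X[i] + AllCharacters.WpnSpeed[i]);
            break;
        default:
            assert(0 && "unknown entity type");
            break;
        }
    }
}
```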


For local variables, i.e. variables that are only needed within a function or as function parameters, I did this:
I reserved several groups of variables:
Byte1A, Byte2A, Byte3A etc.
Byte1B, Byte2B, Byte3B etc.
etc.

If a function doesn't call another function, its name gets a postfix "_a". And its local variables are taken from the A group.
If a function calls another function that is named "_a", then the current function uses the B group for its local variables and gets the name "_b". (To make sure that the "_a" function doesn't overwrite the local variables of the "_b" function.)
And so on: Check which functions your current function calls. And put your new function into the next group.

Again: Variable renaming via aliases:

Code: Select all

void MyFunction_b(void)
{
#define i Byte1B
#define max Byte2B

    /* ... */
    i = 5;
    OtherFunction_a();
    /* ... */
    max = 23;
    /* ... */
    
#undef i
#undef max
}
If MyFunction_b ever calls another function that already has a "_b", you need to rename MyFunction_b to MyFunction_c and change the macro definitions to Byte1C. Then let the compiler find the errors (since every occurrence of MyFunction_b is now incorrect) and potentially also change the postfixes and variable references of the functions that call MyFunction_b/MyFunction_c.

This requires a bit of manual caution:
If you call a function with a "_c", but your own function already has a "_c" and you forget to update your function name to "_d", you might be in trouble.
Likewise, if you have a "_c" function, but your variable names still point to Byte1B or Byte1D.

But with the postfixes in the names, you can pretty easily check for each function individually whether it conforms to the convention. There aren't any global side effects to keep in mind. Every manual inspection is confined to one function at a time.
So, if you have validated a single function as being named correctly and as using the correct group of variables and you don't change that function anymore, then no change anywhere else in the code can invalidate this already validated function.
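
A compilable sketch of the convention, including the failure case it is meant to prevent. The three functions are made up for this example, and the last one deliberately breaks the rule: it calls an "_a" function while using group A itself.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t byte;

/* Shared "local" variables, one group per call depth. */
static byte Byte1A, Byte2A;
static byte Byte1B, Byte2B;

/* Leaf function: calls nothing, so it uses group A. */
byte SumTo_a(byte n)
{
#define i Byte1A
#define total Byte2A
    assert(n <= 22);   /* 1 + 2 + ... + 22 = 253 still fits in a byte */
    total = 0;
    for (i = 1; i <= n; i++) {
        total = (byte)(total + i);
    }
    return total;
#undef i
#undef total
}

/* Calls an "_a" function, so it uses group B. */
byte SumOfSums_b(byte n)
{
#define i Byte1B
#define total Byte2B
    total = 0;
    for (i = 1; i <= n; i++) {
        total = (byte)(total + SumTo_a(i));
    }
    return total;
#undef i
#undef total
}

/* Deliberately wrong: calls an "_a" function but uses group A itself.
   The ordinary C local 'calls' exists only to make the bug observable. */
byte CountCalls_wrong(byte n)
{
#define i Byte1A
    byte calls = 0;
    for (i = 1; i <= n; i++) {
        (void)SumTo_a(3);   /* overwrites Byte1A, i.e. our loop counter */
        calls++;
    }
    return calls;
#undef i
}
```

`SumOfSums_b(3)` adds up 1 + 3 + 6 as intended. `CountCalls_wrong(3)` should call the helper three times, but the helper leaves the shared counter at 4, so the loop ends after a single call.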
