Best practices for instancing?

pwnskar
Posts: 119
Joined: Tue Oct 16, 2018 5:46 am
Location: Gothenburg, Sweden

Best practices for instancing?

Post by pwnskar »

Hey! I'm curious about different techniques to tackle instanced entities on the 6502.

Using asm6 where there are no structs, I've come up with my own solution where I define constants for the index of the properties of my data types:

Code: Select all

PROJECTILE_STATE		= 0
PROJECTILE_POS_X_LO	= 1
PROJECTILE_POS_X_HI	= 2
PROJECTILE_POS_Y_LO	= 3
PROJECTILE_POS_Y_HI	= 4
PROJECTILE_...
I then have a constant defining what page my projectiles will be on and use that to loop through all of them.

Code: Select all

lda #PROJECTILES_RAM+#PROJECTILE_STATE, x
; do some game logic

lda #PROJECTILES_RAM+#PROJECTILE_POS_X_LO, x
; do some other game logic
This approach works for me, but when I want to deal with data types that don't have a whole page reserved for them and might also cross pages, I end up writing a lot of code that indexes through indirect addresses.

Code: Select all

lda #<(player1)
ldx #>(player1)

jsr CheckSomeCollision

...

CheckSomeCollision:

sta curr_pointer_lo
stx curr_pointer_hi

ldy #0

...

lda curr_pointer_lo
clc
adc #PLAYER_POS_X_LO
sta curr_pointer_lo
lda curr_pointer_hi
adc #0
sta curr_pointer_hi

lda (curr_pointer_lo), y
; do some stuff

...

rts
Depending on how complex the logic is, I can end up adding to and subtracting from those curr_pointer values a lot, and I can't help but wonder if I'm doing it right. This approach must be quite expensive performance-wise?

So another approach I've tried is to have variables dedicated to be parameters or temp values:

Code: Select all

lda player1_state
sta curr_player_state
lda player1_posx_lo
sta curr_player_posx_lo
lda player1_posx_hi
sta curr_player_posx_hi
; I have quite a few variables to do this with

jsr UpdatePlayer

; Now I have to pass the values back to the correct player variables

lda curr_player_state
sta player1_state
lda curr_player_posx_lo
sta player1_posx_lo
lda curr_player_posx_hi
sta player1_posx_hi
; and so on...
This approach works, but I basically end up having to reserve RAM for three players when there are actually only two playable ones in the game. One upside, though, is that I can reuse my "curr_player_" variables for other data types before or after I update my players.

Both approaches work, but having no previous experience in assembly and having failed to find anything covering this subject on Google, I'm really curious about how other people here on the forum tackle this on the 6502.

Cheers!
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Best practices for instancing?

Post by tepples »

The usual solution on 6502 is to make parallel arrays, one for each byte of properties of an entity. Then you can set X to, say, 3 to access all properties of entity ID 3.

Code: Select all

NUM_ACTORS equ 16

actor_xsub: dsb NUM_ACTORS  ; low byte of 24-bit coordinate
actor_x: dsb NUM_ACTORS
actor_xscr: dsb NUM_ACTORS  ; high byte
actor_ysub: dsb NUM_ACTORS
actor_y: dsb NUM_ACTORS
actor_yscr: dsb NUM_ACTORS
actor_facing: dsb NUM_ACTORS
actor_frame: dsb NUM_ACTORS
actor_frame_sub: dsb NUM_ACTORS
actor_health: dsb NUM_ACTORS
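A rough sketch of how a loop over these arrays could look, in asm6 syntax. It relies on the declarations above. The routine name UpdateActors and the convention that zero health marks an empty slot are assumptions made for illustration, not something stated in this thread:

```asm
; Hypothetical per-frame loop over the parallel arrays declared above.
; Assumption: actor_health = 0 means the slot is unused.
UpdateActors:
  ldx #NUM_ACTORS-1     ; start at the last slot and count down
@loop:
  lda actor_health,x
  beq @next             ; skip empty slots
  inc actor_x,x         ; placeholder logic: move one pixel to the right
  bne @next
  inc actor_xscr,x      ; actor_x wrapped to 0, so carry into the high byte
@next:
  dex
  bpl @loop             ; works as long as NUM_ACTORS is 128 or less
  rts
```

Counting X down lets the dex/bpl pair end the loop without a separate cpx.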
Drakim
Posts: 97
Joined: Mon Apr 04, 2016 3:19 am

Re: Best practices for instancing?

Post by Drakim »

One trick that's quite common is to store your entities' live values in "parallel" rather than as blobs. So instead of having projectile 1's state, x_hi, x_lo, y_hi, y_lo.... and then projectile 2's state, x_hi, x_lo, y_hi, y_lo.... you store it like this:

Code: Select all

    proj_state:               DSB 30
    proj_x_hi:                DSB 30
    proj_x_lo:                DSB 30
    proj_y_hi:                DSB 30
    proj_y_lo:                DSB 30
So here I've allocated room for up to 30 projectiles at the same time in RAM. But instead of each projectile's state being bunched together into a blob, the properties are laid out in parallel arrays. To read one specific projectile's values, you use the X register as an offset:

Code: Select all

    LDX #5             ; We want to look at the 6th projectile
    LDA proj_x_lo,X    ; Load the 6th projectile's low x position
Thus, when you want to call JSR UpdatePlayer, you just make sure that the X register points to the "slot" of the player you want to update. A huge advantage is that you have much more control over where and how your memory is located, you can make sure you always stay within one page, and you can have up to 256 projectiles before you run into trouble.
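For instance, the copy-in/copy-out player update from the first post might shrink to something like the following. This is only a sketch: the array names, the enum address $0300, the UpdateAllPlayers routine and the one-unit movement are all made up for the example.

```asm
NUM_PLAYERS equ 2

enum $0300                        ; assumed free RAM, pick your own address
  player_state:   dsb NUM_PLAYERS
  player_posx_lo: dsb NUM_PLAYERS
  player_posx_hi: dsb NUM_PLAYERS
ende

UpdateAllPlayers:
  ldx #0                ; X = slot of player 1
  jsr UpdatePlayer
  ldx #1                ; X = slot of player 2
  jsr UpdatePlayer
  rts

; In: X = player slot. X is preserved.
UpdatePlayer:
  lda player_posx_lo,x
  clc
  adc #1                ; placeholder logic: move one unit to the right
  sta player_posx_lo,x
  lda player_posx_hi,x
  adc #0                ; propagate the carry into the high byte
  sta player_posx_hi,x
  rts
```

No curr_player_ copies are needed, because the routine reads and writes the selected player's bytes directly.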

Edit: tepples beat me to it :beer:
pwnskar
Posts: 119
Joined: Tue Oct 16, 2018 5:46 am
Location: Gothenburg, Sweden

Re: Best practices for instancing?

Post by pwnskar »

I think I follow what you're saying, and I'm looking forward to what improvements I can make to my code by storing values in parallel. One thing I guess I have to make sure is that none of my properties cross a page. For that I guess I would have to use indirect addressing with the y register.

The approach you guys described feels very similar to what I'm doing in my first example, but I totally get how it will make things easier to address an entity by its ID rather than with something like ldx #PLAYER_1_ENTITY_ID*PLAYER_ENTITY_SIZE.

EDIT: You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page? That would be helpful in this case.

Cheers!
Drakim
Posts: 97
Joined: Mon Apr 04, 2016 3:19 am

Re: Best practices for instancing?

Post by Drakim »

pwnskar wrote:You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page? That would be helpful in this case.
I don't think so, but it shouldn't be too hard to write an ASM6 macro that does this for you, something like this maybe:

Code: Select all

MACRO CHECKSAMEPAGE InputLabel
  IF >InputLabel != >$
    ERROR "This is not the same page"
  ENDIF
ENDM
Usable like this:

Code: Select all

    MyArray:      DSB 50
    CHECKSAMEPAGE MyArray
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Best practices for instancing?

Post by tepples »

pwnskar wrote:One thing I guess I have to make sure is that none of my properties cross a page.
Why? An indexed read with page crossing costs one cycle. If you're just barely hitting lag frames, and your profiler says you're hitting a lot of penalties from indexed reads that cross a page, then you can align variables later. But at that point, it might be better to move your most often accessed variables to zero page, where you save 1 cycle for writing as well.
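To put rough numbers on that, here is a small comparison. The cycle counts are the standard documented 6502 timings. The labels and addresses are invented for the example, and it assumes the assembler picks zero page addressing for a label below $0100 that is defined before use:

```asm
enum $0010                 ; zero page (address chosen arbitrarily)
  zp_actor_x: dsb 16
ende
enum $0300                 ; ordinary RAM (address chosen arbitrarily)
  ram_actor_y: dsb 16
ende

  lda ram_actor_y,x        ; absolute,X read:  4 cycles, 5 if a page is crossed
  sta ram_actor_y,x        ; absolute,X write: 5 cycles, always
  lda zp_actor_x,x         ; zero page,X read:  4 cycles, never a page-cross penalty
  sta zp_actor_x,x         ; zero page,X write: 4 cycles
```

The zero page forms are also one byte shorter each, since the operand is a single byte.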
pwnskar
Posts: 119
Joined: Tue Oct 16, 2018 5:46 am
Location: Gothenburg, Sweden

Re: Best practices for instancing?

Post by pwnskar »

tepples wrote:
pwnskar wrote:One thing I guess I have to make sure is that none of my properties cross a page.
Why? An indexed read with page crossing costs one cycle. If you're just barely hitting lag frames, and your profiler says you're hitting a lot of penalties from indexed reads that cross a page, then you can align variables later. But at that point, it might be better to move your most often accessed variables to zero page, where you save 1 cycle for writing as well.
Oh, I was under the impression that crossing a page while indexing would loop you back to the top of the page?

Code: Select all

ldx #$10
lda $00ff, x    ; = lda $000f
But I guess the case is actually:

Code: Select all

ldx #$10
lda $00ff, x    ; = lda $010f
But that doesn't work with indirect addressing? I remember trying something like this

Code: Select all

; pointer_lo = $ff
; pointer_hi = $00

ldy #$10
lda (pointer_lo), y
and that would result in me getting the value of $000f rather than $010f.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)

Re: Best practices for instancing?

Post by tepples »

Wrapping within a page happens in three cases:
  1. Zero page indexed modes wrap within page $00: ldx #$10 lda $fc,x will access $000C. So does the rarely used indexed indirect mode (dd,X).
  2. Stack operations wrap within page $01.
  3. JMP indirect with the address at $xxFF retrieves the high byte from the same page instead of the next.
Both absolute indexed aaaa,X and indirect indexed (dd),Y will cross pages with a 1-cycle penalty on read. This penalty is applied to all absolute indexed and indirect indexed writes, whether or not they cross a page.

Cycle-by-cycle operations for lda aaaa,X that does not cross pages
  1. Read lda aaaa,X opcode
  2. Read low byte of address
  3. Read high byte of address while adding X to the low byte
  4. Read uncorrected address
Cycle-by-cycle operations for lda aaaa,X that crosses pages
  1. Read lda aaaa,X opcode
  2. Read low byte of address
  3. Read high byte of address while adding X to the low byte
  4. Read uncorrected address, such as $030C for $03FC + $10, while realizing "oh sh-- there's a carry"
  5. Read carry-adjusted address
Cycle-by-cycle operations for lda (dd),Y that does not cross pages
  1. Read lda (dd),Y opcode
  2. Read dd, the zero page address of the pointer
  3. Read low byte of address from zero page
  4. Read high byte of address from zero page while adding Y to the low byte
  5. Read uncorrected address
Cycle-by-cycle operations for lda (dd),Y that crosses pages
  1. Read lda (dd),Y opcode
  2. Read dd, the zero page address of the pointer
  3. Read low byte of address from zero page
  4. Read high byte of address from zero page while adding Y to the low byte
  5. Read uncorrected address while realizing "oh sh-- there's a carry"
  6. Read carry-adjusted address
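Putting the wrapping and page-crossing cases side by side, as a sketch with made-up addresses (ptr is a hypothetical two-byte zero page pointer at $10):

```asm
ptr = $10                ; hypothetical zero page pointer (2 bytes)

  lda #$fc
  sta ptr                ; low byte of $03FC
  lda #$03
  sta ptr+1              ; high byte of $03FC

  ldx #$10
  ldy #$10
  lda $fc,x              ; zero page indexed: wraps, reads $000C (4 cycles)
  lda $03fc,x            ; absolute indexed: crosses, reads $040C (5 cycles)
  lda (ptr),y            ; indirect indexed: crosses, reads $040C (6 cycles)
```

Only the first load wraps. The other two pay the extra cycle and read from the next page.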
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Best practices for instancing?

Post by tokumaru »

pwnskar wrote:One thing I guess I have to make sure is that none of my properties cross a page.
The penalty for crossing a page is just one cycle, not a big deal, but if you absolutely need to avoid page crossing at all costs, the good thing about this method is that different properties don't need to be contiguous in RAM, so you can freely mix the property arrays with other variables in whatever order you want.
pwnskar wrote:For that I guess I would have to use indirect addressing with the y register.
Why? If you start using indirect addressing and pointer manipulation you're basically killing off all the advantages of using parallel arrays.
pwnskar wrote:EDIT: You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page?
Nothing built-in, but you can probably make a macro. Something like this:

Code: Select all

.macro CheckPage Address
  ;if the high byte of the specified address doesn't match the high byte of the current address
  .if >Address != >$
    .error "Page crossed!"
  .endif
.endm
Which you can use like this:

Code: Select all

MyArray .dsb 30
CheckPage MyArray
If you don't want to "CheckPage" after every declaration you can create another macro that will reserve the bytes AND check for page crossing in one go:

Code: Select all

.macro FastArray Size
  Start:  .dsb Size
  CheckPage Start
.endm
Which you use like this:

Code: Select all

MyArray FastArray 30
pwnskar wrote:Oh, I was under the impression that crossing a page while indexing would loop you back to the top of the page?
Only with ZP indexed addressing, but you normally wouldn't have big arrays in ZP anyway.

pwnskar wrote:

Code: Select all

; pointer_lo = $ff
; pointer_hi = $00

ldy #$10
lda (pointer_lo), y
and that would result in me getting the value of $000f rather than $010f.
That's not how it works, you should be getting the value at $010f. Wrapping only occurs in ZP indexed addressing.
pwnskar
Posts: 119
Joined: Tue Oct 16, 2018 5:46 am
Location: Gothenburg, Sweden

Re: Best practices for instancing?

Post by pwnskar »

Thank you so much for the replies, guys! You're straightening out so much of this that I had totally misunderstood. I'm having high hopes of making my code smaller, more efficient and more human readable with this knowledge.

Cheers!
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Re: Best practices for instancing?

Post by tokumaru »

I just realized that my page crossing macro would be off by one when testing arrays, because it's testing the address immediately after the array, but you get the idea.
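A possible fix is to compare against the last byte of the array, i.e. $-1, instead of the address right after it. This is an untested sketch and assumes asm6 accepts the expression >($-1):

```asm
.macro CheckPage Address
  ;compare the page of the array's first byte with the page of its last byte
  .if >Address != >($-1)
    .error "Page crossed!"
  .endif
.endm

MyArray .dsb 30
CheckPage MyArray
```

With this version, an array that ends exactly on the last byte of a page no longer triggers the error.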
thefox
Posts: 3134
Joined: Mon Jan 03, 2005 10:36 am
Location: 🇫🇮

Re: Best practices for instancing?

Post by thefox »

pwnskar wrote:

Code: Select all

lda #PROJECTILES_RAM+#PROJECTILE_STATE, x
OT, but what's up with that second "#"? Does this code compile in your assembler of choice?
pwnskar
Posts: 119
Joined: Tue Oct 16, 2018 5:46 am
Location: Gothenburg, Sweden

Re: Best practices for instancing?

Post by pwnskar »

thefox wrote:
pwnskar wrote:

Code: Select all

lda #PROJECTILES_RAM+#PROJECTILE_STATE, x
OT, but what's up with that second "#"? Does this code compile in your assembler of choice?

It might not. It's not a copy paste from my actual project, just an example I wrote here. Still being quite green, I sometimes forget when the # is needed or not. :)

Cheers!
koitsu
Posts: 4201
Joined: Sun Sep 19, 2004 9:28 pm
Location: A world gone mad

Re: Best practices for instancing?

Post by koitsu »

The # is needed when you want to tell the assembler/6502 to use an immediate value (think "literal number") in the operand itself, rather than an address (in RAM or ROM) to read from or write to. It's not a prefix used to represent the base of a number (e.g. $ for hexadecimal or % for binary), or the "type" of a number on a per-number or per-variable basis. It's better explained in code with comments:

Code: Select all

FOO   = $01
BAR   = $02
UGH   = $0a
DERP  = %11111110
HOBER = 26

lda FOO           ; equivalent to lda $01 -- reads value from RAM location $01 and puts it into the accumulator -- assembles to a5 01
lda #FOO          ; equivalent to lda #$01 -- puts the literal value 1 ($01) into the accumulator -- assembles to a9 01

lda FOO+BAR       ; equivalent to lda $01+$02, i.e. lda $03 -- assembles to a5 03
lda #FOO+BAR      ; equivalent to lda #$01+$02, i.e. lda #$03 -- assembles to a9 03

lda DERP          ; equivalent to lda %11111110 or lda $fe -- assembles to a5 fe
lda #DERP         ; equivalent to lda #%11111110 or lda #$fe -- assembles to a9 fe

lda HOBER         ; equivalent to lda 26 or lda $1a -- assembles to a5 1a
lda #HOBER        ; equivalent to lda #26 or lda #$1a -- assembles to a9 1a

lda HOBER*2       ; equivalent to lda 26*2 or lda 52 or lda $1a*2 or lda $34 -- assembles to a5 34
lda HOBER*100     ; equivalent to lda 26*100 or lda 2600 or lda $1a*100 or lda $0a28 -- assembles to ad 28 0a
lda #HOBER*100    ; equivalent to lda #26*100 or lda #2600 -- will fail to assemble because 2600 is too large; values can only be 0-255 (8-bit)
lda $HOBER        ; undetermined -- behaviour may vary per assembler depending on parser; might do lda $26 or might do lda $1a or might throw an error
lda #$HOBER       ; undetermined -- behaviour may vary per assembler depending on parser; might do lda #$26 or might do lda #$1a or might throw an error

lda #(UGH+16)*8   ; let's expand variables, use decimal rather than hexadecimal, and break the math down:
                  ; ($0a+16)*8 == (10+16)*8 == 26*8 == 208
                  ; thus this is the equivalent to lda #208 or lda #$d0 -- assembles to a9 d0

lda ((UGH+3)*$1000)+10   ; let's expand variables and break the math down piece by piece, in order of operation, and do some base conversion:
                         ; ($0a+3) == (10+3) == 13 == $0d
                         ; ($0d*$1000) == $d000
                         ; $d000+10 == $d00a
                         ; thus this is the equivalent to lda $d00a -- assembles to ad 0a d0

It's important to understand two things here (#1 may help you):

1. The CPU uses completely different instructions/opcodes depending on what addressing mode you're using. For example, lda #$05 uses immediate addressing, which assembles to bytes a9 05. But lda $05 would use zero page addressing and assemble to a5 05 -- note the difference of the first byte! Same goes for absolute addressing, where lda $1234 would assemble to ad 34 12.

2. The mathematics you see above is being done at assemble-time, calculated by the assembler and NOT done at run-time by the CPU. There's a world of difference between the two. Doing mathematics on the 6502 at run-time is a substantially more involved process (simple addition/subtraction is easy, multiplying/dividing by 2/4/8/16/32/64/128 is easy, anything else is much more advanced).
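A small sketch of that difference, reusing HOBER from the code above (some_var is a hypothetical RAM variable, placed at $10 only for the example):

```asm
HOBER = 26
some_var = $10           ; hypothetical RAM variable

  ; Assemble-time: the assembler computes 26*2 = 52 and emits a9 34.
  lda #HOBER*2

  ; Run-time: the CPU does the doubling itself.
  lda some_var           ; value is only known while the program runs
  asl a                  ; shift left once = multiply by 2
```

The first load costs nothing extra at run-time. The second needs an actual instruction for the multiplication.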

Does this help?

One thing that will confuse you: you'll see parentheses () used for mathematical order-of-operation, but you'll also see them in instructions like lda ($16),y. The latter isn't assemble-time mathematics being done by the assembler -- it's real/actual 6502 code using a form of indirect addressing (already discussed). This can add to some confusion as I'm sure you can imagine.

"So how does the assembler know when to use () for instructions and when to use it for math?" It varies per assembler, and you have to read the assembler's documentation to get a feel for it. Some assemblers like NESASM actually use brackets [] to represent the instruction-level addressing mode, and use parentheses purely for assemble-time mathematics. Other assemblers are smart/intelligent enough to know what you want.

If this paragraph is confusing, then I'll make it simple: in most assemblers you'd say lda ($16),y or lda ($16,x) or jmp ($1234) to do indirect addressing, while in NESASM you'd need to write lda [$16],y or lda [$16,x] or jmp [$1234]. NESASM tends to be "the odd man out" in this respect, and this major difference can cause a lot of problems for both newbies and experienced programmers.
pwnskar
Posts: 119
Joined: Tue Oct 16, 2018 5:46 am
Location: Gothenburg, Sweden

Re: Best practices for instancing?

Post by pwnskar »

Thank you for that detailed breakdown, koitsu! Much appreciated! Yeah, for now I'm not doing too much complicated compile time math, mostly just simple adds or subtracts.

BTW, I've started converting my player variables to be stored in parallel and it's gone well so far, just a lot of code to refactor. But I've nearly converted everything concerning the players and will do the same for projectiles and platforms.

I have some routines that do a lot of indirect addressing with pointers that I've not started to convert yet. What I'm doing is:

Code: Select all

ldy #0
lda (pointer_lo), y
sta val1

inc pointer_lo
bne @dont_inc_hi_ptr
inc pointer_hi
@dont_inc_hi_ptr:

lda (pointer_lo), y
sta val2

;... and so on
Whereas from what I can tell, I should be able to just increment y, as long as the offset fits in y (0-255)?

Code: Select all

ldy #0
lda (pointer_lo), y
sta val1

iny
lda (pointer_lo), y
sta val2

;... and so on
I'll do a little test next time I sit down with my project.

Code: Select all

; pointer_lo/pointer_hi hold the address $90ff
ldy #$10
lda (pointer_lo), y
If I've understood correctly, this would give me the same result as:

Code: Select all

lda $910f
If so, this should improve the speed of a lot of my OAM buffering and nametable updates.
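A sketch of what that could look like when copying one sprite's four bytes into an OAM buffer. The buffer address $0200, the routine name and the zero page location of the pointer are assumptions made for the example:

```asm
pointer_lo = $10         ; hypothetical zero page pointer, low byte
pointer_hi = $11         ; high byte must directly follow the low byte
oam_buffer = $0200       ; assumed OAM shadow buffer in RAM

; In: pointer_lo/pointer_hi = address of 4 bytes of sprite data
CopySprite:
  ldy #0
@copy:
  lda (pointer_lo),y     ; source byte at pointer + Y
  sta oam_buffer,y       ; this sketch always fills the first OAM entry
  iny
  cpy #4                 ; one sprite = 4 bytes
  bne @copy
  rts
```

The pointer itself is never modified, so the inc pointer_lo / inc pointer_hi sequence disappears from the loop.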