All times are UTC - 7 hours

PostPosted: Mon Oct 22, 2018 4:47 am

Joined: Tue Oct 16, 2018 5:46 am
Posts: 22
Hey! I'm curious about different techniques to tackle instanced entities on the 6502.

Using asm6 where there are no structs, I've come up with my own solution where I define constants for the index of the properties of my data types:

Code:
PROJECTILE_STATE      = 0
PROJECTILE_POS_X_LO   = 1
PROJECTILE_POS_X_HI   = 2
PROJECTILE_POS_Y_LO   = 3
PROJECTILE_POS_Y_HI   = 4
PROJECTILE_...


I then have a constant defining what page my projectiles will be on and use that to loop through all of them.

Code:
lda #PROJECTILES_RAM+#PROJECTILE_STATE, x
; do some game logic

lda #PROJECTILES_RAM+#PROJECTILE_POS_X_LO, x
; do some other game logic


This approach works for me, but when I want to deal with data types that don't have a whole page reserved for them and might also cross pages, I end up writing a lot of code that indexes through indirect addresses.

Code:
lda #<(player1)
ldx #>(player1)

jsr CheckSomeCollision

...

CheckSomeCollision:

sta curr_pointer_lo
stx curr_pointer_hi

ldy #0

...

lda curr_pointer_lo
clc
adc #PLAYER_POS_X_LO
sta curr_pointer_lo
lda curr_pointer_hi
adc #0
sta curr_pointer_hi

lda (curr_pointer_lo), y
; do some stuff

...

rts


Depending on how complex the logic is, I can end up adding to and subtracting from those curr_pointer values a lot, and I can't help but wonder whether I'm doing it right. This approach must be quite expensive performance-wise?

So another approach I've tried is to have variables dedicated to being parameters or temp values:

Code:
lda player1_state
sta curr_player_state
lda player1_posx_lo
sta curr_player_posx_lo
lda player1_posx_hi
sta curr_player_posx_hi
; I have quite a few variables to do this with

jsr UpdatePlayer

; Now I have to pass the values back to the correct player variables

lda curr_player_state
sta player1_state
lda curr_player_posx_lo
sta player1_posx_lo
lda curr_player_posx_hi
sta player1_posx_hi
; and so on...


This approach works, but I end up basically having to reserve RAM for three players when there are actually only two playable ones in the game. One upside, though, is that I can reuse my "curr_player_" variables for other data types before or after I update my players.

My approaches work, but having no previous experience in assembly, and having failed to find anything covering this subject on Google, I'm really curious about how other people here on the forum tackle this on the 6502.

Cheers!


PostPosted: Mon Oct 22, 2018 5:31 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20789
Location: NE Indiana, USA (NTSC)
The usual solution on 6502 is to make parallel arrays, one for each byte of properties of an entity. Then you can set X to, say, 3 to access all properties of entity ID 3.
Code:
NUM_ACTORS equ 16

actor_xsub: dsb NUM_ACTORS  ; low byte of 24-bit coordinate
actor_x: dsb NUM_ACTORS
actor_xscr: dsb NUM_ACTORS  ; high byte
actor_ysub: dsb NUM_ACTORS
actor_y: dsb NUM_ACTORS
actor_yscr: dsb NUM_ACTORS
actor_facing: dsb NUM_ACTORS
actor_frame: dsb NUM_ACTORS
actor_frame_sub: dsb NUM_ACTORS
actor_health: dsb NUM_ACTORS
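

A minimal sketch of how a loop over these arrays might look, in asm6 syntax and reusing the labels above. Treating actor_health = 0 as "slot unused" is an assumption made for the example, and so is the one-pixel-per-frame movement.

```asm
; Walk every actor slot from the last one down to slot 0.
UpdateActors:
  ldx #NUM_ACTORS-1
@loop:
  lda actor_health,x    ; assumed convention: 0 = empty slot
  beq @next
  inc actor_x,x         ; move one pixel to the right
  bne @next
  inc actor_xscr,x      ; carry into the high byte when actor_x wraps to 0
@next:
  dex
  bpl @loop             ; works as long as NUM_ACTORS <= 128
  rts
```

Every property of the current actor is reachable with the same X, so no pointer has to be set up or adjusted inside the loop.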


PostPosted: Mon Oct 22, 2018 5:45 am

Joined: Mon Apr 04, 2016 3:19 am
Posts: 85
One trick that's quite common is to store your entities' live values in "parallel" rather than as blobs. So instead of having projectile 1's state, x_hi, x_lo, y_hi, y_lo... followed by projectile 2's state, x_hi, x_lo, y_hi, y_lo..., you store it like this:

Code:
    proj_state:               DSB 30
    proj_x_hi:                DSB 30
    proj_x_lo:                DSB 30
    proj_y_hi:                DSB 30
    proj_y_lo:                DSB 30


So here I've allocated for up to 30 projectiles at the same time in my RAM. But instead of having each projectile state bunched together into a blob, they are all mixed up in parallel. To read out one specific projectile's values you use the X register to offset:

Code:
    LDX #5             ; We want to look at the 6th projectile
    LDA proj_x_lo,X    ; Load the 6th projectile's low x position


Thus, when you want to call JSR UpdatePlayer, you just make sure that the X register points to the "slot" of the player you want to update. A huge advantage is that you have much more control over where and how your memory is located, you can make sure you always stay within one page, and you can have up to 256 projectiles before you run into trouble.
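
As a rough sketch of what that could look like for the two-player case from the first post (asm6 syntax; player_x_lo, player_x_hi, the $0300 address and the body of UpdatePlayer are all made-up names and values for illustration):

```asm
NUM_PLAYERS = 2

ENUM $0300                  ; made-up RAM address for the variables
player_x_lo:  DSB NUM_PLAYERS
player_x_hi:  DSB NUM_PLAYERS
ENDE

UpdateAllPlayers:
    LDX #0                  ; slot 0 = player 1
    JSR UpdatePlayer
    LDX #1                  ; slot 1 = player 2
    JSR UpdatePlayer
    RTS

; In: X = player slot. X is left unchanged.
UpdatePlayer:
    LDA player_x_lo,X
    CLC
    ADC #1                  ; move one unit to the right
    STA player_x_lo,X
    LDA player_x_hi,X
    ADC #0                  ; propagate the carry into the high byte
    STA player_x_hi,X
    RTS
```

The routine works on the slot directly, so nothing has to be copied into temporary "current player" variables and back.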

Edit: tepples beat me to it :beer:


PostPosted: Mon Oct 22, 2018 6:04 am

Joined: Tue Oct 16, 2018 5:46 am
Posts: 22
Drakim wrote:
One trick that's quite common is to store your entities' live values in "parallel" rather than as blobs. So instead of having projectile 1's state, x_hi, x_lo, y_hi, y_lo... followed by projectile 2's state, x_hi, x_lo, y_hi, y_lo..., you store it like this:

Code:
    proj_state:               DSB 30
    proj_x_hi:                DSB 30
    proj_x_lo:                DSB 30
    proj_y_hi:                DSB 30
    proj_y_lo:                DSB 30


So here I've allocated for up to 30 projectiles at the same time in my RAM. But instead of having each projectile state bunched together into a blob, they are all mixed up in parallel. To read out one specific projectile's values you use the X register to offset:

Code:
    LDX #5             ; We want to look at the 6th projectile
    LDA proj_x_lo,X    ; Load the 6th projectile's low x position


Thus, when you want to call JSR UpdatePlayer, you just make sure that the X register points to the "slot" of the player you want to update. A huge advantage is that you have much more control over where and how your memory is located, you can make sure you always stay within one page, and you can have up to 256 projectiles before you run into trouble.

Edit: tepples beat me to it :beer:


I think I follow what you're saying, and I'm looking forward to what improvements I can make to my code by storing values in parallel. One thing I guess I have to make sure is that none of my properties cross a page. For that I guess I would have to use indirect addressing with the y register.

The approach you guys described feels very similar to what I'm doing in my first example, but I totally get how it will make things easier to be able to address an entity by its ID rather than something like ldx PLAYER_1_ENTITY_ID*PLAYER_ENTITY_SIZE.

EDIT: You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page? That would be helpful in this case.

Cheers!


PostPosted: Mon Oct 22, 2018 6:38 am

Joined: Mon Apr 04, 2016 3:19 am
Posts: 85
pwnskar wrote:
You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page? That would be helpful in this case.


I don't think so, but it shouldn't be too hard to write an ASM6 macro that does this for you, something like this maybe:

Code:
MACRO CHECKSAMEPAGE InputLabel
  IF >InputLabel != >$
    ERROR "This is not the same page"
  ENDIF
ENDM


Usable like this:

Code:
    MyArray:      DSB 50
    CHECKSAMEPAGE MyArray


PostPosted: Mon Oct 22, 2018 7:02 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20789
Location: NE Indiana, USA (NTSC)
pwnskar wrote:
One thing I guess I have to make sure is that none of my properties cross a page.

Why? An indexed read with page crossing costs one extra cycle. If you're just barely hitting lag frames, and your profiler says you're hitting a lot of penalties from indexed reads that cross a page, then you can align variables later. But at that point, it might be better to move your most often accessed variables to zero page, where you save 1 cycle for writing as well.


PostPosted: Mon Oct 22, 2018 7:30 am

Joined: Tue Oct 16, 2018 5:46 am
Posts: 22
tepples wrote:
pwnskar wrote:
One thing I guess I have to make sure is that none of my properties cross a page.

Why? An indexed read with page crossing costs one extra cycle. If you're just barely hitting lag frames, and your profiler says you're hitting a lot of penalties from indexed reads that cross a page, then you can align variables later. But at that point, it might be better to move your most often accessed variables to zero page, where you save 1 cycle for writing as well.


Oh, I was under the impression that crossing a page while indexing would loop you back to the top of the page?

Code:
ldx #$10
lda $00ff, x    ; = lda $000f


But I guess the case is actually:

Code:
ldx #$10
lda $00ff, x    ; = lda $010f


But that doesn't work with indirect addressing? I remember trying something like this

Code:
; pointer_lo = $ff
; pointer_hi = $00

ldy #$10
lda (pointer_lo), y


and that would result in me getting the value of $000f rather than $010f.


PostPosted: Mon Oct 22, 2018 7:34 am

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 20789
Location: NE Indiana, USA (NTSC)
Wrapping within a page happens in three cases:

  1. Zero page indexed modes wrap within page $00: ldx #$10 lda $fc,x will access $000C. So does the rarely used indexed indirect mode (dd,X).
  2. Stack operations wrap within page $01.
  3. JMP indirect with the address at $xxFF retrieves the high byte from the same page instead of the next.

Both absolute indexed aaaa,X and indirect indexed (dd),Y will cross pages with a 1-cycle penalty on read. This penalty is applied to all absolute indexed and indirect indexed writes, whether or not they cross a page.
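
To make the difference concrete, here is a small fragment (asm6 syntax; ptr is a hypothetical zero-page pointer picked for the example) annotated with the address each load ends up reading:

```asm
ptr = $10             ; hypothetical 2-byte pointer in zero page

  ldx #$10
  lda $fc,x           ; zero page indexed: wraps, reads $000C
  lda $03fc,x         ; absolute indexed: crosses the page, reads $040C

  lda #$fc
  sta ptr             ; pointer low byte
  lda #$03
  sta ptr+1           ; pointer high byte, so ptr holds $03FC
  ldy #$10
  lda (ptr),y         ; indirect indexed: crosses the page, reads $040C
```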

Cycle-by-cycle operations for lda aaaa,X that does not cross pages

  1. Read lda aaaa,X opcode
  2. Read low byte of address
  3. Read high byte of address while adding X to the low byte
  4. Read uncorrected address

Cycle-by-cycle operations for lda aaaa,X that crosses pages

  1. Read lda aaaa,X opcode
  2. Read low byte of address
  3. Read high byte of address while adding X to the low byte
  4. Read uncorrected address, such as $03FC + $10 = $030C, while realizing "oh sh-- there's a carry"
  5. Read carry-adjusted address

Cycle-by-cycle operations for lda (dd),Y that does not cross pages

  1. Read lda (dd),Y opcode
  2. Read zero-page address of the pointer (the operand dd)
  3. Read low byte of address from the pointer
  4. Read high byte of address from the pointer while adding Y to the low byte
  5. Read uncorrected address

Cycle-by-cycle operations for lda (dd),Y that crosses pages

  1. Read lda (dd),Y opcode
  2. Read zero-page address of the pointer (the operand dd)
  3. Read low byte of address from the pointer
  4. Read high byte of address from the pointer while adding Y to the low byte
  5. Read uncorrected address while realizing "oh sh-- there's a carry"
  6. Read carry-adjusted address
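
The totals that fall out of the breakdowns above, written as comments next to each instruction. These are the standard 6502 timings; the labels and addresses are placeholders for the example.

```asm
zp_array  = $20       ; placeholder array in zero page
abs_array = $0300     ; placeholder array outside zero page
ptr       = $10       ; placeholder zero-page pointer

  lda abs_array,x     ; 4 cycles, 5 if the index crosses a page
  sta abs_array,x     ; 5 cycles, always
  lda (ptr),y         ; 5 cycles, 6 if the index crosses a page
  sta (ptr),y         ; 6 cycles, always
  lda zp_array,x      ; 4 cycles, never a penalty (wraps within page $00)
  sta zp_array,x      ; 4 cycles, one less than the absolute indexed write
```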


PostPosted: Mon Oct 22, 2018 7:37 am

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10977
Location: Rio de Janeiro - Brazil
pwnskar wrote:
One thing I guess I have to make sure is that none of my properties cross a page.

The penalty for crossing a page is just one cycle, not a big deal, but if you absolutely need to avoid page crossing at all costs, the good thing about this method is that different properties don't need to be contiguous in RAM, so you can freely mix the property arrays with other variables in whatever order you want.

Quote:
For that I guess I would have to use indirect addressing with the y register.

Why? If you start using indirect addressing and pointer manipulation you're basically killing off all the advantages of using parallel arrays.

Quote:
EDIT: You wouldn't happen to know if asm6 has any functionality to warn you if a labeled variable is crossing a page?

Nothing built-in, but you can probably make a macro. Something like this:

Code:
.macro CheckPage Address
  ;if the high byte of the specified address doesn't match the high byte of the current address
  .if >Address != >$
    .error "Page crossed!"
  .endif
.endm

Which you can use like this:

Code:
MyArray .dsb 30
CheckPage MyArray

If you don't want to "CheckPage" after every declaration you can create another macro that will reserve the bytes AND check for page crossing in one go:

Code:
.macro FastArray Size
  Start:  .dsb Size
  CheckPage Start
.endm

Which you use like this:

Code:
MyArray FastArray 30


pwnskar wrote:
Oh, I was under the impression that crossing a page while indexing would loop you back to the top of the page?

Only with ZP indexed addressing, but you normally wouldn't have big arrays in ZP anyway.

Quote:
Code:
; pointer_lo = $ff
; pointer_hi = $00

ldy #$10
lda (pointer_lo), y


and that would result in me getting the value of $000f rather than $010f.

That's not how it works, you should be getting the value at $010f. Wrapping only occurs in ZP indexed addressing.


PostPosted: Mon Oct 22, 2018 8:05 am

Joined: Tue Oct 16, 2018 5:46 am
Posts: 22
Thank you so much for the replies, guys! You're straightening out so much of this that I had totally misunderstood. I'm having high hopes of making my code smaller, more efficient and more human readable with this knowledge.

Cheers!


PostPosted: Mon Oct 22, 2018 8:20 am

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10977
Location: Rio de Janeiro - Brazil
I just realized that my page crossing macro would be off by one when testing arrays, because it's testing the address immediately after the array, but you get the idea.
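
One possible fix, as an untested sketch in asm6 syntax: compare against the last byte of the array ($-1) instead of the address right after it. CheckArrayPage is a made-up name, and this assumes asm6 accepts $-1 inside parentheses the same way it accepts a bare $.

```asm
MACRO CheckArrayPage Address
  ; $ is the address right after the array, so $-1 is its last byte
  IF >Address != >($-1)
    ERROR "Array crosses a page!"
  ENDIF
ENDM

MyArray: DSB 30
CheckArrayPage MyArray
```

With this version an array that ends exactly on the last byte of a page no longer triggers a false error.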


PostPosted: Thu Oct 25, 2018 10:09 am

Joined: Mon Jan 03, 2005 10:36 am
Posts: 3138
Location: Tampere, Finland
pwnskar wrote:
Code:
lda #PROJECTILES_RAM+#PROJECTILE_STATE, x

OT, but what's up with that second "#". Does this code compile in your assembler of choice?
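
For reference, a sketch of how that line would usually be written for absolute indexed addressing, with no "#" at all (the address given to PROJECTILES_RAM here is made up for the example):

```asm
PROJECTILES_RAM  = $0300  ; made-up address for the example
PROJECTILE_STATE = 0

  lda PROJECTILES_RAM+PROJECTILE_STATE,x  ; state byte of the projectile at offset X
```

The addition happens at assemble time, so the CPU only ever sees lda $0300,x.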



PostPosted: Fri Oct 26, 2018 12:30 am

Joined: Tue Oct 16, 2018 5:46 am
Posts: 22
thefox wrote:
pwnskar wrote:
Code:
lda #PROJECTILES_RAM+#PROJECTILE_STATE, x

OT, but what's up with that second "#". Does this code compile in your assembler of choice?


It might not. It's not a copy paste from my actual project, just an example I wrote here. Still being quite green, I sometimes forget when the # is needed or not. :)

Cheers!


PostPosted: Fri Oct 26, 2018 1:38 am

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3694
Location: Mountain View, CA
The # is needed when you want to tell the assembler/6502 to use an immediate value (think "literal number") in the operand itself, rather than "an address (in RAM or ROM) to read/write to/from". It's not a prefix used to represent base of numbers (e.g. $ for hexadecimal or % for binary), or "types" of numbers on a per-number or per-variable basis. Better explained in code with comments:

Code:
FOO   = $01
BAR   = $02
UGH   = $0a
DERP  = %11111110
HOBER = 26

lda FOO           ; equivalent to lda $01 -- reads value from RAM location $01 and puts it into the accumulator -- assembles to a5 01
lda #FOO          ; equivalent to lda #$01 -- puts the literal value 1 ($01) into the accumulator -- assembles to a9 01

lda FOO+BAR       ; equivalent to lda $01+$02, i.e. lda $03 -- assembles to a5 03
lda #FOO+BAR      ; equivalent to lda #$01+$02, i.e. lda #$03 -- assembles to a9 03

lda DERP          ; equivalent to lda %11111110 or lda $fe -- assembles to a5 fe
lda #DERP         ; equivalent to lda #%11111110 or lda #$fe -- assembles to a9 fe

lda HOBER         ; equivalent to lda 26 or lda $1a -- assembles to a5 1a
lda #HOBER        ; equivalent to lda #26 or lda #$1a -- assembles to a9 1a

lda HOBER*2       ; equivalent to lda 26*2 or lda 52 or lda $1a*2 or lda $34 -- assembles to a5 34
lda HOBER*100     ; equivalent to lda 26*100 or lda 2600 or lda $1a*100 or lda $0a28 -- assembles to ad 28 0a
lda #HOBER*100    ; equivalent to lda #26*100 or lda #2600 -- will fail to assemble because 2600 is too large; values can only be 0-255 (8-bit)
lda $HOBER        ; undetermined -- behaviour may vary per assembler depending on parser; might do lda $26 or might do lda $1a or might throw an error
lda #$HOBER       ; undetermined -- behaviour may vary per assembler depending on parser; might do lda #$26 or might do lda #$1a or might throw an error

lda #(UGH+16)*8   ; let's expand variables, use decimal rather than hexadecimal, and break the math down:
                  ; ($0a+16)*8 == (10+16)*8 == 26*8 == 208
                  ; thus this is the equivalent to lda #208 or lda #$d0 -- assembles to a9 d0

lda ((UGH+3)*$1000)+10   ; let's expand variables and break the math down piece by piece, in order of operation, and do some base conversion:
                         ; ($0a+3) == (10+3) == 13 == $0d
                         ; ($0d*$1000) == $d000
                         ; $d000+10 == $d00a
                         ; thus this is the equivalent to lda $d00a -- assembles to ad 0a d0


It's important to understand two things here (#1 may help you):

1. The CPU uses completely different instructions/opcodes depending on what addressing mode you're using. For example, lda #$05 uses immediate addressing, which assembles to bytes a9 05. But lda $05 would use zero page addressing and assemble to a5 05 -- note the difference of the first byte! Same goes for absolute addressing, where lda $1234 would assemble to ad 34 12.

2. The mathematics you see above is being done at assemble-time, calculated by the assembler and NOT done at run-time by the CPU. There's a world of difference between the two. Doing mathematics on the 6502 at run-time is a substantially more involved process (simple addition/subtraction is easy, multiplying/dividing by 2/4/8/16/32/64/128 is easy, anything else is much more advanced).

Does this help?

One thing that will confuse you: you'll see parentheses () used for mathematical order-of-operation, but you'll also see them in instructions like lda ($16),y. The latter isn't assemble-time mathematics being done by the assembler -- it's real/actual 6502 code using a form of indirect addressing (already discussed). This can add to some confusion as I'm sure you can imagine.

"So how does the assembler know when to use () for instructions and when to use it for math?" It varies per assembler, and you have to read the assembler's documentation to get a feel for it. Some assemblers like NESASM actually use brackets [] to represent the instruction-level addressing mode, and use parentheses purely for assemble-time mathematics. Other assemblers are smart/intelligent enough to know what you want.

If this paragraph is confusing, then I'll make it simple: in most assemblers you'd say lda ($16),y or lda ($16,x) or jmp ($1234) to do indirect addressing, while in NESASM you'd need to write lda [$16],y or lda [$16,x] or jmp [$1234]. NESASM tends to be "the odd man out" in this respect, and this major difference can cause a lot of problems for both newbies and experienced programmers.


PostPosted: Fri Oct 26, 2018 3:53 am

Joined: Tue Oct 16, 2018 5:46 am
Posts: 22
Thank you for that detailed breakdown, koitsu! Much appreciated! Yeah, for now I'm not doing too much complicated compile time math, mostly just simple adds or subtracts.

BTW, I've started converting my player variables to be stored in parallel and it's gone well so far, just a lot of code to refactor. But I've nearly converted everything concerning the players and will do the same for projectiles and platforms.

I have some routines that do a lot of indirect addressing with pointers that I've not started to convert yet. What I'm doing is:

Code:
ldy #0
lda (pointer_lo), y
sta val1

inc pointer_lo
bne @dont_inc_hi_ptr
inc pointer_hi
@dont_inc_hi_ptr:

lda (pointer_lo), y
sta val2

;... and so on


Whereas from what I can tell I should be able to just increment y, as long as y never exceeds 255?

Code:
ldy #0
lda (pointer_lo), y
sta val1

iny
lda (pointer_lo), y
sta val2

;... and so on


I'll do a little test next time I sit down with my project.

Code:
ldy #$10
lda (pointer_lo), y   ; pointer_lo/pointer_hi hold $ff/$90, i.e. they point at $90ff


If I've understood correctly, this would give me the same result as:

Code:
lda $910f


If so, this should improve the speed of a lot of my OAM buffering and nametable updates.
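
Along the same lines, an untested sketch of reading fields through a pointer by putting the field offset in Y, using offset constants like the ones in the first post, so the pointer itself never has to be modified. The PLAYER_* values and the zero-page addresses here are assumptions for the example:

```asm
PLAYER_POS_X_LO = 1       ; assumed field offsets
PLAYER_POS_X_HI = 2
pointer_lo      = $10     ; assumed zero-page pointer (low byte, high byte at $11)
val1            = $12
val2            = $13

  ldy #PLAYER_POS_X_LO
  lda (pointer_lo),y      ; reads the byte at pointer + 1
  sta val1
  ldy #PLAYER_POS_X_HI
  lda (pointer_lo),y      ; reads the byte at pointer + 2
  sta val2
```

This replaces the 16-bit add on the pointer with a single ldy per field.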

