All times are UTC - 7 hours





Posted: Mon Dec 05, 2016 1:19 am
Joined: Mon Dec 14, 2015 2:58 am
Posts: 20
I probably misunderstood that. I was just reading the ASM6 readme, and it has this:
Code:
        Repeat a block of code a specified number of times.
        Labels defined inside REPT are local.

                i=0
                REPT 256
                    DB i
                    i=i+1
                ENDR

So I thought DB i was required for the loop to work properly. Without it, the program didn't display any sprites at all.


Posted: Mon Dec 05, 2016 2:34 am
Formerly WheelInventor

Joined: Thu Apr 14, 2016 2:55 am
Posts: 1017
Location: Gothenburg, Sweden
AFAIK (correct me if I'm wrong), the .rept directive isn't meant for runtime loops, because it repeats your code at assembly time. Your assembled code will get very long, since .rept hard-codes every repetition for you. To keep things practical in the long run, you need to write a for-next style loop in asm.

Load a value into a register (number of iterations),
set up a label
do stuff here
decrease that register
branch to label if not equal
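
As a rough sketch of those steps in ASM6 syntax (the label name, the count of 8, and the $0200 destination are placeholders made up for illustration, not taken from your code):
Code:
  LDA #$00
  LDX #$08        ; load the number of iterations into X
ClearLoop:        ; set up a label
  STA $01FF, x    ; do stuff: X runs from 8 down to 1, so this writes $0207..$0200
  DEX             ; decrease the register
  BNE ClearLoop   ; branch back to the label while X is not zero

This assembles to a handful of bytes no matter how many iterations you ask for, whereas .rept would emit the body once per repetition.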

Edit: OK, verified. Here's an illustrative example of what happens if you assemble something simple like this nonsense code:
Code:
i=0
.rept 256
lda #45
sta $4012
i=i+1
.endr


It turns out like the attached file.


Attachment: ohmy.asm [26.15 KiB] (disassembled binary included)

_________________
http://www.frankengraphics.com - personal NES blog
Posted: Mon Dec 05, 2016 4:27 am

Joined: Mon Dec 14, 2015 2:58 am
Posts: 20
What should I do to make loading metasprites as efficient as possible, then? Is that first version with four INYs the best, or should I just stick to manual LDAs and STAs for every sprite, without any kind of "automation"?


Posted: Mon Dec 05, 2016 4:49 am

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
WheelInventor wrote:
AFAIK (correct me if I'm wrong), the .rept directive isn't meant for doing runtime loops, because it repeats your code at assembly time. Your assembled code will get very long as .rept will hard-code every repetition for you.

This is correct. A better way to describe it is: .rept/.endr allows you to "repeat generation of assembly code". E.g.
Code:
lda #$00
i=$1f
.rept 4
sta $0200+i
i=i+1
.endr

Would automatically generate the following code for you:
Code:
lda #$00
sta $021f
sta $0220
sta $0221
sta $0222

If you aren't sure what will be generated and want to see the results, use asm6's -l (lowercase-ELL) or -L (uppercase-ELL) flag and specify a listing file as the last argument, e.g. asm6 -L mygame.asm mygame.nes mygame.lst. Then open the listing file in a text editor. You'll see the code it generated for you. -L is a more verbose version of -l. Refer to asm6's documentation for all of this.


Posted: Mon Dec 05, 2016 6:11 am

Joined: Mon Jan 03, 2005 10:36 am
Posts: 2981
Location: Tampere, Finland
.rept is fine if you have a low number of repetitions (really, it's fine even with a high number if you can afford the extra ROM space). However, it's a good idea to calculate how much benefit you'll actually get with different unrolling factors to find a good size/speed tradeoff. (For example, if your loop body is large or slow, you'll get next to no benefit from unrolling, since all it does is remove the branching overhead.)
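
To put rough numbers on that, here is a small comparison, assuming standard 6502 cycle counts and a branch that doesn't cross a page boundary (the $0200 target and the count of 4 are only for illustration):
Code:
; Looped version: 8 bytes, 2 + 3*10 + 9 = 41 cycles
  LDX #$03        ; 2 cycles
-
  STA $0200, x    ; 5 cycles
  DEX             ; 2 cycles
  BPL -           ; 3 cycles when taken, 2 on the final pass

; Unrolled version: 12 bytes, 4*4 = 16 cycles
i=0
.rept 4
  STA $0200+i     ; 4 cycles each
  i=i+1
.endr

With a body this small the loop overhead dominates, so unrolling pays off. If the body took, say, 100 cycles, the same few cycles of DEX/BPL overhead per pass would hardly matter.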

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: kkfos.aspekt.fi


Posted: Mon Dec 05, 2016 8:14 am
Formerly WheelInventor

Joined: Thu Apr 14, 2016 2:55 am
Posts: 1017
Location: Gothenburg, Sweden
Quote:
What should I do to make loading metasprites the most efficient then?

Apologies, I think I may have caused more confusion than clarity; I shouldn't write when in a hurry. Unrolling may be the most efficient way if your function needs to execute quickly and you have the space to spare. .rept may be the most convenient way of doing it, depending on personal preference.

I think bogax's example is neat too. You can try both on for size, and if you don't notice a difference in performance, pick whichever is clearest to you.

That DB i in the example shouldn't go in your code at all, partly because it serves no purpose there, and partly because it lays down a data rock in the middle of the program road. The CPU will execute it as an instruction, because it has just finished the previous instruction and is expecting a new opcode. Let's say DB i lays down a $01 there, for example. It would execute as an ORA (bitwise OR with A) and take the next byte of the program as its operand. From there, everything is likely to be decoded wrong, because the byte after that might have been meant as an operand but is now treated as an opcode, and so on. Or if it lays down a $00, it will execute as a BRK, which seems to be the case here (the first time around).
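
To make that concrete, here is a made-up fragment with the bytes it assembles to (standard 6502 opcode values; the $01 data byte is just an example):
Code:
  LDA #$45      ; A9 45
  .db $01       ; 01        <- data byte dropped between two instructions
  STA $4012     ; 8D 12 40

; What the CPU actually decodes when it runs through those bytes:
;   A9 45       LDA #$45
;   01 8D       ORA ($8D,x)   <- the data byte became an opcode and swallowed $8D
;   12 ...      $12 is not a documented 6502 instruction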

Posted: Tue Dec 06, 2016 2:58 pm

Joined: Mon Dec 14, 2015 2:58 am
Posts: 20
Almost there. The only problem is that for some reason it always loads tile $00 into $201. The rest is fine.
Code:
LoadYacaBackFrame1:
  LDX $00
  -
  LDY yacaloadtable, x
  LDA yacaback1, x
  STA $201, y
  INX
  CPX $08
  BNE -
  RTS

yacaback1:
  .db $02, %00000000, $02, %01000000, $12, %00000000,$13, %00000000
yacaloadtable:
  .db $00, $01, $04, $05, $08, $09, $0C, $0D


Posted: Tue Dec 06, 2016 4:15 pm

Joined: Sun Sep 19, 2004 9:28 pm
Posts: 3192
Location: Mountain View, CA, USA
Are you sure LDX $00 and CPX $08 shouldn't be LDX #$00 and CPX #$08? I would think so given the routine, but maybe you're storing some details in ZP locations $00 and $08 and it just isn't obvious from the code.
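
A quick side-by-side of the two addressing modes, in case it helps (nothing here is specific to your project):
Code:
  LDX #$00      ; immediate: X = the number 0
  LDX $00       ; zero page: X = whatever byte is stored at address $0000

  CPX #$08      ; immediate: compare X with the number 8
  CPX $08       ; zero page: compare X with the byte stored at address $0008

Without the #, the loop's start and end values depend on whatever happens to be in RAM at those addresses.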


Posted: Wed Dec 07, 2016 2:26 am

Joined: Mon Dec 14, 2015 2:58 am
Posts: 20
Yes, thank you. I was too tired yesterday to see that stupid mistake.


Posted: Wed Mar 15, 2017 3:27 am

Joined: Mon Dec 14, 2015 2:58 am
Posts: 20
All right, the result of my efforts is starting to resemble a game. I switched to ASM6. There is some stuff I'd like to ask about.
1. My code for checking collisions.
Code:
 CheckPickupCollision:
  LDA $204
  SEC
  SBC #$08
  CMP PICKUP_RAM
  BEQ +
  BCC +
  JMP CheckPickupCollisionRTS
  +
  LDA $210
  CLC
  ADC #$08
  CMP PICKUP_RAM
  BCS +
  JMP CheckPickupCollisionRTS
  +
  LDA $203
  CMP PICKUP_RAM + 3
  BEQ +
  BCC +
  JMP CheckPickupCollisionRTS
  +
  LDA $20B
  CLC
  ADC #$08
  CMP PICKUP_RAM + 3
  BCS +
  JMP CheckPickupCollisionRTS
  +

Pickup1CollidesPlayer1:
  LDA #$FF
  STA PICKUP_RAM
  STA PICKUP_RAM + 3

  LDA #BEAM
  STA P1_SPECIAL_TYPE
  LDA #$03
  STA P1_SPECIAL_AMOUNT

CheckPickupCollisionRTS:
  RTS

Is there any better way to do this? Will it hurt the performance if I loop through every enemy, bullet etc. every frame if done that way?

2. My idea for the game engine.
I want to make the game run in a way that utilizes a stage timer and runs certain "patterns" of enemies in the right moment.

Code:
BasicEnemyPattern01:
  LDX PATTERN_EXECUTION_DELAY
  CPX #$00
  BNE BasicEnemyPattern01_Execute_Prepare_Loop

  LDX PATTERN_EXECUTION_IND1
  CPX #$14
  BEQ BasicEnemyPattern01_Execute_Prepare_Loop

BasicEnemyPattern01_Load:
  LDY basicenemyloadtable, x

  LDA #$21
  STA BASIC_ENEMY_RAM + 1, y
  LDA #%00000001
  STA BASIC_ENEMY_RAM + 2, y

  LDA #$00
  STA BASIC_ENEMY_RAM + 3, y
  LDA #$04
  STA BASIC_ENEMY_RAM, y

  INX

  LDA #$22
  STA BASIC_ENEMY_RAM + 5, y
  LDA #%00000001
  STA BASIC_ENEMY_RAM + 6, y

  LDA #$08
  STA BASIC_ENEMY_RAM + 7, y
  LDA #$04
  STA BASIC_ENEMY_RAM + 4, y
 
  INX
  STX PATTERN_EXECUTION_IND1
  LDA #$20
  STA PATTERN_EXECUTION_DELAY

BasicEnemyPattern01_Execute_Prepare_Loop:
  LDX #$00
BasicEnemyPattern01_Check_Loop_Conditions:
  LDY basicenemyloadtable, x

  LDA BASIC_ENEMY_RAM + 1, y
  CMP #$21
  BEQ BasicEnemyPattern01_Loop_Section1
  CMP #$23
  BNE BasicEnemyPattern01_INC_Register


BasicEnemyPattern01_Loop_Section1:
  LDA BASIC_ENEMY_RAM + 3, y
  CLC
  ADC #$01
  STA BASIC_ENEMY_RAM + 3, y

  LDA BASIC_ENEMY_RAM, y
  CLC
  ADC #$02
  STA BASIC_ENEMY_RAM, y
   
  LDA ANIM_TIMER_2
  CMP #$00
  BNE BasicEnemyPattern01_Loop_Section2
  LDA BASIC_ENEMY_RAM + 1, y
  CMP #$21
  BEQ ShowBasicEnemyFrame2
  CMP #$23
  BNE BasicEnemyPattern01_Loop_Section2

ShowBasicEnemyFrame1:
  LDA #$21
  STA BASIC_ENEMY_RAM + 1, y
  LDA #$00
  STA CURRENT_BASIC_ENEMY_FRAME
  JMP BasicEnemyPattern01_Loop_Section2
ShowBasicEnemyFrame2:
  LDA #$23
  STA BASIC_ENEMY_RAM + 1, y
  LDA #$01
  STA CURRENT_BASIC_ENEMY_FRAME

BasicEnemyPattern01_Loop_Section2:
  INY
  INY
  INY
  INY
   
  LDA ANIM_TIMER_2
  CMP #$00
  BNE BasicEnemyPattern01_Loop_Section3
  LDA CURRENT_BASIC_ENEMY_FRAME
  CMP #$00
  BNE +
  LDA #$22
  STA BASIC_ENEMY_RAM + 1, y
  JMP BasicEnemyPattern01_Loop_Section3
  +
  CMP #$01
  BNE +
  LDA #$24
  STA BASIC_ENEMY_RAM + 1, y
  +

BasicEnemyPattern01_Loop_Section3:
  LDA BASIC_ENEMY_RAM + 3, y
  CLC
  ADC #$01
  STA BASIC_ENEMY_RAM + 3, y

  LDA BASIC_ENEMY_RAM, y
  CLC
  ADC #$02
  STA BASIC_ENEMY_RAM, y


BasicEnemyPattern01_INC_Register:
  INX
  CPX #$14
  BNE BasicEnemyPattern01_Check_Loop_Conditions

BasicEnemyPattern01_End:
  RTS


What this block of code basically does:
*loads an enemy made of 2 tiles every 20 frames in the top left corner, up to 10 enemies
*loops through their memory locations to see if the enemy is already loaded
*moves the sprites of loaded enemies and does a simple 2-frame animation of them

Same question here: can something like this hurt performance, or should I only worry once the game actually starts to lag?

3. Tile number 0 in the top left corner.
I have no idea where it comes from. I commented out all of my own subroutines and it's still there. I copied the init code from Nerdy Nights, so maybe that's where the problem is?

4. Is there a chance of a register value being modified when 2 threads are running at once (separate engine loop and NMI code structure)?

This code handles player animation during the NMI:

Code:
P1_Idle_Animation:
  LDA P1_CURRENT_FRAME
  CMP #$00
  BNE +
  JSR Player1_LoadSprite_Idle1
  +
  LDA P1_CURRENT_FRAME
  CMP #$01
  BNE +
  JSR Player1_LoadSprite_Idle2
  LDA P1_CURRENT_FRAME
  +
  LDA P1_CURRENT_FRAME
  CMP #$02
  BNE +
  JSR Player1_LoadSprite_Idle3
  +
  LDA P1_CURRENT_FRAME
  CMP #$03
  BNE +
  JSR Player1_LoadSprite_Idle4
  +
  LDA P1_CURRENT_FRAME
  CMP #$04
  BNE +
  JSR Player1_LoadSprite_Idle5
  +
  LDA P1_CURRENT_FRAME
  CMP #$05
  BNE +
  JSR Player1_LoadSprite_Idle6and8
  +
  LDA P1_CURRENT_FRAME
  CMP #$06
  BNE +
  JSR Player1_LoadSprite_Idle7
  +
  LDA P1_CURRENT_FRAME
  CMP #$07
  BNE +
  JSR Player1_LoadSprite_Idle6and8
  +
  LDA P1_CURRENT_FRAME
  CMP #$08
  BNE +
  JSR Player1_LoadSprite_Idle9
  +
  LDA P1_CURRENT_FRAME
  CMP #$09
  BNE +
  JSR Player1_LoadSprite_Idle10
  +
  LDA P1_CURRENT_FRAME
  CMP #$0A
  BNE +
  JSR Player1_LoadSprite_Idle11
  +
  LDA P1_CURRENT_FRAME
  CMP #$0B
  BNE +
  JSR Player1_LoadSprite_Idle12
  +
  RTS

Are all these LDA P1_CURRENT_FRAME really necessary, or should I just load the value once at the beginning?

5. Displaying text.
Are there any good examples of displaying text in ASM6? How do I create a table mapping letters to certain tiles?

6. Keeping score larger than 255.
I wonder how you guys do it. My idea is like this:
Let's say the score is made of 3 digits (CBA) and a player gets one point for every enemy killed. Then:

if A >= 10
A = A - 10
B = B + 1

if B >= 10
B = B - 10
C = C + 1

Good or bad idea?



7. If I want to set up 10 memory locations for enemy hitpoints should I just use
BASIC_ENEMY_HP .dsb 10
and then address it like
LDA BASIC_ENEMY_HP
LDA BASIC_ENEMY_HP + 1
LDA BASIC_ENEMY_HP + 2, etc.?

Thanks in advance.


Attachment: dbis_engine.bin [48.02 KiB]

Posted: Wed Mar 15, 2017 4:51 am

Joined: Fri May 08, 2015 7:17 pm
Posts: 1828
Location: DIGDUG
Quote:
Is there any better way to do this?


Yes.

I don't have time to go over your entire code, but I feel like your collision code needs to be scrapped and rewritten.

Let's talk about ASM code in general.

Code:
LDA $204
  SEC
  SBC #$08
  CMP PICKUP_RAM
  BEQ +
  BCC +
  JMP CheckPickupCollisionRTS
  +


The only thing at CheckPickupCollisionRTS is an RTS.
This jump can be replaced with a plain 'RTS', which saves 2 bytes of ROM space and is faster, as you don't have to JMP first.

What happens if $204 (I'm assuming your character's Y position) is less than 8? The subtraction wraps around to a value near $FF and your code breaks down.

You are doing 2 comparisons here, because you are asking: is A <= B? BEQ for =, BCC for <. This can be done in 1 comparison if you reverse the terms and ask: is B >= A? That only needs a BCS. Such as...

Code:
LDA PICKUP_RAM
CMP $204
BCS +
RTS
+


You shouldn't use OAM RAM as permanent definition of a character. It makes your code inflexible, and sub-pixel movement impossible.

You can take 2 approaches for a sub-routine like this. 1. You can make a generic routine that can handle every kind of collision by any 2 objects (slow) or 2. Make a specific one for the most used object (player) and the most common sized object to collide with (16x16 or 8x8).

Or you can have multiple similar subroutines.

I'll take a look at the rest of your code later.

_________________
nesdoug.com -- blog/tutorial on programming for the NES


Posted: Wed Mar 15, 2017 9:24 am

Joined: Fri May 08, 2015 7:17 pm
Posts: 1828
Location: DIGDUG
Quote:
3. Tile number 0 in the top left corner.


Look at the OAM mirror at $200.

Attachment: DBIS.png [10.12 KiB]


The second half of this page is filled with zero. Each of those (4 byte chunks) is directing the OAM to draw a sprite at position 0,0, using tile 0 and palette 0.

I like to set Y position of unused sprites somewhere between $f0 and $ff (off the bottom of the screen)...

ff 00 00 00 ff 00 00 00 etc.
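
One way to do that at startup, as a minimal sketch assuming the OAM buffer lives at $0200-$02FF as in this thread (HideAllSprites is a made-up name):
Code:
HideAllSprites:
  LDA #$FF
  LDX #$00
-
  STA $0200, x    ; byte 0 of each 4-byte entry is the sprite's Y position
  INX
  INX
  INX
  INX             ; step to the next sprite entry
  BNE -           ; X wraps back to 0 after all 64 entries
  RTS

Run it before you load any real sprites, since it moves every entry off screen.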

Quote:
if A >= 10
A = A - 10
B = B + 1

if B >= 10
B = B - 10
C = C + 1

Good or bad idea?


Seems good to me.
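
For what it's worth, a sketch of that carry scheme in ASM6 syntax could look like this. SCORE_A, SCORE_B, SCORE_C, AddOnePoint, and the $0010 address are hypothetical names chosen for the example, and each variable is assumed to hold a single digit 0-9:
Code:
.enum $0010
SCORE_A .dsb 1    ; ones
SCORE_B .dsb 1    ; tens
SCORE_C .dsb 1    ; hundreds
.ende

AddOnePoint:
  LDA SCORE_A
  CLC
  ADC #1
  CMP #10
  BCC StoreA      ; A < 10: no carry into B
  SBC #10         ; carry is set after the CMP, so this subtracts exactly 10
  INC SCORE_B
StoreA:
  STA SCORE_A

  LDA SCORE_B
  CMP #10
  BCC ScoreDone   ; B < 10: no carry into C
  SBC #10
  STA SCORE_B
  INC SCORE_C
ScoreDone:
  RTS

A side benefit of keeping one digit per byte is that each digit can be turned into a tile number for display without any division.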

Posted: Wed Mar 15, 2017 1:52 pm

Joined: Wed Nov 30, 2016 4:45 pm
Posts: 93
Location: Southern California
Quote:
Are all these LDA P1_CURRENT_FRAME really necessary or should I just load the value at the beginning only?

One way to do it is to load it once and decrement it each time. The 65c02 (CMOS) has a DEA (decrement accumulator) instruction; but if you're using an NMOS 6502, you'll have to use X or Y, then do for example:
Code:
P1_Idle_Animation:
  LDY P1_CURRENT_FRAME
  BNE +
  JSR Player1_LoadSprite_Idle1    ; Look out!  The RTS will come back here to continue
                                  ; the comparisons, which I doubt is what you wanted.
  +
  DEY
  BNE +
  JSR Player1_LoadSprite_Idle2
  +
  DEY
  BNE +
  JSR Player1_LoadSprite_Idle3
  +
      <etc.>

Note that there's an automatic, implied compare-to-zero built into LDA, LDX, LDY, INC, INX, INY, DEC, DEX, DEY, INA, DEA, AND, ORA, EOR, ASL, LSR, ROL, ROR, PLA, PLX, PLY, SBC, ADC, TAX, TXA, TAY, TYA, and TSX (INA, DEA, PLX, and PLY are 65c02-only). This means that, for example, a CMP #0 after an LDA or DEA is redundant, a wasted instruction, as is CPY #0 after LDY or DEY, or CPX #0 after LDX or DEX. There are many places in your code where you can take advantage of this to shorten things.
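
As a small before/after, using a snippet from the enemy pattern code posted earlier in the thread:
Code:
  ; Before: explicit compare against zero
  LDA ANIM_TIMER_2
  CMP #$00
  BNE BasicEnemyPattern01_Loop_Section2

  ; After: LDA has already set the Z flag from the loaded value
  LDA ANIM_TIMER_2
  BNE BasicEnemyPattern01_Loop_Section2

The one thing to watch for is that CMP also sets the carry flag, so this only applies where nothing afterwards depends on that carry.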

The decrementing above means that the first time, you'll be testing for 0; the next time for 1, since the 1 gets decremented to 0 and the BNE is not taken but instead the JSR is (heed my note above about the JSR though! Where do you want it to go when the subroutine is finished?); the next time for 2, since now it has been decremented twice to 0; and so on.

A more efficient way to do it, however, since you have a succession of numbers to test for, is to use a jump table. Then you don't have to test at all. (I suspect you wanted a JMP, not a JSR, as I put in the comment beside the code above.) The CMOS 6502 (65c02) has a JMP (abs,X) instruction. If you're limited to an NMOS 6502, which doesn't have that, you can still synthesize it with the stack, this way:
Code:
        LDA  P1_CURRENT_FRAME
        ASL  A
        TAX              ; Start with function (an even number) in X.
        LDA  TABLE+1,X   ; Read high address byte from the actual table, and
        PHA              ; push it.  Low byte comes next, below.
        LDA  TABLE,X     ; Be sure to make the table reflect start addresses
        PHA              ; minus 1, since RTS increments the address by 1.
        RTS              ; RTS does the absolute indexed indirect jump.

Then TABLE will have 24 bytes, two for each of your twelve routine addresses. Note that the RTS here is being used as a jump, not as a return. ASM6 is not one of the assemblers I'm familiar with; but perhaps it would let you make a macro of the above routine, to get it on a single line. (You can read about the differences between the CMOS and NMOS 6502 in this article.)
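
For reference, a table for the twelve routines named earlier in the thread might be laid out like this (assuming ASM6's .dw, which stores the low byte first, matching the TABLE / TABLE+1 reads above):
Code:
TABLE:
  .dw Player1_LoadSprite_Idle1-1
  .dw Player1_LoadSprite_Idle2-1
  .dw Player1_LoadSprite_Idle3-1
  .dw Player1_LoadSprite_Idle4-1
  .dw Player1_LoadSprite_Idle5-1
  .dw Player1_LoadSprite_Idle6and8-1
  .dw Player1_LoadSprite_Idle7-1
  .dw Player1_LoadSprite_Idle6and8-1
  .dw Player1_LoadSprite_Idle9-1
  .dw Player1_LoadSprite_Idle10-1
  .dw Player1_LoadSprite_Idle11-1
  .dw Player1_LoadSprite_Idle12-1

The dispatch code doesn't range-check P1_CURRENT_FRAME, so this assumes it always stays between $00 and $0B.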

_________________
http://WilsonMinesCo.com/ lots of 6502 resources


Posted: Wed Mar 15, 2017 3:27 pm

Joined: Fri May 08, 2015 7:17 pm
Posts: 1828
Location: DIGDUG
Why are we talking about 65c02?

The NES uses a Ricoh 2A03, which is essentially a copy of the MOS 6502 chip, except that it lacks decimal mode.

Let's not confuse new programmers.

(I believe the TurboGrafx-16 / PC Engine used a 65c02 processor.)

Posted: Wed Mar 15, 2017 5:40 pm

Joined: Tue May 28, 2013 5:49 am
Posts: 874
Location: Sweden
Yes, the HuC6280 in the PC Engine/TurboGrafx is Hudson's improvement of the 65C02 (or is it the 65SC02?). It doesn't have the WAI and STP instructions that certain variations of the 65C02 have, but it has a number of other improvements. Some of the new instructions that Hudson added are very PC Engine specific though, like the instructions that quickly write immediate values to $0000, $0001 and $0002, which are commonly used hardware registers on the PC Engine (zero page is strangely at $2000 on the HuC6280, which makes assemblers like ca65 not work very well for it).

Sorry to derail...


Last edited by Pokun on Fri Mar 17, 2017 12:32 pm, edited 1 time in total.
