All times are UTC - 7 hours

 Post subject: 2bpp bullet rendering
PostPosted: Tue Feb 09, 2016 10:28 pm 
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
I have an idea for rendering bullets for a bullet hell shmup. Store 2bpp bullet sprite patterns and their transparency masks in ROM, pre-shifted to each of the 8 horizontal pixel offsets. One row of 8 pixels would work like this:

lda buffer,y
and mask,x
ora pattern,x
sta buffer,y

That's already 8 pixels in ~20 cycles. I'm not too sure if that's enough though.
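
For anyone who wants to poke at the idea off-console, here is a rough Python model of the mask/pattern step. It is only a sketch under my own assumptions: the helper names (`make_word`, `preshift_row`, `blit`) are made up, colour 0 is treated as transparent, bit 7 is taken as the leftmost pixel, and a shifted row is assumed to spill into the tile column on its right.

```python
def make_word(plane0, plane1):
    """Pack two 8-pixel bitplane bytes into one 2bpp row word (plane 0 in the low byte)."""
    return (plane1 << 8) | plane0


def preshift_row(plane0, plane1, shift):
    """Return ((mask, pattern) for the left tile, (mask, pattern) for the right tile).

    Bit 7 is the leftmost pixel, so moving right by `shift` pixels is a right
    shift; bits that drop off the low end land in the tile column to the right.
    """
    opaque = plane0 | plane1  # assumption: colour 0 is transparent

    def split(byte):
        wide = (byte << 8) >> shift  # 16-bit window covering both tile columns
        return (wide >> 8) & 0xFF, wide & 0xFF

    p0_left, p0_right = split(plane0)
    p1_left, p1_right = split(plane1)
    op_left, op_right = split(opaque)

    keep_left = ~op_left & 0xFF    # bits of the buffer that must survive
    keep_right = ~op_right & 0xFF
    left = (make_word(keep_left, keep_left), make_word(p0_left, p1_left))
    right = (make_word(keep_right, keep_right), make_word(p0_right, p1_right))
    return left, right


def blit(buffer_word, mask, pattern):
    """lda buffer / and mask / ora pattern, written as one expression."""
    return (buffer_word & mask) | pattern
```

On the console the eight `preshift_row` results per row would be precomputed into ROM tables, so only the `blit` step costs cycles at run time.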


PostPosted: Wed Feb 10, 2016 12:02 am
Joined: Sat Jul 25, 2015 1:22 pm
Posts: 501
I'm not sure if I understand the idea for this. If your bullets are sprites, why do you need to update the pattern? Wouldn't it be easier to have a single sprite shared between as many bullets as you want, and move the bullets on-screen via object position?

You're going to need a value with which to calculate collisions anyway. Perhaps I'm missing something, but the explanation is very vague.

It could be that I'm missing something because this is SNES and not NES, but you can still move sprites on SNES, right?

If you can figure out what they did for Aero Fighters, from what I've seen I feel like that game has probably the best bullet:slowdown ratio for the console.


PostPosted: Wed Feb 10, 2016 12:21 am
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3163
Location: Nacogdoches, Texas
darryl.revok wrote:
I'm not sure if I understand the idea for this. If your bullets are sprites, why do you need to update the pattern?

But they aren't sprites. :wink: It's BG3 being used as a screen with a bunch of bullets being "painted" on it. The reason you'd do this is to avoid the sprite limit and the sprite pixels per scanline limit. It's really a shame that OAM can only be updated during vblank, because I think sprites are being drawn to a line buffer then?

Wait... Couldn't you just write to OAM during active display? I must be missing something, because this is way too obvious. I just don't see how OAM could be used during hblank and active display, because I thought I remembered hearing about how sprites are drawn and then BGs or something like that.

Sorry for derailing this already. Sprites are really one of those situations where I'm not completely sure why they aren't just meant to be CPU driven, as in it wouldn't just have the same number of sprites per scanline as the total and you'd just multiplex it. Doesn't the Amiga actually work this way? Oh wait, I just said I was sorry for interrupting this. :lol:


PostPosted: Wed Feb 10, 2016 12:22 am
Joined: Thu Aug 12, 2010 3:43 am
Posts: 1589
Honestly, if I were to make a bullet hell for a 4th gen platform, I'd try to come up with regular patterns that are easy to recreate (e.g. by using tilemaps). A bonus is that it simplifies collision calculations (e.g. if there's a row of bullets, first check if the ship is inside the row, then check whether its position within the row lands on a bullet or a gap).
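
To make the row check concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the function name `hits_bullet_row`, the idea that bullets are square, evenly spaced and all on one horizontal line, and that the ship is tested as a single point.

```python
def hits_bullet_row(ship_x, ship_y, row_x0, row_y, spacing, size, count):
    """Test a point against `count` bullets of `size` px placed every `spacing` px.

    Step 1: is the ship inside the row's bounding box at all?
    Step 2: is its position within one period a bullet or a gap?
    """
    if not (row_y <= ship_y < row_y + size):
        return False
    offset = ship_x - row_x0
    if offset < 0 or offset >= spacing * count:
        return False
    return offset % spacing < size
```

The cost is a couple of compares and one modulo per row, no matter how many bullets the row contains (and with a power-of-two spacing the modulo becomes an AND).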


PostPosted: Wed Feb 10, 2016 12:32 am
Joined: Mon Mar 02, 2015 1:11 am
Posts: 76
Location: Australia (PAL)
psycopathicteen wrote:
That's already 8 pixels in ~20 cycles. I'm not too sure if that's enough though.

Umm, according to the bsnes/higan source it would be 24 cycles if the Accumulator & Index registers are 16 bits long.


Anyway, I thought about coding a bullet hell game. Let me find my notes.

There would be 2 buffers, one for player bullets, one for enemy bullets. Each buffer would be a 1bpp bitmap, 256x192 px in size (6144 bytes). Bullets would be a single pixel in size.

Some VMAIN magic would combine the two buffers into a single 2bpp tileset.

Transfer One: player bullet buffer DMA DMAP_TRANSFER_1REG to VMDATAL with VMAIN set to $04 (increment on VMDATAL, 8 bit address shift).
Transfer Two: enemy bullet buffer DMA DMAP_TRANSFER_1REG to VMDATAH with VMAIN set to $84 (increment on VMDATAH, 8 bit address shift).
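
A rough Python model of what the two transfers would leave in VRAM, in case it helps to visualise it. This is a sketch based on my understanding of the commonly documented 8-bit VMAIN address translation (port address bits aaaaaaaaYYYxxxxx become VRAM word address aaaaaaaaxxxxxYYY); the function names are made up and the tileset is assumed to start at VRAM word address 0.

```python
WIDTH_BYTES = 32   # 256 pixels / 8 pixels per byte
HEIGHT = 192


def remap_8bit(word_addr):
    """Model of the 8-bit VMAIN translation: aaaaaaaaYYYxxxxx -> aaaaaaaaxxxxxYYY."""
    high = word_addr & 0xFF00
    yyy = (word_addr >> 5) & 0x07
    xxxxx = word_addr & 0x1F
    return high | (xxxxx << 3) | yyy


def combine(player, enemy):
    """Return the VRAM words after both DMA transfers.

    Player bits end up in the low byte (bitplane 0) and enemy bits in the
    high byte (bitplane 1) of each 2bpp tile row.
    """
    vram = [0] * (WIDTH_BYTES * HEIGHT)
    for i, byte in enumerate(player):   # transfer one: writes to VMDATAL
        vram[remap_8bit(i)] |= byte
    for i, byte in enumerate(enemy):    # transfer two: writes to VMDATAH
        vram[remap_8bit(i)] |= byte << 8
    return vram
```

With this mapping, byte `y * 32 + x / 8` of a linear bitmap lands in row `y & 7` of tile `(y / 8) * 32 + x / 8`, which is why the CPU side can treat each buffer as a plain 32-byte-wide bitmap.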

I never actually implemented this. My napkin-math suggested that I would not have been able to fit 250 bullets and 10 enemies onto the screen at 30fps (I lost that sheet and I can't remember how I got that conclusion).

Draw Bullet Code:
Code:
.A8
.I16
; DP = bullet address

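    LDA z:Bullet::xPos
    AND #$07
    ; TAY copies all 16 bits while the index registers are 16-bit,
    ; so the hidden high byte of the accumulator must be 0 here
    TAY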

    LDA z:Bullet::xPos
    LSR
    LSR
    LSR
    STA tmp

    REP #$30
.A16
    LDA z:Bullet::yPos
    AND #$00FF
    XBA
    LSR
    LSR
    LSR
    ; C always clear
    ; value at address tmp+1 is always 0
    ADC tmp
    TAX

    ; Y = (xPos & 7)
    ; X = yPos * 32 + xPos / 8
    SEP #$20
.A8
    LDA buffer, X
    ORA SetBulletTable, Y
    STA buffer, X
Code:
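SetBulletTable:
; entry i has bit i set, so (xPos & 7) == 0 selects bit 0
; NOTE: SNES tile rows store the leftmost pixel in bit 7, so as written
; the bullets appear mirrored within each 8-pixel column on screen
.repeat 8, i
    .byte 1 << i
.endrepeat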
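
For reference, the address arithmetic in the draw routine boils down to the following Python model. It is a sketch only: the names are mine, the buffer is assumed to be a linear 32-byte-wide bitmap as described above, and the table follows the post's `1 << i` convention.

```python
ROW_BYTES = 32    # 256 pixels / 8 pixels per byte
HEIGHT = 192
SET_BULLET_TABLE = [1 << i for i in range(8)]


def bullet_offset(x, y):
    """Byte offset into the 1bpp buffer: yPos * 32 + xPos / 8."""
    return (y & 0xFF) * ROW_BYTES + (x >> 3)


def draw_bullet(buffer, x, y):
    """lda buffer,x / ora SetBulletTable,y / sta buffer,x"""
    buffer[bullet_offset(x, y)] |= SET_BULLET_TABLE[x & 7]
```

The XBA followed by three LSRs in the assembly is the `y * 32` here: swapping bytes multiplies by 256 and the three shifts divide by 8.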


Collision code would have been pixel perfect:
Code:
.A16
.I16
Check_8x8Collision:
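    ; X = frame collision data offset + (xPos & 7) * 2
    ; Y = yPos * 32 + xPos / 8

.repeat 8, i
    LDA buffer + i * 32, Y
    ; each row holds 8 pre-shifted words (16 bytes), matching _buildRow
    AND frameCollisionData + i * 8 * 2, X
    BNE CollisionOccurred
.endrepeat

    ; no collision
    RTS     ; assuming this is called as a subroutine; avoids falling into the handler

CollisionOccurred:
    ; collision code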
Code:
; CollisionData
; -------------
.macro _buildRow data
    .repeat 8, i
        .word data << i
    .endrepeat
.endmacro

CollisionDataFrame1:
    _buildRow %00011000
    _buildRow %00011000
    _buildRow %00111100
    _buildRow %00111100
    _buildRow %00111100
    _buildRow %00111100
    _buildRow %01111110
    _buildRow %11111111
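
The collision test can be modelled in Python like this. It is a sketch under the same assumptions as the post's tables (bit 0 is the leftmost pixel of a byte, and a 16-bit little-endian read covers two adjacent byte columns); the names `SHIP_ROWS` and `collides` are made up for the example.

```python
ROW_BYTES = 32  # 256 pixels / 8 pixels per byte

# Same shape as CollisionDataFrame1, one byte per row.
SHIP_ROWS = [0b00011000, 0b00011000, 0b00111100, 0b00111100,
             0b00111100, 0b00111100, 0b01111110, 0b11111111]


def collides(buffer, rows, x, y):
    """Pixel-perfect test of an 8x8 shape at (x, y) against the 1bpp bullet buffer."""
    shift = x & 7
    base = y * ROW_BYTES + (x >> 3)
    for i, row in enumerate(rows):
        offset = base + i * ROW_BYTES
        # 16-bit little-endian read: the high byte is the next column to the right
        word = buffer[offset] | (buffer[offset + 1] << 8)
        if word & (row << shift):   # row << shift is what _buildRow precomputes
            return True
    return False
```

The buffer needs at least one byte of padding past the last row, since the 16-bit read at the final offset touches the byte after it.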


EDIT: Added info about combining buffers.


PostPosted: Wed Feb 10, 2016 6:08 pm
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
I was basically thinking of doing this.

Making normal objects act like a normal game with hardware sprites, but also have a layer of software bullets on top of it. (Or underneath it, wait I need to check how priorities work again) The whole game would probably run at 30fps, alternating between updating the normal sprites and backgrounds, and updating bullets. It would need the screen to be cropped at 184 pixels in order to fit a whole 2bpp screen in one frame.
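
The 184-line figure can be sanity-checked with some napkin math in Python. This is only a sketch: it assumes NTSC timing (262 scanlines per frame), a DMA throughput of roughly 165 bytes per blanked scanline (an approximate figure, not a measured one), and that every non-displayed line is spent on this one transfer.

```python
SCANLINES_PER_FRAME = 262       # NTSC
DMA_BYTES_PER_SCANLINE = 165    # assumed approximate DMA throughput


def tile_bytes(width_px, height_px, bpp):
    """Size of a full-screen tileset at the given colour depth."""
    return width_px * height_px * bpp // 8


def blanking_budget(visible_lines):
    """Bytes that can be DMA'd while the remaining scanlines are blanked."""
    return (SCANLINES_PER_FRAME - visible_lines) * DMA_BYTES_PER_SCANLINE


def fits(visible_lines):
    return tile_bytes(256, visible_lines, 2) <= blanking_budget(visible_lines)
```

Under those assumptions a 256x184 2bpp screen is 11776 bytes against a budget of 12870, while 192 lines would need 12288 bytes against 11550, so 184 is the tallest whole number of tile rows that fits.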


PostPosted: Wed Feb 10, 2016 9:52 pm
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
This is 256 bullets moving at 20fps.


Attachments:
bullet hell.zip [2.3 KiB]

PostPosted: Thu Feb 11, 2016 12:14 am
Joined: Mon Mar 02, 2015 1:11 am
Posts: 76
Location: Australia (PAL)
psycopathicteen wrote:
This is 256 bullets moving at 20fps.


Nice.
The movements are smoother than I expected them to be.

Are you going to do more with this?


PostPosted: Thu Feb 11, 2016 8:13 am
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
Quote:
Umm, according to the bsnes/higan source it would be 24 cycles if the Accumulator & Index registers are 16 bits long.


At first I didn't know what you were talking about, but then I found this at http://www.defence-force.org/computing/ ... /annexe_2/:

Quote:
3) Add 1 cycle if adding index crosses a page boundary


I seriously never knew that. Surprisingly the long index addressing doesn't have a similar limitation. I wonder if that was just something left over from the 6502, because I don't see why a CPU with a 16-bit ALU would need to do that.


PostPosted: Thu Feb 11, 2016 5:51 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3163
Location: Nacogdoches, Texas
psycopathicteen wrote:
I was basically thinking of doing this. Making normal objects act like a normal game with hardware sprites, but also have a layer of software bullets on top of it.

I think 93143 is doing the same thing.

psycopathicteen wrote:
It would need the screen to be cropped at 184 pixels in order to fit a whole 2bpp screen in one frame.

Why not double buffer?


PostPosted: Thu Feb 11, 2016 6:43 pm
Joined: Thu Aug 12, 2010 3:43 am
Posts: 1589
Presumably you need memory for everything else too. (also could be referring to transfer bandwidth)

Also I was thinking, most bullet hells are vertical. You could probably just use that as an excuse to render only half the screen (the extra space would presumably be used for the HUD).


PostPosted: Thu Feb 11, 2016 9:56 pm
Joined: Mon Mar 02, 2015 1:11 am
Posts: 76
Location: Australia (PAL)
psycopathicteen wrote:
I seriously never knew that. Surprisingly the long index addressing doesn't have a similar limitation. I wonder if that was just something left over from the 6502, because I don't see why a CPU with a 16-bit ALU would need to do that.


It is because absolute indexed addressing can increment the bank when the index crosses the bank boundary. This means that 2 processing cycles are needed to perform the 24-bit addition with the 65816's 16-bit ALU.

The 65816 first performs an 8-bit addition between the low byte of the address and the low byte of the index when it reads ADDR.H.
In the next cycle it performs a 16-bit addition between the 16-bit DB:ADDR.H and IH.
If the page boundary is never crossed (8-bit index && carry of {ADDR.L + I} is 0) then DB:ADDR.H is unchanged and the addition is skipped, saving an unneeded cycle. (source)

With absolute long addressing the second addition is processed in the half-cycle after the bank byte is read from memory and will not save a cycle if skipped.

EDIT: added source, reordered sentences.


PostPosted: Fri Feb 12, 2016 12:29 pm
Joined: Wed May 19, 2010 6:12 pm
Posts: 2431
Quote:
Add 1 cycle for indexing across page boundaries, or write, or X=0


Do they mean X=0 as in the status register bit that controls the size of the index registers? So does that mean that it always takes an extra cycle when the index registers are 16-bit?


PostPosted: Fri Feb 12, 2016 9:40 pm
Joined: Mon Mar 02, 2015 1:11 am
Posts: 76
Location: Australia (PAL)
psycopathicteen wrote:
Quote:
Add 1 cycle for indexing across page boundaries, or write, or X=0

Do they mean X=0 as in the status register bit that controls the size of the index registers? So does that mean that it always takes an extra cycle when the index registers are 16-bit?

Yes, that extra cycle always occurs when the Index registers are 16 bit.
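
Putting the rules from this discussion together, the cost of the four-instruction row from the first post can be tallied with a small Python model. It is a sketch that assumes the usual published 65816 timings for absolute indexed addressing (reads: 4 cycles, +1 for a 16-bit accumulator, +1 for a page crossing or 16-bit index registers; writes: 5 cycles, +1 for a 16-bit accumulator) and counts CPU cycles, not master clocks. The function names are made up.

```python
def read_abs_indexed(a16, x16, page_cross=False):
    """Cycles for lda/and/ora abs,X or abs,Y."""
    cycles = 4
    if a16:
        cycles += 1          # second data byte
    if x16 or page_cross:
        cycles += 1          # extra address-calculation cycle
    return cycles


def write_abs_indexed(a16):
    """Cycles for sta abs,X or abs,Y (the extra address cycle is always taken)."""
    return 5 + (1 if a16 else 0)


def row_cycles(a16, x16, page_cross=False):
    """lda buffer,y / and mask,x / ora pattern,x / sta buffer,y"""
    return 3 * read_abs_indexed(a16, x16, page_cross) + write_abs_indexed(a16)
```

With both the accumulator and index registers at 16 bits this gives 24 cycles, matching the figure quoted above; with 8-bit index registers and no page crossings it comes to 21.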


PostPosted: Fri Feb 12, 2016 9:58 pm
Joined: Fri Jul 04, 2014 9:31 pm
Posts: 818
Espozo wrote:
psycopathicteen wrote:
I was basically thinking of doing this. Making normal objects act like a normal game with hardware sprites, but also have a layer of software bullets on top of it.
I think 93143 is doing the same thing.

Yeah, for a port of an existing game. But I'm not doing software rendering on the S-CPU, partly because of the sheer number, size, and colour depth of bullets in the original, and partly because I got stubborn about look&feel and blew 3/4 of my CPU budget on raster effects. I'm using the Super FX chip for bullets and collisions, which moves me out of direct competition with anything that doesn't need a coprocessor.

Also, it's been almost two years and I still haven't done a bullet test. Advantage: not me...

