All times are UTC - 7 hours



Posted: Thu Jun 30, 2016 3:38 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3074
Location: Nacogdoches, Texas
I know the reason I crashed and burned last time I tried to implement my crazy vram setup is that I was trying to do a lot without any way to test whether what I was doing even worked, and at that point I realized it would make the most sense to get my metasprite routine working with my vram setup before I made an animation engine or a vram slot finder. (I do have a tile uploader that works, though.) Because I really liked psychopathicteen's linked list idea for vram slots, I implemented it in my metasprite routine: each sprite in a metasprite either stays on the same slot in vram or moves on to the next location in the linked list. I imagine the way I did it isn't very efficient, partly because I used up x, y, and the direct page, so I had to push and pull x. I don't actually want 16x16's and 32x32's for what I'm planning to do (not that many sprites total, but a lot of overlaying ones), so I only implemented 16x16 vram slots, and I have a miniature offset for picking a specific 8x8 out of a 16x16 sized slot.

One problem I encountered is that a smaller sized sprite flips just like the larger one, so I had it check whether the sprite is small or large and then add 8 to the sprite's position if it is flipped. I did it in a very lousy way, but I don't know how else to do it. Also, metasprites just flip wherever instead of according to the width, because I don't really know how to program this and don't feel like thinking too hard, considering I pretty much did all of this today.

Enjoy...

Attachment:
Metasprite.zip [234.41 KiB]
Downloaded 111 times

Kind of random, but that strobe-like effect has proven to be pretty useful as a CPU usage meter, and especially for telling me whether what I am programming has crashed or not.


Posted: Fri Jul 01, 2016 8:06 am
Joined: Wed May 19, 2010 6:12 pm
Posts: 2295
Nice. I'm glad you're making progress.


Posted: Fri Jul 01, 2016 4:12 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3074
Location: Nacogdoches, Texas
Thanks! I had a lot of problems thinking about how to handle things like double buffering, but I think I've got a solution. Instead of having one space in ram for where each metasprite's tiles start, I'll have a separate space for each double buffered area. So, on a 64x64 double buffered object, there will be a slot for each 64x32 half. The top part will still be linked to the bottom on the vram table, but it will first check whether the bottom part even exists in vram (and if it doesn't, it will upload it). This is also useful if you had a tank or something and needed to animate the treads but nothing else: the part that differs (the treads) would link into the commonly shared part (the tank body). This is somewhat limited, but my original idea was way overcomplicated and had no real purpose. (It was kind of like the above, but each slot could go into any other, which totally screwed up the metasprite routine.) If you wanted something as complicated as I originally did, you would just use another object slot. I edited the object code so that if the identity is #$0001 (#$0000 is nothing), the object identifier won't jump to it, but the object slot searcher also won't overwrite it like it will with #$0000. So if I am animating a tank in my game engine, the body and treads will be one object, but the turret will be a separate object that doesn't actually have any code, as it is really only there for visual purposes. Yeah, I'm not entirely sure how I'm going to program my vram idea, but it doesn't seem too hard.


Posted: Thu Jul 07, 2016 5:27 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3074
Location: Nacogdoches, Texas
Okay, so I've been successful in nearly everything related to making this, except one thing: I can't seem to get it to where I'm uploading from the correct address. So, it is able to find what slots are empty, make the linked list correctly, and upload the tiles in the correct location, but it just isn't able to upload them from the correct location.

I even tried to make it static, loading ".LOWORD(Test1Tiles)" and stuff like that directly, but it still doesn't work, inexplicably. I noticed that I have the direct page somewhere other than 0 (it's at the start of the object currently being looked at), so I've put an "a:" in front of anything else. Is this not always the same as loading something normally when the direct page is #$0000? I mean, that's all I can think of.
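
For what it's worth, the "a:" question can be modeled in a few lines of Python. This is only a sketch of the 65816 addressing rules, not anything from the attached source: the function names and the $0200/$0010 addresses are made up for illustration, and it assumes the data bank register points at a bank where low RAM is mirrored, so that an absolute access and a direct page access with D = #$0000 land on the same byte.

```python
def direct_page_address(d_register, operand):
    """Bank-0 address of a direct page access: D plus the 8-bit operand."""
    return (d_register + (operand & 0xFF)) & 0xFFFF


def absolute_address(operand):
    """16-bit address of an absolute access (the bank comes from DBR)."""
    return operand & 0xFFFF


# Hypothetical layout: D points at an object at $0200, and a global
# variable lives at $0010.
D = 0x0200
GLOBAL_VAR = 0x0010

print(hex(direct_page_address(D, GLOBAL_VAR)))   # lands inside the object
print(hex(absolute_address(GLOBAL_VAR)))         # lands on the global
```

In this model the two only agree when D is #$0000. With D pointing at an object, an access that gets assembled as direct page reads from inside the object instead of from the global, which is the case the "a:" override is meant to avoid.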

The code is a mess because I kept running out of hardware registers and things are named poorly (which is easy enough to fix though). I already know I'll have to go back and optimize it, but that's for another day. :lol:

This almost looks like gibberish to me so I don't expect anyone else to be able to understand it, but I figure I might as well post it here. The object's identity is #$0004. #$0000 counts as nothing for objects, and it also counts as nothing for vram slots, that's why a majority of the tables are being offset by -2.
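
A rough Python model of what the routine below is meant to do (find empty slots and chain them into a linked list) might look like this. It is a sketch under assumptions, not a translation of the assembly: it uses plain slot numbers instead of byte offsets into the tables, and the END marker for the last slot of a chain is hypothetical, since the thread only defines #$0000 as "empty".

```python
EMPTY = 0x0000   # free slot, as described above
END = 0xFFFF     # hypothetical end-of-chain marker (not from the thread)


def allocate_chain(links, count):
    """Claim `count` free slots and link them together.

    `links[slot]` holds the next slot in the chain, EMPTY if the slot is
    free, or END if it is the last slot of a chain. Index 0 is unused so
    that slot number 0 can mean "nothing". Returns the first slot of the
    new chain, or EMPTY if there are not enough free slots.
    """
    free = [s for s in range(1, len(links)) if links[s] == EMPTY][:count]
    if len(free) < count:
        return EMPTY                      # leave the table untouched
    for current, following in zip(free, free[1:]):
        links[current] = following
    links[free[-1]] = END
    return free[0]


# Slot 2 is already in use, so a two-tile frame gets slots 1 and 3.
links = [EMPTY, EMPTY, END, EMPTY, EMPTY]
head = allocate_chain(links, 2)
print(head, links)
```

Collecting the free slots first and only then writing the links means a failed request never leaves a half-built chain behind.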

Code:
.proc vram_engine
  rep #$30   ;A=16, X/Y=16
  lda #ObjectTable
  tcd
  ldy #$0002

vram_engine_loop:
  ldx ObjectSlot::RequestedFrame
  beq next_object
  lda a:AnimationFrameSlotUsageTable-2,x
  beq find_vram
  inc a:AnimationFrameSlotUsageTable-2,x
  lda a:AnimationFrameLinkedListTable-2,x
  sta ObjectSlot::VramOffset
  stz ObjectSlot::RequestedFrame

next_object:
  stz ObjectSlot::RequestedFrame
  inx
  inx
  tdc
  clc
  adc #ObjectSlotSize
  cmp #ObjectTable+ObjectTableSize
  bcs vram_engine_done
  tcd
  bra vram_engine_loop

vram_engine_done:
  rts

find_vram:
  lda a:TilesInFrameTable-2,x
  sta a:TilesInFrame
  lda a:VramAddressFrameTable-2,x
  sta a:VramAddressOfFrame
  lda a:VramBankByteFrameTable-2,x
  sta a:BankByteOfFrame

find_vram_loop_1:
  lda a:VramLinkedListTable-2,y
  beq open_slot_found_1
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_1
  bra next_object

open_slot_found_1:
  tya
  sta a:AnimationFrameLinkedListTable-2,x
  sta ObjectSlot::VramOffset

  phy
  ldy a:TileRequestCounter16x16
  lda a:VramAddressToTransferAddressTable-2,x
  sta a:TileRequestTable+VramAddress,y
  lda a:VramAddressOfFrame
  sta a:TileRequestTable+TileAddress,y
  lda a:BankByteOfFrame
  sta a:TileRequestTable+BankNumber,y
  lda a:VramAddressOfFrame
  clc
  adc #$0020
  sta a:VramAddressOfFrame
  lda a:TileRequestCounter16x16
  clc
  adc #$0006
  sta a:TileRequestCounter16x16
  ply

  dec a:TilesInFrame
  beq next_object
  tyx
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_2
  bra next_object

find_vram_loop_2:
  lda a:VramLinkedListTable-2,y
  beq open_slot_found_2
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_2
  bra next_object

open_slot_found_2:
  tya
  sta a:VramLinkedListTable-2,x

  phy
  ldy a:TileRequestCounter16x16
  lda a:VramAddressToTransferAddressTable-2,x
  sta a:TileRequestTable+VramAddress,y
  lda a:VramAddressOfFrame
  sta a:TileRequestTable+TileAddress,y
  lda a:BankByteOfFrame
  sta a:TileRequestTable+BankNumber,y
  lda a:VramAddressOfFrame
  clc
  adc #$0020
  sta a:VramAddressOfFrame
  lda a:TileRequestCounter16x16
  clc
  adc #$0006
  sta a:TileRequestCounter16x16
  ply

  dec a:TilesInFrame
  beq jump_to_next_object
  iny
  iny
  cpy #$0102
  bcc find_vram_loop_2

jump_to_next_object:
  brl next_object
.endproc
Code:
;=========================================================================================
.segment "RODATA"
;=========================================================================================

VramAdressToTileNumberTable:
  .word $0000,$0002,$0004,$0006,$0008,$000A,$000C,$000E
  .word $0020,$0022,$0024,$0026,$0028,$002A,$002C,$002E
  .word $0040,$0042,$0044,$0046,$0048,$004A,$004C,$004E
  .word $0060,$0062,$0064,$0066,$0068,$006A,$006C,$006E
  .word $0080,$0082,$0084,$0086,$0088,$008A,$008C,$008E
  .word $00A0,$00A2,$00A4,$00A6,$00A8,$00AA,$00AC,$00AE
  .word $00C0,$00C2,$00C4,$00C6,$00C8,$00CA,$00CC,$00CE
  .word $00E0,$00E2,$00E4,$00E6,$00E8,$00EA,$00EC,$00EE
  .word $0100,$0102,$0104,$0106,$0108,$010A,$010C,$010E
  .word $0120,$0122,$0124,$0126,$0128,$012A,$012C,$012E
  .word $0140,$0142,$0144,$0146,$0148,$014A,$014C,$014E
  .word $0160,$0162,$0164,$0166,$0168,$016A,$016C,$016E
  .word $0180,$0182,$0184,$0186,$0188,$018A,$018C,$018E
  .word $01A0,$01A2,$01A4,$01A6,$01A8,$01AA,$01AC,$01AE
  .word $01C0,$01C2,$01C4,$01C6,$01C8,$01CA,$01CC,$01CE
  .word $01E0,$01E2,$01E4,$01E6,$01E8,$01EA,$01EC,$01EE
 
VramAddressToTransferAddressTable:
  .word $0000,$0020,$0040,$0060,$0080,$00A0,$00C0,$00E0
  .word $0200,$0220,$0240,$0260,$0280,$02A0,$02C0,$02E0
  .word $0400,$0420,$0440,$0460,$0480,$04A0,$04C0,$04E0
  .word $0600,$0620,$0640,$0660,$0680,$06A0,$06C0,$06E0
  .word $0800,$0820,$0840,$0860,$0880,$08A0,$08C0,$08E0
  .word $0A00,$0A20,$0A40,$0A60,$0A80,$0AA0,$0AC0,$0AE0
  .word $0C00,$0C20,$0C40,$0C60,$0C80,$0CA0,$0CC0,$0CE0
  .word $0E00,$0E20,$0E40,$0E60,$0E80,$0EA0,$0EC0,$0EE0
  .word $1000,$1020,$1040,$1060,$1080,$10A0,$10C0,$10E0
  .word $1200,$1220,$1240,$1260,$1280,$12A0,$12C0,$12E0
  .word $1400,$1420,$1440,$1460,$1480,$14A0,$14C0,$14E0
  .word $1600,$1620,$1640,$1660,$1680,$16A0,$16C0,$16E0
  .word $1800,$1820,$1840,$1860,$1880,$18A0,$18C0,$18E0
  .word $1A00,$1A20,$1A40,$1A60,$1A80,$1AA0,$1AC0,$1AE0
  .word $1C00,$1C20,$1C40,$1C60,$1C80,$1CA0,$1CC0,$1CE0
  .word $1E00,$1E20,$1E40,$1E60,$1E80,$1EA0,$1EC0,$1EE0

;=========================================================================================

TilesInFrameTable:
  .word $0002,$0002

VramAddressFrameTable:
  .word .LOWORD(Test1Tiles)

VramBankByteFrameTable:
  .word .BANKBYTE(Test1Tiles),$00

;=========================================================================================

Test1Tiles:
  .incbin "Test1.pic"

Test2Tiles:
  .incbin "Test2.pic"

;=========================================================================================
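
Assuming every row of the two lookup tables above is meant to follow the same pattern as its neighbours, they can be generated rather than typed by hand. The Python sketch below is one way to do that; the function names are made up, and the factor of 16 between the tables is my reading of the data (16 VRAM words per 8x8 4bpp tile), not something stated in the post.

```python
SLOTS_PER_ROW = 8    # eight 16x16 slots across a 16-tile-wide row
SLOT_ROWS = 16       # 8 * 16 = 128 slots in total


def tile_number(slot):
    """Tile number of the top-left 8x8 tile of a 16x16 slot."""
    row, col = divmod(slot, SLOTS_PER_ROW)
    return row * 0x20 + col * 2


def transfer_address(slot):
    """Offset of the slot in VRAM words, assuming 16 words per tile."""
    return tile_number(slot) * 16


def as_words(values, per_line=8):
    """Format a list of values as ca65-style .word lines."""
    lines = []
    for i in range(0, len(values), per_line):
        chunk = values[i:i + per_line]
        lines.append("  .word " + ",".join(f"${v:04X}" for v in chunk))
    return "\n".join(lines)


slots = range(SLOTS_PER_ROW * SLOT_ROWS)
tile_table = [tile_number(s) for s in slots]
transfer_table = [transfer_address(s) for s in slots]

print(as_words(tile_table))
print(as_words(transfer_table))
```

Generating the tables from one formula also makes a single mistyped row easy to spot, since the output can be compared against the hand-written version.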

Kind of random, but about things that are only used during one routine, I think I'll just have everything use its own space in vram and then just replace the names of everything so that they're all the same, like "Temporary1" or something like that. I have bigger concerns right now though.

Actually, hell, why not, here's the rom:

Attachment:
Vram Engine Test.zip [236.93 KiB]
Downloaded 68 times

Dang it, I keep realizing I have things to say... The vram engine doesn't have any sort of double buffering thing because I realized that I really don't need it right now. I'll incorporate one if I ever get to 16x16 and 32x32 sized sprites.


Posted: Fri Jul 08, 2016 8:34 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3074
Location: Nacogdoches, Texas
Of course, in my "lol plz help omg" moment, I actually realized half of the mistakes I made. (For starters, the frame's identity was 4, but not all the tables even went to that. :roll: )

I actually have it working now after somewhat randomly moving things around and then trying to make sense of the result, except for one thing I don't get: if you had a 16x16, 4bpp tile at one offset and several more following it, wouldn't you add #$80 to get the address of each additional tile? I mean, 16x16=256/2=128.

For whatever reason, it's not working, and it's leading me to believe that the assembler is causing the problem in how it is arranging data, because if I offset the thing by #$40, it shows half of the first tile.

Yeah, is the data here non-linear?

Code:
TestTiles:
  .incbin "Test1.pic"
  .incbin "Test2.pic"
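
The arithmetic in the question does check out on paper. A tiny Python sketch, with made-up function names, and assuming each 16x16 graphic is stored contiguously in the file (which depends on the converter):

```python
def tile_bytes(width, height, bits_per_pixel):
    """Size in bytes of one tile of the given dimensions and depth."""
    return width * height * bits_per_pixel // 8


def tile_offset(index, width=16, height=16, bits_per_pixel=4):
    """Byte offset of the index-th tile in tightly packed data."""
    return index * tile_bytes(width, height, bits_per_pixel)


print(hex(tile_bytes(16, 16, 4)))   # one 16x16 4bpp tile
print(hex(tile_offset(1)))          # where the second tile should start
```

So if stepping by #$80 does not reach the second graphic, the stride itself is probably fine and the data being included may simply not be packed the way this assumes.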


Posted: Sat Jul 09, 2016 1:54 am
Joined: Sun Mar 27, 2016 7:56 pm
Posts: 137
That should work fine as far as I know, assuming those files are the right size.


Posted: Sat Jul 09, 2016 5:36 am
Joined: Mon Jan 03, 2005 10:36 am
Posts: 2963
Location: Tampere, Finland
Espozo wrote:
I mean, 16x16=256/2=128.

For whatever reason, it's not working, and it's leading me to believe that the assembler is causing the problem in how it is arranging data, because if I offset the thing by #$40, it shows half of the first tile.

$40 != 128.

EDIT: I misread. Oh well.

_________________
Download STREEMERZ for NES from fauxgame.com! — Some other stuff I've done: kkfos.aspekt.fi


Last edited by thefox on Sat Jul 09, 2016 7:40 pm, edited 1 time in total.

Posted: Sat Jul 09, 2016 5:45 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3074
Location: Nacogdoches, Texas
Nicole wrote:
That should work fine as far as I know, assuming those files are the right size.

And apparently, they aren't... They're 512 bytes, for whatever reason: the first fourth is actual graphical data, and the rest is 0 filled. I used pcx2snes.

Yeah, I just now made a 16x32 picture instead of two 16x16's (there was really no point in having it split in the first place), and it works perfectly now. I don't think I need to upload another file, as it's not like it looks any different on the surface. :lol: Speaking of uploading files, where does all this information go anyway? I imagine a server, but who owns it? Okay, yeah, that's irrelevant. :lol:

Anyway, I did find out one thing from all of this that you people might be able to use... It appears pcx2snes just doesn't output any file smaller than 512 bytes. I think we're long overdue for a new tool, but I don't have the kind of skill to make one. (I only know 65816 and a smidge of 80186 assembly.)
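
To show why the padding broke the #$80 stride, here is a small Python sketch. The file contents are stand-ins (128 bytes of made-up data followed by zero fill, matching the sizes described above), and strip_padding is a hypothetical helper, not part of pcx2snes or any other tool.

```python
def strip_padding(data, expected_size):
    """Return the first expected_size bytes, refusing to drop real data."""
    if len(data) < expected_size:
        raise ValueError("file is shorter than expected")
    if any(data[expected_size:]):
        raise ValueError("non-zero data found past the expected size")
    return data[:expected_size]


# Stand-ins for the two 512-byte files: 128 bytes of graphics, then zeros.
test1 = bytes([0x11]) * 128 + bytes(384)
test2 = bytes([0x22]) * 128 + bytes(384)

padded = test1 + test2   # what two back-to-back .incbin lines produce
packed = strip_padding(test1, 128) + strip_padding(test2, 128)

print(hex(padded.index(0x22)))   # start of the second graphic, padded
print(hex(packed.index(0x22)))   # start of the second graphic, packed
```

With the padded files the second graphic starts 512 bytes in, so adding #$80 lands in the zero fill; once the padding is stripped (or the art is converted as a single picture), the #$80 stride works as expected.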

Man though, it sucks that it appears absolute addressing always takes one more cycle per instruction, because I'm going to have to fix a lot of my stuff for good performance. :(


Posted: Sat Jul 09, 2016 8:07 pm
Joined: Wed Jul 09, 2008 8:46 pm
Posts: 236
pvSNESLib has gfx2snes, which can handle .pcx, .tga and .bmp files.


Posted: Sat Jul 09, 2016 8:53 pm
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
I made one in Python that can handle at least BMP and PNG in multiple tile formats, including Super NES 4-bit. It's included with my Super NES project template.


Posted: Sat Jul 09, 2016 9:49 pm
Joined: Fri Jul 04, 2014 9:31 pm
Posts: 788
Espozo wrote:
it appears absolute addressing always takes one more cycle per instruction

That's not quite true. Direct page takes an extra cycle if it's not page-aligned, meaning it takes just as long as an absolute instruction (assuming you're running in FastROM, so the extra byte fetch in the absolute instruction doesn't take any longer than the internal add in the direct-page instruction). And while indexing adds a cycle to absolute instructions if X/Y are 16-bit, it adds a cycle to direct page instructions regardless of the index register size.

So, for simple load/store/add/whatever instructions (not RMW or anything fancy):

- direct - 3 cycles
- absolute - 4 cycles

- direct non-page-aligned - 4 cycles
- direct indexed - 4 cycles
- direct indexed non-page-aligned - 5 cycles

- absolute 8-bit indexed - 4 cycles
- absolute 16-bit indexed - 5 cycles

Notice that for indexed accesses, if X/Y are 8-bit and the bottom byte of DP is nonzero, absolute is faster.

Add one cycle to all of those if the data is 16-bit. I'll knock off there; see 65c816.txt for further information.

If you want to know how many slow cycles each instruction has, just count the number of byte accesses in slow memory. Everything else is fast.
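
The list above can be encoded as a small Python lookup, which makes it easier to compare cases. This is only a sketch of the numbers in this post: the function name and parameters are made up, and it deliberately ignores everything the list ignores (read-modify-write instructions, page-crossing penalties, slow memory).

```python
def cycles(mode, indexed=False, index_16bit=True,
           dp_aligned=True, data_16bit=False):
    """Cycle count for a simple load/store, per the list above."""
    if mode == "direct":
        count = 3
        if not dp_aligned:
            count += 1        # low byte of D is nonzero
        if indexed:
            count += 1        # regardless of index register size
    elif mode == "absolute":
        count = 4
        if indexed and index_16bit:
            count += 1        # only when X/Y are 16-bit
    else:
        raise ValueError("mode must be 'direct' or 'absolute'")
    if data_16bit:
        count += 1
    return count


# 16-bit loads with D pointing at an object that is not page-aligned:
print(cycles("direct", dp_aligned=False, data_16bit=True))
print(cycles("absolute", data_16bit=True))
```

For the non-page-aligned case in the example, both calls give the same count, which is the point made at the top of this post.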


Posted: Sun Jul 10, 2016 3:20 am
Joined: Sun Jul 01, 2012 6:44 am
Posts: 337
Location: Lion's den :3
Espozo wrote:
Anyway, I did find out one thing from all of this that you people might be able to use... It appears pcx2snes just doesn't output any file smaller than 512 bytes.

pcx2snes/gfx2snes are known to be buggy, even with bigger files. :|

I'll try out tepples' script shortly. :)

_________________
Some of my projects:
Furry RPG!
Unofficial SNES PowerPak firmware
(See my GitHub profile for more)


Posted: Sun Jul 10, 2016 7:43 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3074
Location: Nacogdoches, Texas
93143 wrote:
That's not quite true. Direct page takes an extra cycle if it's not page-aligned, meaning it takes just as long as an absolute instruction (assuming you're running in FastROM, so the extra byte fetch in the absolute instruction doesn't take any longer than the internal add in the direct-page instruction). And while indexing adds a cycle to absolute instructions if X/Y are 16-bit, it adds a cycle to direct page instructions regardless of the index register size.

Man, all that is hard to keep track of. :( It would be so awesome if there was a way to have the cycles per instruction shown while you were typing code, but that would mean this theoretical program would have to assemble the whole file each and every time you did anything. Unfortunately, I don't exactly see SNES (or Apple IIGS) development taking off enough for something this complicated to be made... :lol:

Ramsis wrote:
pcx2snes/gfx2snes are known to be buggy, even with bigger files.

Would you like to have your blank data in the front, or the back? :lol:

Anyway, I got to thinking that the next step in my grand SNES adventure would be to make it so old frames are deleted whenever an object changes frames. I suppose I'll have it to where there's the existing "FrameRequest" thing, but also have a "CurrentFrame". What it will do is see if the frame request is equal to the current frame, and if it is, do nothing. If it isn't, it would upload the frame and copy the frame request into the current frame. It would also get rid of what was the current frame until then if nothing else is using it. (There's a counter of how many objects are using a particular frame, so if it's 0, follow the linked list, replacing every entry with #$0000, as that acts as an empty slot.) I'm not really sure I'll have an animation engine, because it would be a giant mess if I were to implement everything that I want out of it. For example, implementing tank treads or tires moving is a major pain: there'd have to be a way to play animations at different speeds, and also backwards. Also, if one thing is this fancy, everything has to be, and that could be unnecessarily slow. I think I'll just hardcode everything.
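
The frame request idea can be sketched in Python to check that the bookkeeping hangs together. Everything here is hypothetical: the names, the dictionaries standing in for the tables, the upload callback, and especially the END marker for the last slot of a chain, which is not part of the plan described above.

```python
EMPTY = 0x0000   # free slot / no frame, as in the posts above
END = 0xFFFF     # hypothetical end-of-chain marker (not from the thread)


def release_frame(frame, usage, first_slot, links):
    """Drop one reference; free the frame's slot chain when it hits 0."""
    usage[frame] -= 1
    if usage[frame] > 0:
        return
    slot = first_slot.pop(frame)
    while slot != END:
        next_slot = links[slot]
        links[slot] = EMPTY
        slot = next_slot


def change_frame(obj, usage, first_slot, links, upload):
    """Make obj show its requested frame, uploading or freeing as needed."""
    requested = obj["requested"]
    if requested == EMPTY or requested == obj["current"]:
        return                                     # nothing to do
    if usage.get(requested, 0) == 0:
        first_slot[requested] = upload(requested)  # returns the first slot
    usage[requested] = usage.get(requested, 0) + 1
    if obj["current"] != EMPTY:
        release_frame(obj["current"], usage, first_slot, links)
    obj["current"] = requested


# Frame 4 occupies slots 1 -> 2; the object now asks for frame 6.
links = {1: 2, 2: END, 3: EMPTY, 4: EMPTY}
usage = {4: 1}
first_slot = {4: 1}


def upload(frame):
    links[3] = END        # pretend the new frame needs a single slot
    return 3


tank = {"current": 4, "requested": 6}
change_frame(tank, usage, first_slot, links, upload)
print(tank, links)
```

Uploading the new frame before releasing the old one keeps the old tiles valid until the swap, at the cost of not being able to reuse their slots in the same step.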


Posted: Sun Jul 10, 2016 7:59 pm
Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19115
Location: NE Indiana, USA (NTSC)
Espozo wrote:
It would be so awesome if there was a way to have the cycles per instruction shown while you were typing code, but that would mean this theoretical program would have to assemble the whole file each and every time you did anything.

That might be a job for a profiler. Instead of breakpoint, you set a profile point on a JSR, and the emulator counts cycles for you.

Espozo wrote:
I suppose I'll have it to where there's the existing "FrameRequest" thing, but also have a "CurrentFrame". What it will do is see if the frame request is equal to the current frame, and if it is, do nothing. If it isn't, it would upload the frame and copy the frame request into the current frame.

So far this sounds like the scheme I used for Haunted: Halloween '85. I was sometimes able to fit two frames' tiles into one slot if they shared many, so that I could get away with this trivial frame request more often.

Espozo wrote:
There's a counter of how many objects are using a particular frame

And there it differs. You're using reference-counted GC, which is quite a bit more complicated than what I used. I just fully double-buffered all actors' cels, which was fine for the number and size of enemies that engine supported but may not be fine for a more detailed game.


Posted: Sun Jul 10, 2016 8:21 pm
Joined: Mon Sep 15, 2014 4:35 pm
Posts: 3074
Location: Nacogdoches, Texas
So I take it you took the simpler but faster method of having a fixed spot in vram for each object? Yeah, I can't think of a single game that does anything as complicated as what I'm doing, and I am worried about how it will run. However, I won't concern myself with compression, because I'll just make the cartridge bigger, so that'll save a good amount of time. My whole thing is trying to fit as much data as possible into the little 16KB of vram available to sprites, because I think one of the main differences you can tell between an SNES game and a Neo Geo game or something is how much less diverse the backgrounds and especially the sprites are on the SNES, since most everything is crammed into vram and never swapped out. (Oftentimes, the only thing that is swapped is the character.) Heck, most games don't even seem to go anywhere near using the whole vram bandwidth, which isn't even large to begin with.

Anyway, I'll post when I get my slot deletion thing working. The current slot and frame slot approach seems to be the best, which is why it also seems pretty popular.

