All times are UTC - 7 hours





 Post subject: Striped tables in ca65
PostPosted: Fri Sep 16, 2016 11:09 pm 
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
A discussion in another thread (link) covered how useful it is to break tables up into stripes: instead of one table of 16-bit values, you create two tables of 8-bit values (low bytes and high bytes), so you don't have to multiply your index by 2.

zzo38 mentioned that he'd implemented it in his Unofficial-MagicKit assembler, and I lamented that it wasn't doable in ca65. However, after digging around in the documentation and source code, I noticed a reasonable method for doing this in ca65, and I thought I'd share it here in a new thread rather than leave it buried in the other thread (link).

Here's what it looks like:
Code:
; allow line continuation feature
.linecont +

; create the table as a multi-line define
.define MyTable \
   $1234, \
   $5678, \
   $9ABC

; emit the striped tables
mytable_lo: .lobytes MyTable
mytable_hi: .hibytes MyTable

Maybe line continuations inside a define aren't the prettiest look, but they do seem to do the job. Labels or other expressions seem to work just as well as literals, and there doesn't seem to be any inherent limit on the number of entries (the .define is stored as a linked list of tokens). You can put multiple expressions on a line if desired (the whole .define is treated as if it were one long line).
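
For anyone who wants to double-check what ends up in the two tables, here is a small Python model of the split. It is Python rather than ca65, and `stripe_words` is a made-up helper name, not part of any tool; it only mimics the byte selection that .lobytes and .hibytes perform on the three values above.

```python
def stripe_words(words):
    """Split 16-bit values into a low-byte table and a high-byte table."""
    lo = [w & 0xFF for w in words]          # models .lobytes
    hi = [(w >> 8) & 0xFF for w in words]   # models .hibytes
    return lo, hi

lo, hi = stripe_words([0x1234, 0x5678, 0x9ABC])
print([f"${b:02X}" for b in lo])  # ['$34', '$78', '$BC']
print([f"${b:02X}" for b in hi])  # ['$12', '$56', '$9A']
```

Entry n of the original table is then rebuilt from `lo[n]` and `hi[n]`, which is why the index no longer needs to be doubled.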


PostPosted: Fri Sep 16, 2016 11:16 pm
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
Alternative methods I'd tried previously:
  • Making two lists manually.
  • Striped tables generated by external program.
  • Putting all the table entries as macros in a single file, including that file twice with the macro redefined. (Example below.)
Code:
table_lo:
.define TABLE_ENTRY(xxx) .byte <(xxx)
.include "table.s"

table_hi:
.undef TABLE_ENTRY
.define TABLE_ENTRY(xxx) .byte >(xxx)
.include "table.s"
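
For the external-program option, the generator can be very small. The following Python sketch is one possible shape and not the tool that was actually used; the function name `emit_striped`, the `_lo`/`_hi` label suffixes and the eight-values-per-line layout are all assumptions made for illustration.

```python
def emit_striped(label, values, per_line=8):
    """Return ca65 source text with <label>_lo and <label>_hi byte tables."""
    lines = []
    for suffix, shift in (("lo", 0), ("hi", 8)):
        lines.append(f"{label}_{suffix}:")
        stripe = [(v >> shift) & 0xFF for v in values]
        # Emit the stripe as .byte lines, per_line values at a time.
        for i in range(0, len(stripe), per_line):
            chunk = ", ".join(f"${b:02X}" for b in stripe[i:i + per_line])
            lines.append(f"    .byte {chunk}")
    return "\n".join(lines) + "\n"

print(emit_striped("table", [0x1234, 0x5678, 0x9ABC]), end="")
```

A generator like this only works on values that are known before assembly. It cannot split labels, because their addresses are not known to the script.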


PostPosted: Fri Sep 16, 2016 11:54 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
rainwarrior wrote:
Here's what it looks like:

Interesting. I remember seeing something like that straight out of the ca65 documentation, but never thought about using it for larger pieces of data. It's still only useful for word values though.

So far I've been using dynamic symbols to "buffer" values that I want to output separately later, but I use different macros for each type of data, not something generic. I do this for bounding boxes, for example. A macro receives four values and saves them in 4 different symbols (I also save the index of the bounding box in another symbol, BTW, so that I can reference them later using their name):
Code:
   .macro Object_CreateBox _Name, _Top, _Bottom, _Left, _Right
      .ifndef Object::NextBox
         Object::NextBox .set 0
      .endif
      Object::.ident(.sprintf("%s_BOX", .string(_Name))) = Object::NextBox
      Object::.ident(.sprintf("Box%02xTop", Object::NextBox)) = _Top
      Object::.ident(.sprintf("Box%02xBottom", Object::NextBox)) = _Bottom
      Object::.ident(.sprintf("Box%02xLeft", Object::NextBox)) = _Left
      Object::.ident(.sprintf("Box%02xRight", Object::NextBox)) = _Right
      Object::NextBox .set Object::NextBox + 1
   .endmacro

And then I write the bytes to the ROM in the order I want:
Code:
   .macro Object_OutputBoxes _LabelTop, _LabelBottom, _LabelLeft, _LabelRight
      .ident(.string(_LabelTop)):
      .ifdef Object::NextBox
         .repeat Object::NextBox, BoundingBox
            .byte <Object::.ident(.sprintf("Box%02xTop", BoundingBox))
         .endrepeat
      .endif
      .ident(.string(_LabelBottom)):
      .ifdef Object::NextBox
         .repeat Object::NextBox, BoundingBox
            .byte <Object::.ident(.sprintf("Box%02xBottom", BoundingBox))
         .endrepeat
      .endif
      .ident(.string(_LabelLeft)):
      .ifdef Object::NextBox
         .repeat Object::NextBox, BoundingBox
            .byte <Object::.ident(.sprintf("Box%02xLeft", BoundingBox))
         .endrepeat
      .endif
      .ident(.string(_LabelRight)):
      .ifdef Object::NextBox
         .repeat Object::NextBox, BoundingBox
            .byte <Object::.ident(.sprintf("Box%02xRight", BoundingBox))
         .endrepeat
      .endif
   .endmacro

I also do this when registering the different types of objects, using this macro:
Code:
   .macro Object_Register _Name, _InitializationAddress, _LogicAddress, _DrawingAddress
      .ifndef Object::NextType
         Object::NextType .set 1
      .endif
      Object::.ident(.sprintf("%s_TYPE", .string(_Name))) = Object::NextType
      Object::.ident(.sprintf("InitializeType%02x", Object::NextType)) = _InitializationAddress
      Object::.ident(.sprintf("UpdateType%02x", Object::NextType)) = _LogicAddress
      Object::.ident(.sprintf("DrawType%02x", Object::NextType)) = _DrawingAddress
      Object::NextType .set Object::NextType + 1
   .endmacro

And then I use different macros to output the pointers arranged as I need them.

I suppose it's possible to create a generic solution that works something like this, assuming ca65 won't break due to the insane number of symbols.


PostPosted: Sat Sep 17, 2016 12:31 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
tokumaru wrote:
Interesting. I remember seeing something like that straight out of the ca65 documentation, but never thought about using it for larger pieces of data. It's still only useful for word values though.

There is an example in the documentation on how to use .lobytes etc. but the reason I never used it was that it seemed to be limited to a single line define. The missing piece of the puzzle for me was the .linecont feature.

Technically it can be used for 3 byte values (.bankbytes can be used to make a table from the 3rd byte), not just 2. I think support for a 4th byte could be easily added, actually, if someone out there had a need.
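
As a sketch of what the three stripes would contain, here is a Python model. The helper `stripe_bytes` and the sample 24-bit addresses are made up; byte 0 corresponds to .lobytes, byte 1 to .hibytes and byte 2 to .bankbytes.

```python
def stripe_bytes(values, width):
    """Return `width` tables; table k holds byte k (0 = lowest) of each value."""
    return [[(v >> (8 * k)) & 0xFF for v in values] for k in range(width)]

lo, hi, bank = stripe_bytes([0x012345, 0x02ABCD], 3)
print([f"${b:02X}" for b in lo])    # ['$45', '$CD']
print([f"${b:02X}" for b in hi])    # ['$23', '$AB']
print([f"${b:02X}" for b in bank])  # ['$01', '$02']
```

Calling it with `width=4` models the hypothetical fourth-byte table mentioned above.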

I don't think we'd want to extend this particular method beyond 4 bytes. It's just for breaking a single number into bytes, and I think what you really want beyond that is a convenient way to define a structure of arrays (e.g. for your box example, I think you want to specify 4 values, not a single 32-bit one). If there aren't too many objects it might be fine just to define them "horizontally", one column per instance? If there are too many to fit on one line, though, it's a bit of a pain (e.g. the various other solutions discussed, like your macros, an external tool, etc.).


Like, the generic solution is some way to transpose a matrix of code. Exchange rows for columns, so you can have one row per instance, instead of one column per instance. I was digging around in the source trying to think of a way to do this:

1. I could add a "stripe" attribute to a segment, and let the linker just rearrange the whole generated segment into stripes. There's a lot of ugly collateral here (moving symbols around, passing data back via defines, how and when to warn the user against improper use, etc.) but it seemed like the cleanest approach.

2. I looked at doing it at assemble time instead, with a .stripe and .endstripe directive, but this would be a lot of work to pull off. Data at the assembler level comes out in symbolic fragments, it's not byte by byte (e.g. .word will produce 2 byte fragments), so what I was looking at was when .endstripe happens, rewind all those generated fragments and then rebuild them all for each stripe. Aside from all the work dealing with the various types of fragments, I don't think the assembler was really designed to be able to "rewind" once fragments are added; it would take a long time to figure out all the ramifications of doing this, I think.

3. I also thought about having something similar to .word, maybe I'd call it .sword, that stores the low byte where you write it, and temporarily stores the high byte, and then later you put .swordhigh and it automatically emits all the high bytes. This seemed like it would be easier to pull off, but it's so close to the existing .lobytes/.hibytes feature that it seemed pointless once I noticed the .linecont possibility.
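
The transposition being described (one row per instance in the source, one column per field in the output) can be modelled in a few lines of Python. This is only a sketch of the data movement, not of any of the three ca65 designs listed above; the `transpose` helper and the box values are invented for the example.

```python
def transpose(rows):
    """Turn one-row-per-instance data into one-column-per-field tables."""
    if len({len(r) for r in rows}) > 1:
        raise ValueError("all rows must have the same number of fields")
    return [list(col) for col in zip(*rows)]

boxes = [
    # top, bottom, left, right (made-up values)
    (0, 15, 0, 7),
    (2, 13, 1, 6),
    (4, 31, 0, 15),
]
tops, bottoms, lefts, rights = transpose(boxes)
print(tops)     # [0, 2, 4]
print(bottoms)  # [15, 13, 31]
```

Each returned list is what would be emitted under its own label, so `tops[n]` and `rights[n]` both describe box n.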


If it were a feature of ca65, what do you think it should look like?


PostPosted: Sat Sep 17, 2016 12:57 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
Here's a generic way to use symbols for this:
Code:
.macro BufferData _Name, _Field0, _Field1, _Field2, _Field3, _Field4, _Field5, _Field6, _Field7
   .ifndef .ident(.sprintf("Next%sIndex", .string(_Name)))
      .ident(.sprintf("Next%sIndex", .string(_Name))) .set 0
   .endif
   BufferDataRecursive _Name, 0, _Field0, _Field1, _Field2, _Field3, _Field4, _Field5, _Field6, _Field7
   .ident(.sprintf("Next%sIndex", .string(_Name))) .set .ident(.sprintf("Next%sIndex", .string(_Name))) + 1
.endmacro

.macro BufferDataRecursive _Name, _FieldIndex, _Field0, _Field1, _Field2, _Field3, _Field4, _Field5, _Field6, _Field7
   .ifnblank _Field0
      .ident(.sprintf("%s%04x%02x", .string(_Name), .ident(.sprintf("Next%sIndex", .string(_Name))), _FieldIndex)) = _Field0
      BufferDataRecursive _Name, _FieldIndex + 1, _Field1, _Field2, _Field3, _Field4, _Field5, _Field6, _Field7
   .endif
.endmacro

.macro OutputData _Name, _FieldIndex, _ByteIndex
   .repeat .ident(.sprintf("Next%sIndex", .string(_Name))), ItemIndex
      .byte (.ident(.sprintf("%s%04x%02x", .string(_Name), ItemIndex, _FieldIndex)) >> (_ByteIndex * 8)) & $ff
   .endrepeat
.endmacro

With these macros you get to define up to 8 fields per item, each one being 32 bits (if I read the ca65 documentation correctly). When writing the data to the output file, you can select which field and what byte of that field to output.

For example, you could define some properties about your object types:
Code:
   BufferData Objects, UpdatePlayer, DrawPlayer, $10, $00
   BufferData Objects, UpdateEnemy, DrawEnemy, $02, $0a
   BufferData Objects, UpdateItem, DrawItem, $02, $40

And then you could output tables with the individual bytes of each field:
Code:
ObjectsUpdateLow:
   OutputData Objects, 0, 0
ObjectsUpdateHigh:
   OutputData Objects, 0, 1
ObjectsDrawLow:
   OutputData Objects, 1, 0
ObjectsDrawHigh:
   OutputData Objects, 1, 1
ObjectsHealthPoints:
   OutputData Objects, 2, 0
ObjectsWhatever:
   OutputData Objects, 3, 0

And here's an example for metatiles:
Code:
   BufferData Metatiles, $00, $01, $02, $03, %10101010, COLLISION_SOLID
   BufferData Metatiles, $04, $05, $06, $07, %00000000, COLLISION_EMPTY
   BufferData Metatiles, $08, $09, $0a, $0b, %11111111, COLLISION_WATER

Code:
MetatilesTopLeft:
   OutputData Metatiles, 0, 0
MetatilesTopRight:
   OutputData Metatiles, 1, 0
MetatilesBottomLeft:
   OutputData Metatiles, 2, 0
MetatilesBottomRight:
   OutputData Metatiles, 3, 0
MetatilesAttributes:
   OutputData Metatiles, 4, 0
MetatilesCollision:
   OutputData Metatiles, 5, 0
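
To make the field/byte selection easier to follow, here is a rough Python model of what the two macros do. The class name, method names and the sample addresses standing in for UpdatePlayer, DrawPlayer and so on are all assumptions; only the shift-and-mask arithmetic is taken from the .byte line in OutputData.

```python
class Buffer:
    """Rough model of BufferData/OutputData: rows of fields, read back by column."""

    def __init__(self):
        self.items = []

    def buffer_data(self, *fields):
        if len(fields) > 8:
            raise ValueError("the macros accept at most 8 fields per item")
        self.items.append(fields)

    def output_data(self, field_index, byte_index):
        # Same arithmetic as the .byte line in OutputData.
        return bytes((item[field_index] >> (byte_index * 8)) & 0xFF
                     for item in self.items)

objects = Buffer()
objects.buffer_data(0x8123, 0x9456, 0x10, 0x00)  # hypothetical addresses
objects.buffer_data(0x8234, 0x9567, 0x02, 0x0A)
objects.buffer_data(0x8345, 0x9678, 0x02, 0x40)

print(objects.output_data(0, 0).hex())  # 233445  (update routine, low bytes)
print(objects.output_data(0, 1).hex())  # 818283  (update routine, high bytes)
print(objects.output_data(2, 0).hex())  # 100202  (third field, low bytes)
```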


PostPosted: Sat Sep 17, 2016 1:05 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
Yeah, that looks nice.


PostPosted: Sat Sep 17, 2016 1:23 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
rainwarrior wrote:
If it were a feature of ca65, what do you think it should look like?

I'm not sure. For smaller things that I write by hand, like object types and bounding boxes, the macro approach with one entry per line is actually pretty readable and maintainable. The big problem for me is doing this with big chunks of data that I normally .incbin. Maybe .incbin could take some parameters describing how to rearrange the contents of the file?

For example, I once wanted to copy CHR data from the ROM directly to VRAM, and doing it as fast as possible required the use of absolute indexed addressing in an unrolled loop (8 cycles per byte). Since absolute indexed addressing only allows you to access 256 positions from a base address, that wouldn't work for reading tiles from dynamic addresses. The solution was to rearrange the tiles, breaking them up in 256 groups of 4 tiles (64 bytes per group, 16384 bytes total), and use the index register to index groups, instead of bytes. In the ROM, the data had to be stored like: the first byte of every group, the second byte of every group, and so on. The code looked like this:

Code:
   lda $c000, x
   sta $2007
   lda $c100, x
   sta $2007
   ;(...)
   lda $fe00, x
   sta $2007
   lda $ff00, x
   sta $2007

It would be nice if I could have a normal CHR file that the assembler would shuffle around when .incbin'ing it, but I don't know if that would help much. There's still the issue of labeling the individual lists, and I have no idea how I'd want to do that.
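
Until the assembler can do this, the shuffle can be done by an external script before the file is .incbin'ed. The Python sketch below shows one way to produce the layout described above (first byte of every group, then second byte of every group, and so on); the function name is made up, and it assumes the input is a plain CHR dump whose length is a multiple of the group size.

```python
def interleave_groups(data, group_size=64):
    """Reorder so that byte j of group g ends up at offset j * n_groups + g."""
    if len(data) % group_size:
        raise ValueError("data length must be a multiple of group_size")
    n_groups = len(data) // group_size
    out = bytearray(len(data))
    for g in range(n_groups):
        for j in range(group_size):
            out[j * n_groups + g] = data[g * group_size + j]
    return bytes(out)

# Tiny demonstration: three groups of two bytes each.
print(list(interleave_groups(bytes(range(6)), group_size=2)))  # [0, 2, 4, 1, 3, 5]
```

With a 16384-byte file and the default group size there are 256 groups, so byte j of every group lands in its own 256-byte page, which matches the `lda $c000, x` / `lda $c100, x` pattern when the result is placed at $c000.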


PostPosted: Sat Sep 17, 2016 2:12 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
Well, it seems like all of this would generically fall under having random access storage available at assemble time. That's really what your use of .ident + .sprintf is simulating. Loading an .incbin into random access storage would probably be relatively easy if you had that. It would probably cut down on a lot of the overhead too, if that turns out to be a problem.

Maybe something like .array name, size to allocate one to use with .repeat or macros, and .arraybin name, file that does the same but also imports a binary file into it?


PostPosted: Sat Sep 17, 2016 3:42 am
Joined: Mon Jan 03, 2005 10:36 am
Posts: 2963
Location: Tampere, Finland
Another way to achieve the same thing is something like this:

Code:
; Macro to run each token of "v" through "func"
.macro transform v, func
    .repeat .tcount( {v} ), i
        func { .mid( i, 1, {v} ) }
    .endrepeat
.endmacro

.macro lobyte v
    .byte .lobyte( v )
.endmacro

.macro hibyte v
    .byte .hibyte( v )
.endmacro

; NOTE: No commas, since transform processes each token
.define stuff label1 label2 label3

los: transform { stuff }, lobyte
his: transform { stuff }, hibyte

This might seem redundant since .lobytes/.hibytes exists, but the advantage here is that this can be extended to handle arbitrary transformations.

Also note that this macro can't handle arbitrary expressions (e.g. label1+123) in the list, but could be modified to do so (scan the token list looking for commas).



PostPosted: Sat Sep 17, 2016 8:50 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
rainwarrior wrote:
Maybe something like .array name, size to allocate one to use with .repeat or macros, and .arraybin name, file that does the same but also imports a binary file into it?

I'm having a little trouble visualizing how that'd work... Can you write an example?

One feature I would find particularly useful would be to create a block, telling the assembler that it must be divided into N arrays, optionally named Label[0] to Label[N-1]. Something like this:

Code:
.arrays 5 ObjectsUpdateLow, ObjectsUpdateHigh, ObjectsDrawLow, ObjectsDrawHigh, ObjectsHealthPoints
   .word UpdatePlayer, DrawPlayer
   .byte PLAYER_HEALTH
   .word UpdateEnemy, DrawEnemy
   .byte ENEMY_HEALTH
.endarrays

.arrays 256
   .incbin "tiles.chr"
.endarrays

But this still wouldn't be as flexible as the macros, because all the arrays would be together, and bytes would be output in the same order (low to high). I'm currently outputting the updating addresses and the drawing addresses to different banks, actually, so this wouldn't solve that problem.
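
One way to read the proposed `.arrays N` block is as a round-robin split of the bytes emitted inside it: byte 0 goes to array 0, byte 1 to array 1, and byte N starts over at array 0. The Python sketch below models that reading only; the directive does not exist, and the helper name and sample addresses are invented.

```python
def split_arrays(stream, n):
    """Deal a flat byte stream into n arrays, round-robin."""
    if len(stream) % n:
        raise ValueError("stream length must be a multiple of n")
    return [bytes(stream[k::n]) for k in range(n)]

# Two 5-byte records: .word update, .word draw, .byte health (made-up values).
stream = bytes([0x23, 0x81, 0x56, 0x94, 3,
                0x34, 0x82, 0x67, 0x95, 1])
update_lo, update_hi, draw_lo, draw_hi, health = split_arrays(stream, 5)
print(update_lo.hex(), update_hi.hex(), health.hex())  # 2334 8182 0301
```

Because .word emits the low byte first, the five arrays line up with the five labels in the example without any extra bookkeeping.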


PostPosted: Sat Sep 17, 2016 9:10 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
One way to retain all the functionality of the macros would be this:

Code:
.structureofarrays ObjectStructure, 5
   .word UpdatePlayer, DrawPlayer
   .byte PLAYER_HEALTH
   .word UpdateEnemy, DrawEnemy
   .byte ENEMY_HEALTH
.endstructureofarrays

;write each array separately
.writearray ObjectStructure, 0, ObjectsUpdateLow
.writearray ObjectStructure, 1, ObjectsUpdateHigh
.writearray ObjectStructure, 2, ObjectsDrawLow
.writearray ObjectStructure, 3, ObjectsDrawHigh
.writearray ObjectStructure, 4, ObjectsHealthPoints

;write all arrays at once
.writearrays ObjectStructure, ObjectsUpdateLow, ObjectsUpdateHigh, ObjectsDrawLow, ObjectsDrawHigh, ObjectsHealthPoints

It's basically the same thing as the macros, and could probably even be implemented that way (if you use macros instead of .byte and .word, but there's no alternative for .incbin...), still using dynamic symbols to hold the data. Having it built in would certainly improve performance though, and avoid the awkward use of symbols.


PostPosted: Sat Sep 17, 2016 10:26 am
Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
Location: Canada
tokumaru wrote:
I'm having a little trouble visualizing how that'd work... Can you write an example?

I just mean instead of abusing .ident + .sprintf to store arrays of data in the symbol list, you'd have a feature for it.

I'm imagining something like this:
Code:
;
; interface
;


; allocates and/or clears to zero a "macro array" in the specified slot (0-255)
; unallocated slots begin with size 0, out of bounds access will result in an error
; parameters must be constant at assemble time
.marray slot, size

; allocates a macro array to the size of a binary file and fills it with that file's bytes
; (stores 1 byte at each index of the array, even though macro arrays can store 32 bit values)
.mbin slot, filename

; returns the allocated length of a macro array
.mlen(slot)

; stores a value in a macro array (32 bit)
; must be constant at assemble time
.mstore value,slot,index

; fetches a byte from a macro array
; constant at assemble time, equivalent to using a literal
.mload(slot,index)


;
; usage
;


; example 1:
; transposing rows of object structures into a structure of arrays

.marray 0, 256
.marray 1, 256
.marray 2, 256
.marray 3, 256
object_count .set 0

.macro object_row p0, p1, p2, p3
   .mstore p0, 0, object_count
   .mstore p1, 1, object_count
   .mstore p2, 2, object_count
   .mstore p3, 3, object_count
   object_count .set object_count + 1
.endmacro

.macro object_column strip
   .repeat object_count, I
      .byte .mload(strip,I)
   .endrepeat
.endmacro

; define the objects
object_row  1, 5, 9,13
object_row  2, 6,10,14
object_row  3, 7,11,15
object_row  4, 8,12,16

; build the striped table
object_column0: object_column (0)
object_column1: object_column (1)
object_column2: object_column (2)
object_column3: object_column (3)


; example 2:
; automatically converting CHR from 8x8 to 8x16 tile arrangement

chrbin = 4
.mbin chrbin, "tiles.bin"

.segment "CHR"
.repeat .mlen(chrbin), I
   .byte .mload(chrbin, ((I & %100000000) >> 4) | ((I & %10000) << 4) | (I & %11111011101111))
.endrepeat

I think an interface like this would be relatively easy to implement in ca65, more versatile and more internally efficient than the .ident + .sprintf approach.
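
The index expression in example 2 swaps bits 4 and 8 of the byte offset and leaves the rest alone. A quick Python check of that mapping is below; the function names are made up, and the test data assumes a sheet laid out 16 tiles per row with 16 bytes per tile.

```python
def swap_bits_4_and_8(i):
    """Index mapping from the .mload expression in example 2."""
    return ((i & 0b100000000) >> 4) | ((i & 0b10000) << 4) | (i & 0b11111011101111)

def remap_chr(data):
    """Output byte i is taken from input offset swap_bits_4_and_8(i)."""
    return bytes(data[swap_bits_4_and_8(i)] for i in range(len(data)))

# Each byte of the fake sheet holds the number of the tile it belongs to.
sheet = bytes(i // 16 for i in range(512))
out = remap_chr(sheet)
print(out[0], out[16], out[32], out[48])  # 0 16 2 18
```

So output tiles 0 and 1 come from input tiles 0 and 16, i.e. a tile and the one directly below it. Note that the mask only keeps 14 bits, so the expression as written is only valid for files of up to 16384 bytes, and the input length should be a multiple of 512 so that every swapped index stays in range.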


PostPosted: Sat Sep 17, 2016 4:51 pm
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
I see... You'd still use macros to control everything, but you'd have a proper data structure in which to temporarily store things instead of hacking something together using symbols. Sounds good. I would probably still want to create arrays dynamically though.

For now I guess I'm gonna implement a version of my generic macros as part of my "assembler extensions" package, and make use of them to simplify the code that deals with the specific cases.


PostPosted: Sat Sep 17, 2016 5:55 pm
Joined: Mon Feb 07, 2011 12:46 pm
Posts: 930
rainwarrior wrote:
zzo38 mentioned that he'd implemented it in his Unofficial-MagicKit assembler, and I lamented that it wasn't doable in ca65.
Actually, even in Unofficial-MagicKit, macros (or custom output routines) must be used; I added the necessary features to make the macros capable (and added support for custom output routines). If used with INCBIN, a custom output routine must be used because a macro alone won't do in that case. (However, the Unofficial-MagicKit way is much cleaner than how it would currently work with ca65.)

It is good that you can now implement a proper way in ca65 too. Your proposed interface looks like it would make it clean enough, and then it wouldn't be more messy than my own implementation.

tokumaru wrote:
For example, I once wanted to copy CHR data from the ROM directly to VRAM, and doing it as fast as possible required the use of absolute indexed addressing in an unrolled loop (8 cycles per byte). Since absolute indexed addressing only allows you to access 256 positions from a base address, that wouldn't work for reading tiles from dynamic addresses. The solution was to rearrange the tiles, breaking them up in 256 groups of 4 tiles (64 bytes per group, 16384 bytes total), and use the index register to index groups, instead of bytes. In the ROM, the data had to be stored like: the first byte of every group, the second byte of every group, and so on.
This is the kind of thing I have suggested too, I think on the other topic, about name table data (it works for pattern table data too).



PostPosted: Sun Sep 18, 2016 7:08 am
Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
I tried using my macros to create a table with 5000 entries of 5 bytes, and the ROM takes about 5 seconds to assemble here. Doesn't sound so bad if this data is being assembled separately whenever it changes, but having to wait several seconds for each build doesn't sound nice at all.

