Nametable => Attribute table address conversion?

Myask
Posts: 965
Joined: Sat Jul 12, 2014 3:04 pm

Post by Myask »

Has the community already settled on an efficient way to perform this conversion? I'm pretty sure this naïf-approach amalgamation of shifts and carry sets is not the best way.
0010 TTYY YyyX XXxx (nametable address)
to
0010 TT11 11YY YXXX (attribute table address)

Code: Select all

;$palettedex already has the desired palette in low two bits, from figuring out what we're putting there
LDA $highaddressin
ORA #$0C
ASL $lowaddressin ;Y yyXX Xxx0
ROL A ;* ***1 1YYY
ASL $lowaddressin ;y yXXX xx00
ROL $palettedex ;* 0000 0ppy
ASL $lowaddressin ;y XXXx x000 (LSB discard)
ASL $lowaddressin ;X XXxx 0000
ROL A ;* **11 YYYX
ASL $lowaddressin ;X Xxx0 0000
ROL A ;* *11Y YYXX
ASL $lowaddressin ;X xx00 0000
ROL A ;* 11YY YXXX
STA $lowaddressout
ASL $lowaddressin ;x x000 0000
ROL $palettedex ;* 0000 ppyx
LDA $highaddressin
ORA #3 ;0010 TT11
STA $highaddressout
Also, what about the lookup from the a1,a6 bits to the bit-pair of the attribute byte they select? For speed, a 16-entry LUT (with the two palette bits also in the index) seems best...

Code: Select all

.db $00, $00, $00, $00, $01, $04, $10, $40, $02, $08, $20, $80, $03, $0C, $30, $C0
Though, on thinking, one could replace the "ROL $palettedex"es with

Code: Select all

BCC +  ;y1
ASL $palettedex
ASL $palettedex
ASL $palettedex
ASL $palettedex
+:
;and
BCC + ;x1
ASL $palettedex
ASL $palettedex
+:
in the address conversion routine, for a slower but smaller way to accomplish the same thing.
Last edited by Myask on Tue Feb 17, 2015 8:53 pm, edited 2 times in total.
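
The 16-entry table in the post above can be regenerated rather than typed by hand. A minimal Python sketch, assuming the index is laid out as ppyx (palette in bits 3-2, y1 in bit 1, x1 in bit 0), which is how `$palettedex` ends up after the two ROLs; the function and variable names are made up for illustration:

```python
def attribute_bits(index):
    """Return the palette bits shifted into their quadrant of an attribute byte.

    index is assumed to be laid out as ppyx: palette in bits 3-2,
    y1 (address bit 6) in bit 1, x1 (address bit 1) in bit 0.
    """
    palette = (index >> 2) & 0x03
    quadrant = index & 0x03   # 0=top-left, 1=top-right, 2=bottom-left, 3=bottom-right
    return palette << (quadrant * 2)

# Rebuild the whole lookup table and print it in .db-friendly form.
TABLE = [attribute_bits(i) for i in range(16)]
print(", ".join(f"${b:02X}" for b in TABLE))
```

The first four entries are zero because palette 0 contributes no set bits in any quadrant.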
lidnariq
Posts: 11432
Joined: Sun Apr 13, 2008 11:12 am

Post by lidnariq »

Only thought: for speed, at least, think about restructuring things to avoid the RMW-on-memory instructions; they're enough slower that you may find the overhead of temporarily loading and storing A to be preferable.
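
To put rough numbers on that, here is a small tally of the routine in the first post, with Python used only as a calculator. It assumes zero-page operands and the standard documented 6502 timings; the dictionary and function names are made up for illustration:

```python
# Documented 6502 cycle counts for the addressing modes used here
# (zero-page operands, no page crossings).
CYCLES = {
    "load/store zp": 3,   # LDA zp, STA zp
    "immediate": 2,       # ORA #imm
    "shift A": 2,         # ROL A
    "shift zp": 5,        # ASL zp, ROL zp (read-modify-write)
}

# Instruction mix of the routine in the first post.
ORIGINAL = {"load/store zp": 4, "immediate": 2, "shift A": 4, "shift zp": 9}

def total_cycles(mix):
    """Sum the cycles for a {kind: count} instruction mix."""
    return sum(CYCLES[kind] * count for kind, count in mix.items())

def shifts_in_memory(n):
    """Cost of n shifts done directly on a zero-page byte."""
    return CYCLES["shift zp"] * n

def shifts_in_a(n):
    """Cost of LDA zp, n shifts in A, then STA zp."""
    return CYCLES["load/store zp"] * 2 + CYCLES["shift A"] * n

rmw_cycles = shifts_in_memory(ORIGINAL["shift zp"])
print(total_cycles(ORIGINAL), rmw_cycles)
print([(n, shifts_in_memory(n), shifts_in_a(n)) for n in range(1, 5)])
```

Under these assumptions the load/store overhead breaks even at two shifts on the same byte and wins from three onward.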
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

I just don't do this conversion, ever. What I do is convert a set of coordinates (9-bit X, 9-bit Y) into NT and AT addresses, but I really don't see the point in converting one address to another.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

I don't have 6502 code to convert, but a bitwise conversion is outlined at the skinny scrolling wiki article.

Code: Select all

 tile address      = 0x2000 | (v & 0x0FFF)
 attribute address = 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)
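
The formula above is easy to try out directly. A minimal Python sketch: `attribute_address` is the expression as quoted, while `attribute_shift` is an addition not taken from the quoted text, derived from the fact that bit 1 of tile Y is address bit 6 and bit 1 of tile X is address bit 1. Both function names are made up for illustration:

```python
def attribute_address(v):
    """Attribute table address for the tile at nametable address v."""
    return 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)

def attribute_shift(v):
    """Bit position (0, 2, 4 or 6) of the tile's palette pair in that byte.

    Address bit 6 (tile Y bit 1) selects top/bottom, address bit 1
    (tile X bit 1) selects left/right.
    """
    return ((v >> 4) & 0x04) | (v & 0x02)

for v in (0x2000, 0x2042, 0x23BF, 0x2C84):
    print(f"${v:04X} -> ${attribute_address(v):04X}, shift {attribute_shift(v)}")
```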
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

So... this would be my naive approach:

Code: Select all

; tile address in:
; v0 = H7 H6 H5 H4 H3 H2 H1 H0
; v1 = L7 L6 L5 L4 L3 L2 L1 L0

; attribute address out:
; a0 = 0  0  1  0  H3 H2 1  1   ($23 | calculations)
; a1 = 1  1  H1 H0 L7 L4 L3 L2  ($C0 | calculations)

lda v0
ror
ror
ror
ror
ror
and #$30
sta a1     ; a1 = 0  0  H1 H0 0  0  0  0
lda v1
lsr
lsr
pha
and #$07
ora a1
sta a1     ; a1 = 0  0  H1 H0 0  L4 L3 L2
pla
lsr
lsr
and #$08
ora a1     ; A  = 0  0  H1 H0 L7 L4 L3 L2
ora #$C0
sta a1     ; a1 = 1  1  H1 H0 L7 L4 L3 L2
lda v0
and #$0C
ora #$23
sta a0     ; a0 = 0  0  1  0  H3 H2 1  1  
61 cycles.
Last edited by rainwarrior on Tue Feb 17, 2015 9:34 pm, edited 1 time in total.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

Or, yeah, if I just modify the address in-place, I think the following would be 53 cycles.

Code: Select all

; v0:v1 = address in
; t     = temporary

lda v0    ; A:v1 = v
lsr
ror v1
lsr
ror v1    ; A:v1 = v >> 2
lda v1
and #$07
sta t     ; t  = (v >> 2) & $07
lda v1    ; v1 = (v >> 2) & $FF
lsr
lsr
and #$38  ; A  = (v >> 4) & $38
ora t
ora #$C0
sta v1    ; v1 = ((v >> 2) & $07) | ((v >> 4) & $38) | #$C0
lda v0
and #$0C
ora #$23
sta v0    ; v0 = ((v & $0C00) | $2300) >> 8
Do you need to do this more than once per frame? What is the use case that your original post wasn't efficient enough for?

Another alternative: (49 cycles)

Code: Select all

lda v1
rol       ; C = L7
lda v0
pha
and #$0C
ora #$23
sta v0    ; A = 0  0  1  0  H3 H2 1  1  
pla
ror       ; A = L7 .  .  .  .  .  .  H1, C = H0
ror
ror       ; A = H1 H0 L7 .  .  .  .  .
ror
ror
and #$38
sta t     ; t = .  .  H1 H0 L7 .  .  .
lda v1
lsr
lsr
and #$07  ; A = .  .  .  .  .  L4 L3 L2
ora t
ora #$C0  ; attribute low byte always has its top two bits set
sta v1    ; v1 = 1  1  H1 H0 L7 L4 L3 L2
Pardon me coding out loud. I'm just enjoying the exercise.
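
The rotate-through-carry trick in that last routine can be checked by brute force. A minimal Python model, assuming only the documented behaviour of ROL/ROR on the accumulator; the helper names are made up for illustration:

```python
def ror(a, c):
    """6502 ROR A: rotate right through carry. Returns (new A, new carry)."""
    return (c << 7) | (a >> 1), a & 1

def rotate_trick(h, l):
    """Model `lda v1 / rol` (C = L7), then `lda v0` and five `ror`s."""
    c = l >> 7                 # rol moves bit 7 of the low byte into carry
    a = h                      # the loads and and/ora in between leave carry alone
    for _ in range(5):
        a, c = ror(a, c)
    return a & 0x38            # and #$38

# Exhaustive check against the shift-and-mask form of the same field.
for h in range(0x20, 0x30):
    for l in range(256):
        v = (h << 8) | l
        assert rotate_trick(h, l) == (v >> 4) & 0x38
print("rotate trick matches (v >> 4) & $38")
```

Five rotations through a 9-bit ring (A plus carry) are equivalent to four rotations the other way, which is why H1, H0 and L7 end up adjacent in bits 5-3.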
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

Oh, sorry, I totally missed that you needed the palette index too. Modified that last routine, which is now 63 cycles.

Code: Select all

lda v1
rol
rol       ; A = .  .  .  .  .  .  .  L7, C = L6
rol p     ; p = 0  0  0  0  0  P1 P0 L6
ror       ; C = L7
lda v0
pha
and #$0C
ora #$23
sta v0    ; A = 0  0  1  0  H3 H2 1  1
pla
ror       ; A = L7 .  .  .  .  .  .  H1, C = H0
ror
ror       ; A = H1 H0 L7 .  .  .  .  .
ror
ror
and #$38
sta t     ; t = .  .  H1 H0 L7 .  .  .
lda v1
lsr       ; C = L0
rol p     ; p = 0  0  0  0  P1 P0 L6 L0
lsr
and #$07  ; A = .  .  .  .  .  L4 L3 L2
ora t
ora #$C0  ; attribute low byte always has its top two bits set
sta v1    ; v1 = 1  1  H1 H0 L7 L4 L3 L2
I think your original was 68 cycles, so this isn't much of an improvement anyway. Sorry.
Myask
Posts: 965
Joined: Sat Jul 12, 2014 3:04 pm

Post by Myask »

Code: Select all

my first
LDA/STA zp  : 3c x 4 = 12c    2b x 4 =  8b
ORA #       : 2c x 2 =  4c    2b x 2 =  4b
ROL A       : 2c x 4 =  8c    1b x 4 =  4b
ASL/ROL zp  : 5c x 9 = 45c    2b x 9 = 18b
                       69 cycles, 34 bytes
vs (your first)
LDA/STA zp  : 4c x 6 = 24c    2b x 6 = 12b
PHA/PLA     : 2c x 4 =  8c    1b x 4 =  4b
ROR/LSR A   : 2c x 9 = 18c    1b x 9 =  9b
AND/ORA #i  : 2c x 6 = 12c    2b x 6 = 12b
AND/ORA zp  : 3c x 2 =  6c    2b x 2 =  4b
                       68 cycles, 41 bytes (66/40 rotating faster)
vs (your second)
LDA/STA zp  : 4c x 7 = 28c    2b x 7 = 14b
LSR/ROR A   : 2c x 4 =  8c    1b x 4 =  4b
ROR/LSR zp  : 5c x 2 = 10c    2b x 2 =  4b
AND/ORA #i  : 2c x 5 = 10c    2b x 5 = 10b
ORA zp      : 3c x 1 =  3c    2b x 1 =  2b
                       59 cycles, 34 bytes
Absolute commands are a byte and cycle bigger than zp: (47b,82c) vs (48b,74c) vs (44b,69c) making your [first] code better in both metrics.
(pre-post edit:)
(34b 52c) -> 42b 62c for non-zp
(32b 49c) -> 37b 54c if our address is not zp. (like t wouldn't be.)
rainwarrior wrote: Pardon me coding out loud. I'm just enjoying the exercise.
Certainly.

(Simple optimization to your first: ASL A x 4 instead of ROR A x 5, since you're already ANDing off the unused bits and you don't have to put everything through the carry.)
tokumaru wrote:What I do is convert a set of coordinates (9-bit X, 9-bit Y) into NT and AT addresses
An efficient way to do that would be just as pertinent here...though one wonders why 9-bit and not 6 if we're dealing with tiles. (self-reply: Sprites vs BG, duh.) Of course, that makes the NT address 0010 YXyy yyyx xxxx, a bit annoying... (that is, Y8, X8, Y7-3, X7-3)
rainwarrior wrote:Do you need to do this more than once per frame? What is use case that your original post wasn't efficient enough for?
I was mainly curious if people had code-golfed it down to a gold-standard, since it seemed like a thing that every program ever needed to do. (Though on reflection I suppose one only needs to do it once per strip pushed to the PPU, breaking strips on nametable boundaries...)

My code is still in the theory stage. >>; I suspect going further would be prudent before we get too engaged in Virtua Code Golf.

Code: Select all

convert-coord-to-NT:
LDA $obj-hiY
ASL A
ORA $obj-hiX ;maybe store hiXY together and use BCC+XOR to change each
             ; to avoid carry issues?
             ;Nah, makes collision sizes other than 'equality' hard.
ORA #$08     ;0000 10YX
STA $high-addr
LDA $obj-loY
ASL A
ROL $high-addr
ASL A
ROL $high-addr ;0010 YXyy
AND #$E0     ;immediate mask, keeps yyy0 0000
STA $low-addr
LDA $obj-loX
LSR A
LSR A
LSR A
ORA $low-addr
STA $low-addr

convert-coord-to-AT:
LDA $obj-hiY ;though I suspect these might get ,x
ASL A
ORA $obj-hiX
ASL A
ASL A
ORA #$23     ;0010 YX11
STA $high-addr
LDA $obj-loX
ASL A
ROL $low-addr
ASL A
ROL $low-addr
ASL A
ROL $low-addr
LDA #7
AND $low-addr ;0000 0xxx
STA $low-addr
LDA $obj-loY
AND #$E0
LSR A
LSR A        ;00yy y000
ORA #$C0
ORA $low-addr
STA $low-addr
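
For checking routines like the two above, the same coordinate-to-address mapping can be written as plain bit operations. A minimal Python sketch, assuming 9-bit pixel coordinates laid out as described in this post (Y8 and X8 select the nametable, the low 8 bits of Y stay below 240); the function names are made up for illustration:

```python
def nametable_address(x, y):
    """Tile address 0010 YXyy yyyx xxxx from 9-bit pixel coordinates."""
    return (0x2000
            | ((y & 0x100) << 3)      # Y8   -> address bit 11
            | ((x & 0x100) << 2)      # X8   -> address bit 10
            | ((y & 0xF8) << 2)       # Y7-3 -> address bits 9-5
            | ((x & 0xF8) >> 3))      # X7-3 -> address bits 4-0

def attribute_address(x, y):
    """Attribute address 0010 YX11 11yy yxxx from the same coordinates."""
    return (0x23C0
            | ((y & 0x100) << 3)      # Y8   -> address bit 11
            | ((x & 0x100) << 2)      # X8   -> address bit 10
            | ((y & 0xE0) >> 2)       # Y7-5 -> address bits 5-3
            | ((x & 0xE0) >> 5))      # X7-5 -> address bits 2-0

for x, y in ((0, 0), (264, 16), (255, 239)):
    print(f"({x},{y}) -> NT ${nametable_address(x, y):04X}, "
          f"AT ${attribute_address(x, y):04X}")
```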
rainwarrior wrote:(palette-index too)
But shouldn't p be getting L6 L1? The LSBs of X/Y (L0, L5) for tiles are ignored within the same 16x16 palette block. L6/L1 (bit 1 of tile Y and of tile X) select which pair of bits you need to set for that 16x16.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

As for your palette lookup conversion, comparing size and speed for what you originally posted:

inline version: +12 bytes
cycles added: -4, 5, 15, 17 (8.25 average case)

lookup version: +19 bytes (16 byte table, LDA ABS X)
cycles added: 4

This is presuming LDA ZP (if inline) vs LDX ZP, LDA ABS X (lookup) when you go to use the thing.

If you were using my last version of the code, the inline version is incompatible, I think.


However, if your goal is to do this many times per frame, optimizing these routines probably doesn't help much unless you need kinda random access to the tiles. If you're trying to update a contiguous row of tiles, you only need to do the calculation once at the start of the row, and then the thing to optimize is probably the continuation loop afterwards?
Last edited by rainwarrior on Tue Feb 17, 2015 10:30 pm, edited 1 time in total.
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

Oops! Yes, I misread what was going on with the palette bits. A quick revision, no change in speed.

Code: Select all

lda v1
rol       ; C = L7
rol       ; A = .  .  .  .  .  .  .  L7, C = L6
rol p     ; p = 0  0  0  0  0  P1 P0 L6
ror       ; C = L7
lda v0
pha
and #$0C
ora #$23
sta v0    ; A = 0  0  1  0  H3 H2 1  1
pla
ror       ; A = L7 .  .  .  .  .  .  H1, C = H0
ror
ror       ; A = H1 H0 L7 .  .  .  .  .
ror
ror
and #$38
sta t     ; t = .  .  H1 H0 L7 .  .  .
lda v1
lsr
lsr       ; C = L1
rol p     ; p = 0  0  0  0  P1 P0 L6 L1
and #$07  ; A = .  .  .  .  .  L4 L3 L2
ora t
ora #$C0  ; attribute low byte always has its top two bits set
sta v1    ; v1 = 1  1  H1 H0 L7 L4 L3 L2
Myask
Posts: 965
Joined: Sat Jul 12, 2014 3:04 pm

Post by Myask »

Use of X0/Y0 (your L0/L5) [technically, X3, Y3 as fine discard etc. etc.] in palettes for ExGrafx is possible but presently beyond me. Though, that would remove the need for the shift-table as they appear to always be in the uppermost two bits in ExGrafx anyway...and...one writes it DURING rendering? Weird.

And yes, we don't need to recalculate along a line; so long as we've broken it between tables, one can just perform simple additions to the attribute address of whatever we're loading up before we send to PPU...which increments by 1 per four horizontal tiles or 8 per four vertical.
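
The stepping described above (+1 per four tiles horizontally, +8 per four tiles vertically) can be sketched as follows. This is a Python illustration only, assuming the strip stays inside one nametable row; `walk_row` is a made-up name:

```python
def attribute_address(v):
    """Attribute table address for the tile at nametable address v."""
    return 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)

def walk_row(v, tiles):
    """Attribute addresses touched by `tiles` consecutive tiles starting at v.

    The address is converted once; afterwards it is bumped by 1 each time
    the tile column crosses a multiple of four. Assumes the strip does not
    run past the end of its nametable row.
    """
    attr = attribute_address(v)
    touched = [attr]
    for i in range(1, tiles):
        if (v + i) & 0x03 == 0:      # entered the next 4-tile column group
            attr += 1
            touched.append(attr)
    return touched

print([f"${a:04X}" for a in walk_row(0x2082, 10)])
```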

Is it common practice to store a copy of attribute tables in CPU address space, so one doesn't have to re-read them to mask off/on palette bitpairs?
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada

Post by rainwarrior »

Yes, it's extremely common to keep attributes in RAM for updating. Reading back through $2007 probably isn't a good approach unless you're just trying to update one or two tiles in a frame.
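
As an illustration of that approach, here is a minimal Python sketch of a RAM-side attribute copy with a mask-and-merge update. It assumes a single nametable at $2000 and tile coordinates in the 32x30 grid; the class and method names are made up for illustration:

```python
class AttributeShadow:
    """CPU-side copy of one nametable's 64 attribute bytes."""

    def __init__(self):
        self.bytes = bytearray(64)

    def set_palette(self, tile_x, tile_y, palette):
        """Set the palette (0-3) of the 16x16 area containing a tile.

        Returns (PPU address, new byte) so the caller can queue one write
        instead of reading the old value back through $2007.
        Assumes nametable 0 at $2000.
        """
        index = (tile_y >> 2) * 8 + (tile_x >> 2)
        shift = ((tile_y & 2) << 1) | (tile_x & 2)
        mask = 0x03 << shift
        value = (self.bytes[index] & ~mask & 0xFF) | ((palette & 0x03) << shift)
        self.bytes[index] = value
        return 0x23C0 + index, value

shadow = AttributeShadow()
addr, value = shadow.set_palette(2, 0, 1)
print(f"write ${value:02X} to ${addr:04X}")
```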
Bregalad
Posts: 8056
Joined: Fri Nov 12, 2004 2:49 pm
Location: Divonne-les-bains, France

Post by Bregalad »

Personally, I index by 4x4 or 2x2 metatiles and "convert" the indices to AT or NT addresses using lookup tables. Converting them with shifts is always possible, but it's annoying and impractical with the 6502 instruction set.

For example, in the case of vertical mirroring it'd look like this:

Code: Select all

NTLookupH_LSB
    .db $00, $04, $08, $0C, $10, ....

NTLookupH_MSB
    .db $20, $20, $20, $20, ....., $24, $24, $24, $24, .....

NTLookupV_LSB
    .db $00, $40, $80, $c0, $00, .......

NTLookupV_MSB
     .db $20, $20, $20, $20, $21, ....

ATLookupH_LSB
     .db $00, $01, $02, $03, $04, ....  ; This table can be optimized out

ATLookupH_MSB
     .db $23, $23 ,$23, $23, $23, $23, ....., $27, $27, $27   ; This table can be optimized out (re-using NTLookupH_MSB and OR with #$03)

ATLookupV_LSB
      .db $00, $08, $10, $18, $20, ...

ATLookupV_MSB
       .db $23 ,$23, $23, $23, $23        ; This table should be optimized out (i.e. constant byte)
In the end, since I optimize out half of the tables, it becomes a mix of "traditional" shifting/compare and lookup tables, using the best of both.

To compute an NT or AT address, just OR together the values that the horizontal and vertical tables give for the corresponding horizontal and vertical index.
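
A minimal Python sketch of that table scheme, under assumptions that go beyond the post: both axes are indexed in 4x4-tile (32x32 pixel) blocks, the two nametables sit side by side at $2000 and $2400, and the constant $C0 is ORed in by the lookup function rather than stored in the tables. All names are made up for illustration:

```python
# Assumed layout: indices count 32x32-pixel (4x4 tile) blocks, with two
# nametables side by side at $2000 and $2400 (vertical mirroring).
COLS, ROWS = 16, 8

nt_h_lsb = [(col % 8) * 4 for col in range(COLS)]
nt_h_msb = [0x20 | ((col // 8) << 2) for col in range(COLS)]
nt_v_lsb = [(row * 0x80) & 0xFF for row in range(ROWS)]
nt_v_msb = [0x20 | ((row * 0x80) >> 8) for row in range(ROWS)]

at_h_lsb = [col % 8 for col in range(COLS)]
at_v_lsb = [row * 8 for row in range(ROWS)]

def nt_address(col, row):
    """OR the horizontal and vertical table entries, byte by byte."""
    lo = nt_h_lsb[col] | nt_v_lsb[row]
    hi = nt_h_msb[col] | nt_v_msb[row]
    return (hi << 8) | lo

def at_address(col, row):
    """Same idea for attributes; the high byte reuses the NT table OR $03."""
    lo = at_h_lsb[col] | at_v_lsb[row] | 0xC0
    hi = nt_h_msb[col] | 0x03
    return (hi << 8) | lo

print(f"${nt_address(9, 3):04X} ${at_address(9, 3):04X}")
```

ORing works here because the horizontal and vertical entries never have set bits in common, apart from the shared $20 in the high bytes.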
tokumaru
Posts: 12427
Joined: Sat Feb 12, 2005 9:43 pm
Location: Rio de Janeiro - Brazil

Post by tokumaru »

Myask wrote:
tokumaru wrote:What I do is convert a set of coordinates (9-bit X, 9-bit Y) into NT and AT addresses
An efficient way to do that would be just as pertinent here...though one wonders why 9-bit and not 6 if we're dealing with tiles. (self-reply: Sprites vs BG, duh.)
I don't have my code with me right now, but I don't think it does anything particularly clever.

I use 9-bit coordinates because of the camera, since most of the time tiles are drawn around the edges of the camera as it scrolls through the level.

Now that I think of it, I don't think I ever needed to calculate the address of a "random" AT byte, because I always buffered the attribute tables in RAM and updated entire rows or columns of them at a time (it was actually faster than updating only the part that was on screen).

I would eventually need it when I implemented removable background objects (although some games, such as Somari, just draw blank tiles when items are collected, so there's no need to modify the attributes), but I didn't get that far.