nesdev.com
https://forums.nesdev.com/

Nametable => Attribute table address conversion?
https://forums.nesdev.com/viewtopic.php?f=10&t=12398

Author:  Myask [ Tue Feb 17, 2015 4:51 pm ]
Post subject:  Nametable => Attribute table address conversion?

Is there an efficient way the community has already coded to perform this conversion? I'm pretty sure this naive amalgamation of shifts and carry sets is not the best way.
0010 TTYY YyyX XXxx (nametable byte)
to
0010 TT11 11YY YXXX (attr. table byte)
Code:
;$palettedex already has the desired palette in low two bits, from figuring out what we're putting there
LDA $highaddressin
ORA #$0C
ASL $lowaddressin ;Y yyXX Xxx0
ROL A ;* ***1 1YYY
ASL $lowaddressin ;y yXXX xx00
ROL $palettedex ;* 0000 0ppy
ASL $lowaddressin ;y XXXx x000 (LSB discard)
ASL $lowaddressin ;X XXxx 0000
ROL A ;* **11 YYYX
ASL $lowaddressin ;X Xxx0 0000
ROL A ;* *11Y YYXX
ASL $lowaddressin ;X xx00 0000
ROL A ;* 11YY YXXX
STA $lowaddressout
ASL $lowaddressin ;x x000 0000
ROL $palettedex ;* 0000 ppyx
LDA $highaddressin
ORA #3 ;0010 TT11
STA $highaddressout

Also, the lookup from the a1,a6 bits for which bit-pair of the attribute byte? It seems like for this, a 16-entry LUT (with the two palette bits also in the index) for speed would be best...
Code:
.db $00, $00, $00, $00, $01, $04, $10, $40, $02, $08, $20, $80, $03, $0C, $30, $C0
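
As a cross-check of that table, here is a minimal Python sketch (not 6502 code) that regenerates it. It assumes the index is laid out as `ppyx`: the two palette bits first, then the Y and X quadrant bits (address bits a6 and a1). The function name is made up for illustration.

```python
def attribute_lut():
    """Build the 16-entry table: index = p1 p0 y x, value = palette bits
    shifted into the bit pair that belongs to that quadrant."""
    table = []
    for index in range(16):
        palette = index >> 2          # upper two index bits: palette 0-3
        y = (index >> 1) & 1          # 1 = bottom half of the 32x32 block
        x = index & 1                 # 1 = right half of the 32x32 block
        shift = y * 4 + x * 2         # bit pairs: TL=0, TR=2, BL=4, BR=6
        table.append(palette << shift)
    return table

LUT = attribute_lut()
print(", ".join("$%02X" % value for value in LUT))
```

The generated values match the `.db` line above, so the table can be built at assembly time or checked this way rather than typed by hand.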

Though, on thinking, one could replace the "ROL $palettedex"es with
Code:
BCC +  ;y1
ASL $palettedex
ASL $palettedex
ASL $palettedex
ASL $palettedex
+:
;and
BCC + ;x1
ASL $palettedex
ASL $palettedex
+:

in the address conversion routine, for a slower but smaller way to accomplish the same thing.

Author:  lidnariq [ Tue Feb 17, 2015 5:52 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

Only thought: for speed, at least, think about restructuring things to avoid the RMW-on-memory instructions; they're enough slower that you may find the overhead of temporarily loading and storing A to be preferable.

Author:  tokumaru [ Tue Feb 17, 2015 7:10 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

I just don't do this conversion, ever. What I do is convert a set of coordinates (9-bit X, 9-bit Y) into NT and AT addresses, but I really don't see the point in converting one address to another.

Author:  rainwarrior [ Tue Feb 17, 2015 8:33 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

I don't have 6502 code to convert, but a bitwise conversion is outlined at the skinny scrolling wiki article.

Code:
 tile address      = 0x2000 | (v & 0x0FFF)
 attribute address = 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)
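
For anyone who wants to check a 6502 routine against that formula, here is a small Python sketch of it. `tile_to_attribute` is the formula as written above; `quadrant_shift` is an addition that is not part of the quoted formula: it picks the bit pair inside the attribute byte from address bits 6 and 1. Both function names are hypothetical.

```python
def tile_to_attribute(v):
    """Map a nametable tile address (or PPU v value) to the address of
    the attribute byte that covers that tile."""
    return 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)

def quadrant_shift(v):
    """Bit position (0, 2, 4 or 6) of the tile's bit pair in that byte.
    Address bit 6 is coarse Y bit 1, address bit 1 is coarse X bit 1."""
    return ((v >> 4) & 0x04) | (v & 0x02)

# Tile at coarse X = 4, coarse Y = 4 in the first nametable:
print(hex(tile_to_attribute(0x2084)), quadrant_shift(0x2084))
```

The first and last tiles of nametable 0 ($2000 and $23BF) should map to $23C0 and $23FF, which is a quick sanity test for any implementation.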

Author:  rainwarrior [ Tue Feb 17, 2015 8:49 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

So... this would be my naive approach:

Code:
; tile address in:
; v0 = H7 H6 H5 H4 H3 H2 H1 H0
; v1 = L7 L6 L5 L4 L3 L2 L1 L0

; attribute address out:
; a0 = 0  0  1  0  H3 H2 1  1   ($23 | calculations)
; a1 = 1  1  H1 H0 L7 L4 L3 L2  ($C0 | calculations)

lda v0
ror
ror
ror
ror
ror
and #$30
sta a1     ; a1 = 0  0  H1 H0 0  0  0  0
lda v1
lsr
lsr
pha
and #$07
ora a1
sta a1     ; a1 = 0  0  H1 H0 0  L4 L3 L2
pla
lsr
lsr
and #$08
ora a1     ; A  = 0  0  H1 H0 L7 L4 L3 L2
ora #$C0
sta a1     ; a1 = 1  1  H1 H0 L7 L4 L3 L2
lda v0
and #$0C
ora #$23
sta a0     ; a0 = 0  0  1  0  H3 H2 1  1 

61 cycles.

Author:  rainwarrior [ Tue Feb 17, 2015 9:12 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

Or, yeah, if I just modify the address in-place, I think the following would be 53 cycles.

Code:
; v0:v1 = address in
; t     = temporary

lda v0    ; A:v1 = v
lsr
ror v1
lsr
ror v1    ; A:v1 = v >> 2
lda v1
and #$07
sta t     ; t  = (v >> 2) & $07
lda v1    ; v1 = (v >> 2) & $FF
lsr
lsr
and #$38  ; A  = (v >> 4) & $38
ora t
ora #$C0
sta v1    ; v1 = ((v >> 2) & $07) | ((v >> 4) & $38) | #$C0
lda v0
and #$0C
ora #$23
sta v0    ; v0 = ((v & $0C00) | $2300) >> 8
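
To convince yourself that the in-place routine matches the bitwise formula, the steps can be mirrored in Python. This is only a sketch: `lsr` and `ror` are hypothetical helpers that model the 6502 shift instructions with an explicit carry, and `reference` is the wiki formula quoted earlier in the thread.

```python
def lsr(a):
    """6502 LSR A: shift right, return (result, carry_out)."""
    return a >> 1, a & 1

def ror(m, carry):
    """6502 ROR: rotate right through carry, return (result, carry_out)."""
    return (carry << 7) | (m >> 1), m & 1

def convert_in_place(v0, v1):
    """Mirror of the in-place routine: v0 = high byte, v1 = low byte."""
    a, c = lsr(v0)
    v1, c = ror(v1, c)
    a, c = lsr(a)
    v1, c = ror(v1, c)            # A:v1 = v >> 2
    t = v1 & 0x07                 # (v >> 2) & $07
    a = (v1 >> 2) & 0x38          # (v >> 4) & $38
    a |= t | 0xC0
    return (v0 & 0x0C) | 0x23, a  # (high, low) bytes of the attribute address

def reference(v):
    """The bitwise formula from the wiki, used as the expected result."""
    return 0x23C0 | (v & 0x0C00) | ((v >> 4) & 0x38) | ((v >> 2) & 0x07)

mismatches = [v for v in range(0x2000, 0x3000)
              if convert_in_place(v >> 8, v & 0xFF)
              != (reference(v) >> 8, reference(v) & 0xFF)]
print("mismatches:", len(mismatches))
```

Running the comparison over every address from $2000 to $2FFF is cheap, so the same harness can be reused for the other variants in this thread.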


Do you need to do this more than once per frame? What is the use case that your original post wasn't efficient enough for?

Another alternative: (49 cycles)
Code:
lda v1
rol       ; C = L7
lda v0
pha
and #$0C
ora #$23
sta v0    ; A = 0  0  1  0  H3 H2 1  1 
pla
ror       ; A = L7 .  .  .  .  .  .  H1, C = H0
ror
ror       ; A = H1 H0 L7 .  .  .  .  .
ror
ror
and #$38
sta t     ; t = .  .  H1 H0 L7 .  .  .
lda v1
lsr
lsr
and #$07  ; A = .  .  .  .  .  L4 L3 L2
ora t
ora #$C0  ; A = 1  1  H1 H0 L7 L4 L3 L2
sta v1


Pardon me coding out loud. I'm just enjoying the exercise.

Author:  rainwarrior [ Tue Feb 17, 2015 9:51 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

Oh, sorry, I totally missed that you needed the palette index too. Modified that last routine, which is now 63 cycles.

Code:
lda v1
rol
rol       ; A = .  .  .  .  .  .  .  L7, C = L6
rol p     ; p = 0  0  0  0  0  P1 P0 L6
ror       ; C = L7
lda v0
pha
and #$0C
ora #$23
sta v0    ; A = 0  0  1  0  H3 H2 1  1
pla
ror       ; A = L7 .  .  .  .  .  .  H1, C = H0
ror
ror       ; A = H1 H0 L7 .  .  .  .  .
ror
ror
and #$38
sta t     ; t = .  .  H1 H0 L7 .  .  .
lda v1
lsr       ; C = L0
rol p     ; p = 0  0  0  0  P1 P0 L6 L0
lsr
and #$07  ; A = .  .  .  .  .  L4 L3 L2
ora t
ora #$C0  ; A = 1  1  H1 H0 L7 L4 L3 L2
sta v1


I think your original was 68 cycles, so this isn't much of an improvement anyway. Sorry.

Author:  Myask [ Tue Feb 17, 2015 10:04 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

Code:
my first
LDA/STA zp   : 3c x 4 = 12c      2b x 4 =  8b
ORA #i       : 2c x 2 =  4c      2b x 2 =  4b
ROL A        : 2c x 4 =  8c      1b x 4 =  4b
ASL/ROL zp   : 5c x 9 = 45c      2b x 9 = 18b
               69 cycles, 34 bytes
vs (your first)
LDA/STA zp   : 4c x 6 = 24c      2b x 6 = 12b
PHA/PLA      : 2c x 4 =  8c      1b x 4 =  4b
ROR/LSR A    : 2c x 9 = 18c      1b x 9 =  9b
AND/ORA #i   : 2c x 6 = 12c      2b x 6 = 12b
AND/ORA zp   : 3c x 2 =  6c      2b x 2 =  4b
               68 cycles, 41 bytes (66/40 rotating faster)
vs (your second)
LDA/STA zp   : 4c x 7 = 28c      2b x 7 = 14b
LSR/ROR A    : 2c x 4 =  8c      1b x 4 =  4b
ROR/LSR zp   : 5c x 2 = 10c      2b x 2 =  4b
AND/ORA #i   : 2c x 5 = 10c      2b x 5 = 10b
ORA zp       : 3c x 1 =  3c      2b x 1 =  2b
               59 cycles, 34 bytes

Absolute commands are a byte and cycle bigger than zp: (47b,82c) vs (48b,74c) vs (44b,69c) making your [first] code better in both metrics.
(pre-post edit:)
(34b 52c) -> 42b 62c for non-zp
(32b 49c) -> 37b 54c if our address is not zp. (like t wouldn't be.)

rainwarrior wrote:
Pardon me coding out loud. I'm just enjoying the exercise.

Certainly.

(Simple optimization to your first: ASL A x 4 instead of ROR A x 5, since you're already ANDing off unused bits, and you don't have to put everything through the carry.)

tokumaru wrote:
What I do is convert a set of coordinates (9-bit X, 9-bit Y) into NT and AT addresses
An efficient way to do that would be just as pertinent here...though one wonders why 9-bit and not 6 if we're dealing with tiles. (self-reply: Sprites vs BG, duh.) Of course, that makes the NT address 0010 YXyy yyyx xxxx, a bit annoying... (that is, Y8, X8, Y7-3, X7-3)

rainwarrior wrote:
Do you need to do this more than once per frame? What is use case that your original post wasn't efficient enough for?
I was mainly curious if people had code-golfed it down to a gold-standard, since it seemed like a thing that every program ever needed to do. (Though on reflection I suppose one only needs to do it once per strip pushed to the PPU, breaking strips on nametable boundaries...)

My code is still in the theory stage. >>; I suspect going further would be prudent before we get too engaged in Virtua Code Golf.

Code:
convert-coord-to-NT:
LDA $obj-hiY
ASL A
ORA $obj-hiX ;maybe store hiXY together and use BCC+XOR to change each
          ; to avoid carry issues?
          ;Nah, makes collision sizes other than 'equality' hard.
ORA #$08
STA $high-addr
LDA $obj-loY
ASL A
ROL $high-addr
ASL A
ROL $high-addr
AND #$E0
STA $low-addr
LDA $obj-loX
LSR A
LSR A
LSR A
ORA $low-addr
STA $low-addr

convert-coord-to-AT:
LDA $obj-hiY ;though I suspect these might get ,x
ASL A
ORA $obj-hiX
ASL A
ASL A
ORA #$23
STA $high-addr
LDA $obj-loX
ASL A
ROL $low-addr
ASL A
ROL $low-addr
ASL A
ROL $low-addr
LDA #7
AND $low-addr
STA $low-addr
LDA $obj-loY
AND #$E0
LSR A
LSR A
ORA #$C0
ORA $low-addr
STA $low-addr
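
The bit shuffling in those two routines is easier to verify in a higher-level form. Below is a Python sketch of the same mapping, assuming 9-bit pixel coordinates where bit 8 selects the nametable and the low byte of Y stays in 0-239 (a nametable is only 30 tiles tall). The function names are hypothetical.

```python
def coord_to_nt(x, y):
    """Pixel coordinates -> nametable address 0010 Y8 X8 Y7..Y3 X7..X3."""
    return (0x2000
            | ((y & 0x100) << 3)    # Y8 -> address bit 11
            | ((x & 0x100) << 2)    # X8 -> address bit 10
            | ((y & 0xF8) << 2)     # Y7..Y3 -> address bits 9..5
            | ((x & 0xF8) >> 3))    # X7..X3 -> address bits 4..0

def coord_to_at(x, y):
    """Pixel coordinates -> attribute address 0010 Y8 X8 11 11 Y7..Y5 X7..X5."""
    return (0x23C0
            | ((y & 0x100) << 3)    # Y8 -> address bit 11
            | ((x & 0x100) << 2)    # X8 -> address bit 10
            | ((y & 0xE0) >> 2)     # Y7..Y5 -> address bits 5..3
            | ((x & 0xE0) >> 5))    # X7..X5 -> address bits 2..0

print(hex(coord_to_nt(255, 239)), hex(coord_to_at(255, 239)))
```

Both functions agree with the address-to-address formula quoted earlier: converting the result of `coord_to_nt` with that formula gives the same value as `coord_to_at`.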

rainwarrior wrote:
(palette-index too)

But shouldn't p be getting L6 L1? The LSBs of Y/X for tiles (L5, L0) are ignored within the same 16x16 palette block. L6/L1 select which pair of bits you need to set for that 16x16.

Author:  rainwarrior [ Tue Feb 17, 2015 10:20 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

As for your palette lookup conversion, comparing size and speed for what you originally posted:

inline version: +12 bytes
cycles added: -4, 5, 15, 17 (8.25 average case)

lookup version: +19 bytes (16 byte table, LDA ABS X)
cycles added: 4

This is presuming LDA ZP (if inline) vs LDX ZP, LDA ABS X (lookup) when you go to use the thing.

If you were using my last version of the code, the inline version is incompatible, I think.


However, if your goal is to do this many times per frame, optimizing these routines probably doesn't help much unless you need kinda random access to the tiles. If you're trying to update a contiguous row of tiles, you only need to do the calculation once at the start of the row, and then the thing to optimize is probably the continuation loop afterwards?

Author:  rainwarrior [ Tue Feb 17, 2015 10:25 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

Oops! Yes, I misread what was going on with the palette bits. A quick revision, no change in speed.

Code:
lda v1
rol
rol       ; A = .  .  .  .  .  .  .  L7, C = L6
rol p     ; p = 0  0  0  0  0  P1 P0 L6
ror       ; C = L7
lda v0
pha
and #$0C
ora #$23
sta v0    ; A = 0  0  1  0  H3 H2 1  1
pla
ror       ; A = L7 .  .  .  .  .  .  H1, C = H0
ror
ror       ; A = H1 H0 L7 .  .  .  .  .
ror
ror
and #$38
sta t     ; t = .  .  H1 H0 L7 .  .  .
lda v1
lsr
lsr       ; C = L1
rol p     ; p = 0  0  0  0  P1 P0 L6 L1
and #$07  ; A = .  .  .  .  .  L4 L3 L2
ora t
ora #$C0  ; A = 1  1  H1 H0 L7 L4 L3 L2
sta v1

Author:  Myask [ Tue Feb 17, 2015 10:51 pm ]
Post subject:  Re: Nametable => Attribute table address conversion?

Use of X0/Y0 (L0/L5 in your notation) [technically, X3, Y3 as fine discard etc. etc.] in palettes for ExGrafx is possible but presently beyond me. Though, that would remove the need for the shift-table as they appear to always be in the uppermost two bits in ExGrafx anyway...and...one writes it DURING rendering? Weird.

And yes, we don't need to recalculate along a line; so long as we've broken it between tables, one can just perform simple additions to the attribute address of whatever we're loading up before we send to PPU...which increments by 1 per four horizontal tiles or 8 per four vertical.
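
A small Python sketch of that stepping rule, assuming the strip has already been split at nametable boundaries. `step_attribute` is a hypothetical helper; it moves by whole 4x4-tile blocks and refuses to leave the current attribute table.

```python
def step_attribute(addr, blocks_right=0, blocks_down=0):
    """Move an attribute address by whole 32x32-pixel blocks (4x4 tiles):
    +1 per block to the right, +8 per block down."""
    column = (addr & 0x07) + blocks_right
    row = ((addr >> 3) & 0x07) + blocks_down
    if not (0 <= column < 8 and 0 <= row < 8):
        raise ValueError("step crosses a nametable boundary; split the strip")
    return (addr & 0xFFC0) | (row << 3) | column

print(hex(step_attribute(0x23C0, blocks_right=1)))   # one block to the right
print(hex(step_attribute(0x23C0, blocks_down=1)))    # one block down
```

The range check is where a real routine would break the strip and recompute the address for the neighbouring nametable.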

Is it common practice to store a copy of attribute tables in CPU address space, so one doesn't have to re-read them to mask off/on palette bitpairs?

Author:  rainwarrior [ Wed Feb 18, 2015 12:41 am ]
Post subject:  Re: Nametable => Attribute table address conversion?

Yes, it's extremely common to keep attributes in RAM for updating. Reading back through $2007 probably isn't a good approach unless you're just trying to update one or two tiles in a frame.
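
A rough Python sketch of that idea, assuming a 64-byte shadow copy of one nametable's attribute table and tile coordinates within that nametable (0-31 by 0-29). The names `attr_buffer` and `set_palette` are invented for illustration; a 6502 version would do the same mask-and-OR on a RAM page and queue the byte for upload during vblank.

```python
attr_buffer = bytearray(64)   # shadow copy of the attribute table at $23C0

def set_palette(buffer, tile_x, tile_y, palette):
    """Set the palette of the 2x2-tile area containing (tile_x, tile_y).
    Returns (PPU address, new byte) so the caller can queue the upload."""
    index = (tile_y // 4) * 8 + (tile_x // 4)
    shift = ((tile_y & 2) << 1) | (tile_x & 2)      # 0, 2, 4 or 6
    mask = 0xFF ^ (3 << shift)                      # clear the old bit pair
    buffer[index] = (buffer[index] & mask) | ((palette & 3) << shift)
    return 0x23C0 + index, buffer[index]

print(set_palette(attr_buffer, 0, 0, 3))
print(set_palette(attr_buffer, 2, 0, 1))
```

Because the old byte comes from the buffer, nothing has to be read back through $2007.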

Author:  Bregalad [ Wed Feb 18, 2015 1:09 am ]
Post subject:  Re: Nametable => Attribute table address conversion?

Personally, I index by 4x4 or 2x2 metatiles and "convert" them to AT or NT indexes using lookup tables. Converting them using shifts is always possible but annoying and impractical with the 6502 instruction set.

For example, in the case of vertical mirroring it'd look like this:
Code:
NTLookupH_LSB
    .db $00, $04, $08, $10, $18, ....

NTLookupH_MSB
    .db $20, $20, $20, $20, ....., $24, $24, $24, $24, .....

NTLookupV_LSB
    .db $00, $40, $80, $c0, $00, .......

NTLookupV_MSB
     .db $20, $20, $20, $20, $21, ....

ATLookupH_LSB
     .db $00, $01, $02, $03, $04, ....  ; This table can be optimized out

ATLookupH_MSB
     .db $23, $23 ,$23, $23, $23, $23, ....., $27, $27, $27   ; This table can be optimized out (re-using NTLookupH_MSB and OR with #$03)

ATLookupV_LSB
      .db $00, $08, $10, $18, $20, ...

ATLookupV_MSB
       .db $23 ,$23, $23, $23, $23        ; This table should be optimized out (i.e. constant byte)


In the end, since I optimize out half of the tables, it becomes a mix of "traditional" shifting/compare and lookup tables, using the best of both.

To compute an NT or AT address, just OR together the values from the horizontal and vertical tables, indexed by the corresponding horizontal and vertical coordinates.
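
Here is a Python sketch of that table-driven scheme under some stated assumptions: 4x4-tile metatiles, two nametables side by side ($2000 and $2400, i.e. vertical mirroring), and the $C0 attribute offset folded into the vertical AT table. The table contents and names are generated for illustration and are not the tables elided above.

```python
COLUMNS, ROWS = 16, 8   # 4x4-tile metatiles across two nametables

NT_H_LSB = [(col * 4) & 0x1F for col in range(COLUMNS)]
NT_H_MSB = [0x20 if col < 8 else 0x24 for col in range(COLUMNS)]
NT_V_LSB = [(row * 0x80) & 0xFF for row in range(ROWS)]
NT_V_MSB = [0x20 | ((row * 0x80) >> 8) for row in range(ROWS)]

AT_H_LSB = [col & 0x07 for col in range(COLUMNS)]
AT_H_MSB = [msb | 0x03 for msb in NT_H_MSB]        # reuse the NT table, OR #$03
AT_V_LSB = [0xC0 | (row * 8) for row in range(ROWS)]

def nt_address(col, row):
    """OR the horizontal and vertical table entries for a metatile."""
    high = NT_H_MSB[col] | NT_V_MSB[row]
    low = NT_H_LSB[col] | NT_V_LSB[row]
    return (high << 8) | low

def at_address(col, row):
    """Same idea for the attribute byte; the high byte needs no V table."""
    return (AT_H_MSB[col] << 8) | AT_H_LSB[col] | AT_V_LSB[row]

print(hex(nt_address(9, 3)), hex(at_address(9, 3)))
```

With 4x4 metatiles each metatile maps to exactly one attribute byte, which is why the attribute tables can be this small.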

Author:  tokumaru [ Wed Feb 18, 2015 10:03 am ]
Post subject:  Re: Nametable => Attribute table address conversion?

Myask wrote:
tokumaru wrote:
What I do is convert a set of coordinates (9-bit X, 9-bit Y) into NT and AT addresses
An efficient way to do that would be just as pertinent here...though one wonders why 9-bit and not 6 if we're dealing with tiles. (self-reply: Sprites vs BG, duh.)

I don't have my code with me right now, but I don't think it does anything particularly clever.

I use 9-bit coordinates because of the camera, since most of the time tiles are drawn around the edges of the camera as it scrolls through the level.

Now that I think of it, I don't think I ever needed to calculate the address of a "random" AT byte, because I always buffered the attribute tables in RAM and updated entire rows or columns of them at a time (it was actually faster than updating only the part that was on screen).

I would eventually need it when I implemented removable background objects (although some games, such as Somari, just draw blank tiles when items are collected, so there's no need to modify the attributes), but I didn't get that far.

All times are UTC - 7 hours