 All times are UTC - 7 hours

Post subject: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 1:26 pm

Joined: Sun Apr 04, 2010 4:28 pm
Posts: 91
For my game, I want to be able to freely scroll in all four directions (single screen mirroring makes this easy, and the graphical artifacts aren't that bad). X scrolling is great, but Y scrolling is problematic.

I think what I need to do is divide my 16bit y scroll value by 240, and use the remainder as the scroll value that I use when resetting the scroll during NMI.

Two questions: First, is this right - or am I barking up the wrong tree entirely? Second, is there a 16bit modulus algorithm that I could use that wouldn't be too slow to use once-per-frame?

Thanks so much!

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 1:30 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
I don't know what kind of input data you have, but in some cases something like this might be good enough:
Code:
while (y >= 240) y -= 240;

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 1:56 pm

Joined: Wed Apr 02, 2008 2:09 pm
Posts: 1021
I keep two bytes each for the regular 16-bit scroll (scrolly) and for a second Y scroll (scrollyscreen) whose low byte wraps past 239.

I add the same value to both whenever the game scrolls on the y axis. For scrollyscreen, I just check if the low byte is higher than 239 and add an extra 16 in that case, then update the high byte accordingly.

For subtraction, when the carry is clear after passing zero, subtract an additional 16.
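As a rough C sketch of that bookkeeping (the struct and function names are placeholders of mine, not taken from the actual engine, and it assumes a single update never moves more than 240 pixels):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a "scrollyscreen"-style value:
   lo stays in 0..239, hi counts whole 240-pixel screens. */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} ScreenY;

/* Scroll down by delta pixels (assumed <= 240). */
static void screen_y_add(ScreenY *s, uint8_t delta)
{
    assert(delta <= 240);
    unsigned sum = (unsigned)s->lo + delta;
    if (sum >= 240) {
        /* Adding 16 in 8-bit arithmetic equals subtracting 240;
           this covers both a carry out and a result of 240..255. */
        s->lo = (uint8_t)(sum + 16);
        s->hi++;
    } else {
        s->lo = (uint8_t)sum;
    }
}

/* Scroll up by delta pixels (assumed <= 240). */
static void screen_y_sub(ScreenY *s, uint8_t delta)
{
    assert(delta <= 240);
    if (s->lo < delta) {
        /* Borrowed past zero: the 8-bit result is 16 too high. */
        s->lo = (uint8_t)(s->lo - delta - 16);
        s->hi--;
    } else {
        s->lo = (uint8_t)(s->lo - delta);
    }
}
```

The extra 16 is just 256 - 240: whenever the byte wraps at 256 instead of at 240, the result is off by exactly that amount.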

All that said, I also have a subroutine that can pull the correct scrollyscreen value from the regular scrolly. But I use it only when the regular scroll is forced in bounds by the bottom of the level map. (And only for safety, I'm pretty sure even that's not needed with how I do things.) You can just subtract from scrollyscreen how far scrolly was ejected and never need to call this.

Here it is anyway:
Code:
pullyscreenfromyscroll:;{
ldx #$FF;FF because an extra 1 is always added
lda <scrollyhigh

sec
inx;We add 1 to X for every time $0F can be subtracted from scrollyhigh
sbc #$0F

ldy highzerotosixteen,x;We then "add" $10 for every $0F in scrollyscreenhigh
sty <scrollyscreenhigh

ldx <scrollylow

adc #$0F;Fixing the remainder, since an extra $0F is always subtracted
beq yscreenfromyscrollfinalcheck
tay;Store the remainder in Y
txa
yscreenfromyscrollhighcheck.loopcheck:
sec
yscreenfromyscrollhighcheck.2loopcheck:
sbc #$F0
inc <scrollyscreenhigh

bcs yscreenfromyscrollhighcheck.2loopcheck
dey
bne yscreenfromyscrollhighcheck.loopcheck
tax
yscreenfromyscrollfinalcheck:
txa
cmp #$F0
;sec
sbc #$F0
inc <scrollyscreenhigh

sta <scrollyscreenlow

rts;}

highzerotosixteen:
.db $00, $10, $20, $30, $40, $50, $60, $70
.db $80, $90, $A0, $B0, $C0, $D0, $E0, $F0

It could probably be much faster, but what I've got is WAY faster than how it used to be, heh. And that was when it was called every frame.

It depends on how big your level is (and also probably fails if your level is taller than 239 screens), but in the bottom of my tallest vertical level (2304 pixels tall) at the moment it runs at 164 cycles. Certainly fast enough to call every frame, I'd say.

Edit: Hah, this can probably be made faster by messing with the txa/tax instructions. But honestly? I'm too scared of breaking it right now.
This seems like it'd work:
Edit 2: Nope, it wouldn't, so I removed it. There's gotta be a way, but the savings would be minuscule for that small change anyway. New, faster methods are welcome, though. I'm not exactly a 6502 magician.

_________________
https://kasumi.itch.io/indivisible

Last edited by Kasumi on Thu Jan 30, 2014 2:22 pm, edited 2 times in total.

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 2:04 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
pops wrote:
I think what I need to do is divide my 16bit y scroll value by 240, and use the remainder as the scroll value that I use when resetting the scroll during NMI.

I faced this problem once, and my solution was to get rid of the division altogether. For this I introduced a second Y scroll (it's called NTCameraY or something similar), which is relative to the name tables, and I update both values as the engine runs. NTCameraY always starts as 0, and is updated by the same amounts as the real CameraY every time it's changed. You can check for overflows and underflows of NTCameraY after each modification, to make sure it stays in the 0-239 range.
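Roughly, in C (a sketch only: the struct layout and function name are made up for illustration, and the per-call movement is assumed to fit in a signed byte):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical camera state: the two Y values are only ever
   changed together, by the same signed amount. */
typedef struct {
    uint16_t camera_y;    /* position in the level map */
    uint8_t  nt_camera_y; /* position in the name tables, 0..239 */
} Camera;

static void camera_move_y(Camera *c, int8_t dy)
{
    int nt = (int)c->nt_camera_y + dy;
    if (nt >= 240) {
        nt -= 240;  /* overflow past the bottom of the name table */
    } else if (nt < 0) {
        nt += 240;  /* underflow past the top */
    }
    assert(nt >= 0 && nt < 240);
    c->camera_y = (uint16_t)(c->camera_y + dy);
    c->nt_camera_y = (uint8_t)nt;
}
```

Note that nothing here derives nt_camera_y from camera_y, so the two can start at unrelated values.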

Quote:
Two questions: First, is this right - or am I barking up the wrong tree entirely? Second, is there a 16bit modulus algorithm that I could use that wouldn't be too slow to use once-per-frame?

I don't know of any tricks that do this faster than an actual division, which can be optimized to some extent. Division might not even be too slow to perform a couple of times per frame (you might need to calculate more than one Y coordinate per frame if you're updating a row at the bottom of the screen, for example), but I'd much rather get rid of it altogether.

rainwarrior wrote:
Code:
while (y >= 240) y -= 240;

This could be very slow in tall levels.

EDIT: Apparently my solution is very similar to Kasumi's!

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 2:15 pm

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5736
tokumaru wrote:
rainwarrior wrote:
Code:
while (y >= 240) y -= 240;

This could be very slow in tall levels.

Yes, I don't recommend it for routinely large values of y, obviously, but what I was suggesting is that trying to do a modulus isn't necessarily the best way to go about this. As I said, I don't know what the OP has in mind, specifically, but there are a lot of ways you could potentially keep the values of y in a low range where a full modulus is no longer necessary.

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 2:47 pm

Joined: Sun Apr 04, 2010 4:28 pm
Posts: 91
rainwarrior wrote:
I don't know what kind of input data you have, but in some cases something like this might be good enough:
Code:
while (y >= 240) y -= 240;

That's a very simple and clever response. Maximum scroll would be 4096 (0-17 iterations) for the map size I'm currently working on, with a maximum size of 8192 (0-34 iterations) for larger maps.

For anyone else who has this problem, this is the code I wrote to get the modulus of a 16bit number divided by a constant value (here, 240).

Code:
.alias value_hi $01
.alias value_lo $00
`SetMemory value_hi, 1
`SetMemory value_lo, 1
ldx value_hi
ldy value_lo
Mod240X:
cpx #0      ; 2
bne +      ; 2/3
cpy #240   ; 2
bcs +      ; 2/3
rts         ; 6
*   sec         ; 2
tya         ; 2
sbc #240   ; 2
tay         ; 2
bcs Mod240X   ; 2/3
dex         ; 2
bcc Mod240X   ; 2/3

Not super speedy, but fast enough, I suppose, for running once a frame. Thanks rainwarrior!
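For anyone who wants to check the iteration counts off the NES, here is a C model of the same byte-wise loop. It is only a sketch: the function name and the optional iteration counter are additions for illustration, not part of the assembly above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-wise repeated subtraction of 240, with hi and lo standing in
   for the X and Y registers. Returns the remainder; optionally
   reports how many subtractions were needed. */
static uint8_t mod240_bytes(uint8_t hi, uint8_t lo, unsigned *iterations)
{
    unsigned n = 0;
    while (hi != 0 || lo >= 240) {
        if (lo < 240) {
            hi--;                 /* borrow from the high byte */
        }
        lo = (uint8_t)(lo - 240); /* 8-bit wraparound, like SBC */
        n++;
    }
    assert(lo < 240);
    if (iterations != NULL) {
        *iterations = n;
    }
    return lo;
}
```

With hi:lo = $0FFF (4095) this takes 17 subtractions, and with $1FFF (8191) it takes 34, which matches the estimates above.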

tokumaru wrote:
I introduced a second Y scroll (it's called NTCameraY or something similar), which is relative to the name tables, and I update both values as the engine runs. NTCameraY always starts as 0, and is updated by the same amounts as the real CameraY every time it's changed. You can check for overflows and underflows of NTCameraY after each modification, to make sure it stays in the 0-239 range.

tokumaru, Kasumi, that's a great idea as well.

I think I'll use tokumaru's idea for frame-by-frame modification of ScrollY, but I'll still need the modulus solution for initially setting the value of ScrollY. Again - thanks so much for the excellent suggestions!

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 3:37 pm

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
pops wrote:
I think I'll use tokumaru's idea for frame-by-frame modification of ScrollY, but I'll still need the modulus solution for initially setting the value of ScrollY. Again - thanks so much for the excellent suggestions!

I don't think you have to use the modulo at all (I don't, in my engine). Think about it: the only reason you want to calculate the initial scroll Y is because you want row 0 of the level map to be rendered to row 0 in the name table... but that doesn't bring any benefits. What good is having the very first row aligned to the top of the name table if the next 256-pixel boundary won't be? And the one after that won't be either? My point is that you'll lose sync with the name tables very soon after scrolling down a bit, so this alignment is completely irrelevant.

This is why I said that in my engine, no matter the value of ScrollY, NTScrollY always starts at 0-15 (i.e. the first row of metatiles), because I can start at any map vs. name table alignment I want. It doesn't matter if row 13 of the level map gets rendered to row 7 of the name table, because that will eventually happen anyway when the screen scrolls down during gameplay.
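To make that concrete, a small C sketch (the function names are invented for this example, and it assumes 16-pixel metatiles and a 240-pixel-tall name table):

```c
#include <assert.h>
#include <stdint.h>

/* Initial name-table scroll: keep only the position inside the
   first 16-pixel metatile row, whatever the level scroll is. */
static uint8_t nt_scroll_init(uint16_t scroll_y)
{
    return (uint8_t)(scroll_y & 0x0F);
}

/* Name-table Y of the pixel row `offset` pixels below the top of
   the view (offset assumed <= 240). */
static uint8_t nt_y_at_offset(uint8_t nt_scroll_y, uint8_t offset)
{
    assert(nt_scroll_y < 240 && offset <= 240);
    unsigned y = (unsigned)nt_scroll_y + offset;
    return (uint8_t)(y >= 240 ? y - 240 : y);
}
```

A level scroll of 13 * 16 + 5 therefore starts the name-table scroll at 5, and every later coordinate is found relative to that value with at most one subtraction of 240.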

That is, unless your engine for some reason depends on this initial alignment, but I can't think of any reason for that. At first I too thought I needed to vertically align the level map with the name tables, but after giving it some thought I realized I really didn't. Please take a look at your engine and decide if this alignment really affects anything, because that might save you some ROM and a few cycles that could be better spent on actual game logic.

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 3:45 pm

Joined: Wed Apr 02, 2008 2:09 pm
Posts: 1021
Quote:
but I'll still need the modulus solution for initially setting the value of ScrollY.

Or just use two bytes to store that in your level header, so when you load the level ScrollY can just be set to the right value immediately with no calculation whatsoever. (Well... whatever exports your level would need to figure it out.)
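On the exporter side the division costs nothing, so plain `/` and `%` will do. A sketch in C, where the two-byte header layout and the names are purely hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-byte header entry: which 240-pixel screen the
   level starts on, and the Y offset within that screen. */
typedef struct {
    uint8_t screen;
    uint8_t y;
} HeaderScroll;

static HeaderScroll header_scroll_from_pixels(uint16_t scroll_y)
{
    /* The screen count must fit in one byte. */
    assert(scroll_y < 256u * 240u);

    HeaderScroll h;
    h.screen = (uint8_t)(scroll_y / 240);
    h.y = (uint8_t)(scroll_y % 240);
    return h;
}
```

For a start position of 2304 pixels this yields screen 9 and Y offset 144.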

_________________
https://kasumi.itch.io/indivisible

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Jan 30, 2014 7:40 pm

Joined: Sun Apr 04, 2010 4:28 pm
Posts: 91
These are all excellent ideas, and if I were starting from scratch, I'd definitely try to implement them. However, my time to program is limited, and once I have something working without bugs, I'm loath to touch it again. I now have my scrolling working, using a mixture of the modulus code and tracking a separate Screen_Y value that rolls over at 240, as opposed to Scroll_Y, which rolls over at 256.

Thanks again everyone! I really appreciate how helpful people in this community are.

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Fri Jan 31, 2014 6:56 am

Joined: Sat Feb 12, 2005 9:43 pm
Posts: 10068
Location: Rio de Janeiro - Brazil
Kasumi wrote:
Or just use two bytes to store that in your level header, so when you load the level ScrollY can just be set to the right value immediately with no calculation whatsoever.

...OR, you can set the secondary ScrollY to any value (it has to be metatile-aligned, but the upper bits don't matter at all) like I've been saying all along. Really, I dare you guys to find one reason why the secondary Y scroll can't be anything when the scroll engine is initialized.

Post subject: Re: Fast 16bit modulus? (Y scroll for values > 240)
Posted: Thu Apr 16, 2015 11:25 pm

Joined: Tue Jun 10, 2014 8:15 pm
Posts: 35
I ran across this topic. I thought about doing a quick mod 240 for 16 bit, and came up with this:

Code:
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;  16 bit mod 240
;  By Omegamatrix
;  39-42 cycles, 32 bytes
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

lda    dividendHigh          ;3  @3
and    #$F0                  ;2  @5
sta    temp                  ;3  @8
eor    dividendHigh          ;3  @11
asl                          ;2  @13
asl                          ;2  @15
asl                          ;2  @17
asl                          ;2  @19
adc    temp                  ;3  @22
bcc    .doneHigh             ;2³ @24/25
adc    #(1 << 4) - 1         ;2  @26      -1 because carry is set
.doneHigh:
adc    dividendLow           ;3  @29
bcc    .try240               ;2³ @31/32
adc    #16-1                 ;2  @33      -1 because carry is set
.try240:
cmp    #240                  ;2  @35
bcc    .storeRemainder       ;2³ @37/38
sbc    #240                  ;2  @39
.storeRemainder:
sta    mod240                ;3  @42

It's built off of some ideas from Bogax, Jones on Modulus Without Division, and even a little bit of tepples for the EOR trick.
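The core idea, as I read it, is that 256 leaves a remainder of 16 when divided by 240, so every overflow of a byte can be folded back in as +16. Here is a C model of that folding (a sketch for checking the arithmetic, not a transcription of the assembly; the function name is made up):

```c
#include <assert.h>
#include <stdint.h>

/* Mod 240 without division, using 256 == 16 (mod 240).
   value = hi*256 + lo, which is congruent to hi*16 + lo, and hi*16 is
   in turn congruent to (high nibble of hi)*16 + (low nibble of hi)*16. */
static uint8_t mod240_fold(uint16_t value)
{
    uint8_t hi = (uint8_t)(value >> 8);
    uint8_t lo = (uint8_t)(value & 0xFF);

    unsigned acc = ((unsigned)(hi & 0x0F) << 4) + (hi & 0xF0);
    if (acc >= 256) {
        acc = acc - 256 + 16;   /* fold the carry back in as 16 */
    }
    acc += lo;
    if (acc >= 256) {
        acc = acc - 256 + 16;   /* same fold after adding the low byte */
    }
    if (acc >= 240) {
        acc -= 240;             /* at most one final subtraction */
    }
    assert(acc < 240);
    return (uint8_t)acc;
}
```

Looping over all 65536 inputs and comparing against `value % 240` is a cheap way to confirm that the two folds plus one conditional subtraction are always enough.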

Whether it's useful or not I don't know. I just like doing these types of routines.
