I don't think a reroll is that bad a solution for many cases, and is usually very easy to implement. Each iteration the probability that it will keep looping gets exponentially smaller. In practice these don't tend to loop very many times.
Some alternatives:
Omegamatrix once wrote a bunch of division routines for various constant values, which may potentially be useful in a situation like this:
https://forums.nesdev.com/viewtopic.php?f=2&t=11336Also, you can trade precision/distribution-bias for speed or table size...
Code:
; modulo by subtraction loop
jsr prng ; 8-bit result in A
and #(8-1) ; this could by 8, 16, 32, 64...
sec
:
tax
;sec ; unnecessary
sbc #RANGE
bcs :-
; result now in X
; modulo by table
jsr prng
and #(8-1)
tax
lda mod8by5, X
; result now in A
In the subtraction loop, you can control speed by decided how many times it could loop with the size of your AND. Each extra loop reduces the bias. (If space was tight, this routine might be made generic by using variables instead of immediates for the AND and SBC too.)
In the table lookup, the speed is constant, but similarly the size of the AND determines the size of the table and the sizes of the bias.
E.g. is a 1/8 probability bias for some values okay? 1/16? 1/32?