Quality of the CC65 randomizer

Discuss technical or other issues relating to programming the Nintendo Entertainment System, Famicom, or compatible systems. See the NESdev wiki for more information.

Moderator: Moderators

tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: Quality of the CC65 randomizer

Post by tepples »

DRW wrote:
tepples wrote:In current rand.s, X contains byte 2 AND $7F, A contains byte 1, and byte 3 (the high byte) never affects the state. A corrected version might have X contain byte 3 AND $7F and A contain byte 2, so that the resulting sequence is less predictable. Or if you're going to rename it, you can just load byte 3 into A and zero out X.
Aren't the four bytes of the seed just there to calculate the next seed?
I mean, the seed is four bytes while the return value is two bytes. So, two of them go unused anyway when it comes to the return value.
Or do you mean that byte 3 not even influences the calculation of the next seed and is totally useless?
Correct, in the version of rand.s at the HEAD on GitHub. Byte 0 feeds into future values of bytes 1 and 2, but byte 3 doesn't feed into anything. A corrected version of this subroutine would use byte 3 in some capacity to form the output.
DRW wrote:
rainwarrior wrote:rand+1 is of lower quality than rand+3.
But doesn't that mean that, when I use the original rand() function and I only need a value from 0 to 255 and I use it like this:
rand() & 0xFF
that this would be bad since I only use the not so good bits?
Correct.
Why does the corrected version in this thread look so totally different with an additional variable called rand_addend and the value 4282663? Is this all necessary or is there a minimum fix that changes the randomizer back to 32 bit, but with only small changes in the code?
In theory, each LCG is characterized by three numbers: the modulus (in this case 2^31 or 2^32 for the 31/32-bit version or 2^23 or 2^24 for the 23/24 bit version), the multiplier (in this case 0x01010101 = 16843009 for the 31/32-bit version or 0x010101 = 65793 for the 24-bit version), and the addend (in this case 0x31415927 = 826366247 for the 31/32 bit version or 0x415927 = 4282663 for the 24-bit version). As I understand it, the standard form of specifying an LCG is with the multiplier and addend expressed in base 10 (decimal) rather than base 16 (hexadecimal).

rainwarrior: DRW is referring to my suggested optimized version, not the corrected version. The optimized version has the same behavior as the current rand.s but doesn't calculate the otherwise unused byte 3 at all.
User avatar
drnick
Posts: 1
Joined: Mon Jul 11, 2016 9:56 pm
Contact:

Re: Quality of the CC65 randomizer

Post by drnick »

I have slightly modified the earlier Python script link

It seems the standard deviation of the run length from CC65's routine is significantly different from that of Python's random.randrange for choosing n<2. Namely, it is ~0.95 for the former and ~1.4 for the latter. This doesn't seem to hold for choosing n<N where N is >2.

This could be potentially relevant as the length of the runs will be slightly less varied.
Last edited by drnick on Sun Nov 24, 2019 4:00 pm, edited 1 time in total.
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Quality of the CC65 randomizer

Post by rainwarrior »

This is the LCG I proposed by the minimal removal of pha/pla:

Code: Select all

return ((seed >> 8) & 0x7F00) | (seed >> 24)
Since the flow of information in the LCG only goes from low bits to higher bits, and never the other way, if you're looking at just the lowest bit from the original rand.s, it's really only a 9-bit LFSR. So... I think that would explain a low deviation. There's only so much variation you can get from a sequence length of only 512.


...actually realizing this, that means that probably the most commonly used bits in rand(), e.g. things like "rand() & 1" or "rand() & 3" are actually relatively short sequences (only a bit or two better than an 8-bit PRNG). With that in mind, the existing form of rand.s is actually somewhat poor, despite giving passable results in other tests. :S The proposed revision does not suffer from this problem at all though.
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Quality of the CC65 randomizer

Post by rainwarrior »

tepples wrote:As I understand it, the standard form of specifying an LCG is with the multiplier and addend expressed in base 10 (decimal) rather than base 16 (hexadecimal).
There's nothing particularly meaningful about base 10 here. It offers no insight into the (inherently binary) implementation, and neither number has 2 or 5 as a factor, so base 10 has nothing at all to contribute to their understanding.

Base 16 on the other hand gives me 2 numbers that I can take a glance at and know exactly why they exist and what they're going to be doing in this program. The multiplier's design around 8-bit structure is obvious, and the adder is a whimsically chosen "base 10 pi digits expressed in base 16, rounded up so that it's odd". There's a huge comprehension advantage with hexadecimal here.

Converting them back to base 10 hides all of those things, obfuscating them into essentially "random" looking numbers. I can see why DRW was confused by them.
User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

Re: Quality of the CC65 randomizer

Post by DRW »

rainwarrior wrote:
DRW wrote:Or do you mean that byte 3 not even influences the calculation of the next seed and is totally useless?
Bingo.
But in this case, why is it a good idea to remove PLA and PHA to get rand+3 as the return value if this is some value that never really influences anything (at least in the unaltered version)? Wouldn't it be a bad idea to use it since it basically stands for itself and only influences itself?
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Quality of the CC65 randomizer

Post by rainwarrior »

DRW wrote:
rainwarrior wrote:
DRW wrote:Or do you mean that byte 3 not even influences the calculation of the next seed and is totally useless?
Bingo.
But in this case, why is it a good idea to remove PLA and PHA to get rand+3 as the return value if this is some value that never really influences anything (at least in the unaltered version)? Wouldn't it be a bad idea to use it since it basically stands for itself and only influences itself?
No. You're looking for a random number. More influences = more sources of randomness. The highest bit is influenced by every other bit below it. That's the one you want to use most. The most influenced is best, not the most influencing.

The bit that influences the most is bit 0 of rand+0. Watch it iterate as it flips back and forth with each call. 0->1->0->1->0->1->0->1... never deviating. It would be the worst thing to use for randomness.
User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

Re: Quality of the CC65 randomizer

Post by DRW »

O.k.

When I'm home I'll rewrite my copy of the random function to include the optimizations and include all the stuff that I need. Then I'll post the final version.

One last thing:
rainwarrior wrote:
DRW wrote:Why does the corrected version in this thread look so totally different with an additional variable called rand_addend and the value 4282663? Is this all necessary or is there a minimum fix that changes the randomizer back to 32 bit, but with only small changes in the code?
WTF are you talking about? The corrected version is literally cut and paste from rand.s on github, and the difference is 4 semicolons.
tepples wrote:rainwarrior: DRW is referring to my suggested optimized version, not the corrected version. The optimized version has the same behavior as the current rand.s but doesn't calculate the otherwise unused byte 3 at all.
Is your version (i.e. the one that was only slightly changed, not tepples' that has some more changes) a 32 bit LCG or a 24 bit LGC now?
If it's still the 24 bit one, what is the minimum change that I have to do to convert this to a 32 bit LGC?
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Quality of the CC65 randomizer

Post by rainwarrior »

Yeah, just delete pha/pla and it's golden.
User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

Re: Quality of the CC65 randomizer

Post by DRW »

O.k., the following code is the randomizer, altered according to your suggestions as well as my specific needs.

Can you please check if everything is still correct?

The randomizer is supposed to be 32 bit, but the return value is only an 8 bit value.

These are the function prototypes in C:

Code: Select all

void __fastcall__ InitializeRandomizer(unsigned int seed);
unsigned char __fastcall__ GetRandomNumber(unsigned char bitMask);
The bit mask is there to shorten the return value. The value will get AND-connected with the bit mask (of course, only the temporary value in A gets AND-connected, not rand+3 itself), so that I can specify that I want only a value from 0 to 3 or from 0 to 15 etc.

Since I don't have a DATA segment in my game, the whole "If you don't call srand before rand, then rand has to behave as if srand was called with the value 1" is ignored. In my game, I make sure that no random number gets created before the seed was initialized. (Would there even be a problem in this implementation if you call the randomizer if the seed is 0?)

Code: Select all

.zeropage

        rand:    .res 4
        bitMask: .res 1

.code

_GetRandomNumber:

        sta     bitMask

        clc
        lda     rand+0
        adc     rand+1
        sta     rand+1
        adc     rand+2
        sta     rand+2
        adc     rand+3
        sta     rand+3

        clc
        lda     rand+0
        adc     #$27
        sta     rand+0
        lda     rand+1
        adc     #$59
        sta     rand+1
        lda     rand+2
        adc     #$41
        sta     rand+2
        lda     rand+3
        adc     #$31
        sta     rand+3

        and     bitMask
        ldx     #0
        rts


_InitializeRandomizer:

        sta     rand+0
        stx     rand+1
        lda     #0
        sta     rand+2
        sta     rand+3
        rts
X was set to 0, so that the return value doesn't differ between those two calls:

Code: Select all

unsigned char c = GetRandomNumber(0xFF);
unsigned int i = GetRandomNumber(0xFF);
(As far as I know, in the second call, X would become the high byte of the integer, so it should be set to 0 and not to some other value.)
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Quality of the CC65 randomizer

Post by rainwarrior »

DRW wrote:The randomizer is supposed to be 32 bit, but the return value is only an 8 bit value.
The "bits" of a PRNG don't have to do with the return value, directly. A 32-bit PRNG generates a repeating sequence of at most 2^32 values.

In reality with this LCG, each bit has its own period of randomness. i.e. rand() & 1 is a 24-bit sequence, rand() & 2 is a 25-bit, etc. Technically the "best" bit is rand() & 128.
DRW wrote:Would there even be a problem in this implementation if you call the randomizer if the seed is 0?
No, with the LCG there's no problem if the seed is 0. That's only a problem for LFSRs.
DRW wrote:Can you please check if everything is still correct?
Looks okay to me.
tepples
Posts: 22708
Joined: Sun Sep 19, 2004 11:12 pm
Location: NE Indiana, USA (NTSC)
Contact:

Re: Quality of the CC65 randomizer

Post by tepples »

rainwarrior wrote:
DRW wrote:Would there even be a problem in this implementation if you call the randomizer if the seed is 0?
No, with the LCG there's no problem if the seed is 0. That's only a problem for LFSRs.
Or (technically) with Lehmer LCGs, which use a prime period and a zero addend instead of a power-of-two period and an odd addend. But it won't be a problem for this particular LCG.
User avatar
DRW
Posts: 2225
Joined: Sat Sep 07, 2013 2:59 pm

Re: Quality of the CC65 randomizer

Post by DRW »

Thanks for the check.
rainwarrior wrote:The "bits" of a PRNG don't have to do with the return value, directly. A 32-bit PRNG generates a repeating sequence of at most 2^32 values.
Yes, I know. The bits indicate how many times you can generate a random number until the whole sequence starts again.
I was just saying this because people might get the idea that a randomizer that returns an 8 bit value also has an 8 bit seed while a 16 bit randomizer has a 16 bit seed etc., which is of course not the case here.
rainwarrior wrote:In reality with this LCG, each bit has its own period of randomness. i.e. rand() & 1 is a 24-bit sequence, rand() & 2 is a 25-bit, etc. Technically the "best" bit is rand() & 128.
So, does that mean that the bits of the copy of rand+3 in A should be inverted before being AND-connected with the bit mask? Because the bit mask parameter is always %00000001, %00000011, %00000111 etc. and inverting A lets the return value become more random since the high bits are more randomized and inverting them turns the highest bits into the lowest bits?
My game "City Trouble":
Gameplay video: https://youtu.be/Eee0yurkIW4
Download (ROM, manual, artworks): http://www.denny-r-walter.de/city.html
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Quality of the CC65 randomizer

Post by rainwarrior »

DRW wrote:
rainwarrior wrote:In reality with this LCG, each bit has its own period of randomness. i.e. rand() & 1 is a 24-bit sequence, rand() & 2 is a 25-bit, etc. Technically the "best" bit is rand() & 128.
So, does that mean that the bits of the copy of rand+3 in A should be inverted before being AND-connected with the bit mask? Because the bit mask parameter is always %00000001, %00000011, %00000111 etc. and inverting A lets the return value become more random since the high bits are more randomized and inverting them turns the highest bits into the lowest bits?
Not really. You're already getting a 24-bit sequence with the lowest bit (>16 million!), I don't think you need to worry about not getting "the best" once it's good enough.

If you really wanted to increase quality, you could just add another byte (rand+4). I don't think there's much point; 16+ bits is probably good enough for almost any NES game's needs.
User avatar
rainwarrior
Posts: 8732
Joined: Sun Jan 22, 2012 12:03 pm
Location: Canada
Contact:

Re: Quality of the CC65 randomizer

Post by rainwarrior »

The main branch of cc65's rand.s has now been fixed. (change)
Post Reply