It is currently Thu Dec 14, 2017 6:23 am

All times are UTC - 7 hours





Post new topic Reply to topic  [ 23 posts ]  Go to page 1, 2  Next
Author Message
PostPosted: Wed May 24, 2017 8:05 pm 
Offline
User avatar

Joined: Tue Apr 04, 2017 1:22 pm
Posts: 19
Location: Ohio, USA
For my latest project, I needed a way to generate random numbers on the NES. I figured I'd share my method here.
I use a 32-bit Galois Linear Feedback Shift Register (LFSR: https://en.wikipedia.org/wiki/Linear-feedback_shift_register#Galois_LFSRs), which is decently quick and can generate a sequence of pretty unpredictable values.

The code is written for NESASM3, but can be easily rewritten for any assembler.
It takes <30 cycles per output bit (assuming the lfsr variable is zero page), and has an internal period of 2^32 - 1 values. Just remember to seed the lfsr variable with a non-zero value.

Code:
lfsr   .rs 4

; Pass the number of bits to output in X
; The result is returned in the lower X bits of A
RollRandom:
   LDA <lfsr
.roll
   LSR A
   ROR <lfsr+1
   ROR <lfsr+2
   ROR <lfsr+3
   BCC .done
   EOR #$A3
.done:
   DEX
   BNE .roll
   STA <lfsr
   RTS

_________________
http://zutanogames.com/ <-- my dev blog


Top
 Profile  
 
PostPosted: Wed May 24, 2017 8:23 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5896
Location: Canada
Maybe not surprising because it's the same algorithm, but it's nearly identical to the 32-bit LFSR PRNG example I wrote for the wiki a while ago:
https://wiki.nesdev.com/w/index.php/Random_number_generator/Linear_feedback_shift_register_(advanced)

The base article is at:
https://wiki.nesdev.com/w/index.php/Random_number_generator

There are some other interesting approaches to PRNG generators in the external links


Top
 Profile  
 
PostPosted: Wed May 24, 2017 8:40 pm 
Offline
User avatar

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 3968
Dragon Warrior 2 & 3 used a LFSR (linear feedback shift register) for the RNG. They also happened to use the same bytes of memory to store the saved game checksum. Some save checksums will freeze the RNG.

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


Top
 Profile  
 
PostPosted: Wed May 24, 2017 8:55 pm 
Offline

Joined: Mon Nov 10, 2008 3:09 pm
Posts: 431
Dwedit wrote:
Dragon Warrior 2 & 3 used a LFSR (linear feedback shift register) for the RNG. They also happened to use the same bytes of memory to store the saved game checksum. Some save checksums will freeze the RNG.


It's more accurate to say that they use the same code both as a PRNG and as a CRC16 calculator. When using it as a PRNG, they feed it a constant $FFFF in place of the data word to be CRCed.

Dragon Quest/Warrior 4 and 5 (on the SNES) use the same code, but they constantly twiddle the CRC16 state when the CPU is otherwise idle to avoid the vulnerability.

Dragon Quest 6 uses a 32-bit LFSR and has a different vulnerability: both the main thread and the NMI thread generate random numbers, and if the NMI interrupts the main thread at just the right time in the middle of the LFSR routine it can knock the LFSR into a frozen state. The odds of this happening are extremely low, but if you play the game as obsessively as many people play their JRPGs you will eventually see it happen (folk wisdom seems to be that it's caused by the cartridge "overheating" due to playing too long without a break). This vulnerability was also patched in the next game in the series, the SNES version of DQ3 (they patched it by adding a mutex/semaphore/whateveryouwannacallit so that the NMI thread will just read the current state and not try to shift the LFSR if it's in the middle of being shifted)


Top
 Profile  
 
PostPosted: Wed May 24, 2017 9:19 pm 
Offline

Joined: Sun Sep 19, 2004 11:12 pm
Posts: 19342
Location: NE Indiana, USA (NTSC)
Lately I've been a fan of the linear congruential generator in cc65's standard library (seed = seed * 0x01010101 + 0x31415927) now that it's been fixed.


Top
 Profile  
 
PostPosted: Wed May 24, 2017 9:34 pm 
Offline
User avatar

Joined: Thu Mar 31, 2016 11:15 am
Posts: 219
Maybe a KISS RNG would be worth using for high quality numbers. Some variants don't use multiplication.

Code:
/* Implementation of a 32-bit KISS generator which uses no multiply instructions */
static unsigned int x=123456789,y=234567891,z=345678912,w=456789123,c=0;
unsigned int JKISS32()
{
 int t;
 y ^= (y<<5); y ^= (y>>7); y ^= (y<<22);
 t = z+w+c; z = w; c = t < 0; w = t&2147483647;
 x += 1411392427;
 return x + y + w;
}


Top
 Profile  
 
PostPosted: Wed May 24, 2017 9:43 pm 
Offline

Joined: Mon Nov 10, 2008 3:09 pm
Posts: 431
From that thread:

Quote:
why you don't usually want to use an LCG for generating X,Y pairs


The Dragon Ball/Dragon Ball Z RPGs programmed by TOSE on the NES use an 8-bit LCG. Which they cripple further by always using the low-order bits, the least random ones in any LCG. And the main use of those random numbers is to generate X,Y,Z tuples--the attack power, defence power and "suit" of the cards that movement and battle are based on. Needless to say, the cards you get aren't very random at all.

The Game Boy Dragon Quest Monsters games (also programmed by TOSE) also do terrible things with their RNGs, but I forget most of the details. IIRC at least one of them just loops through a static sequence of 128 numbers stored in the ROM (even FF1 used 256).


Top
 Profile  
 
PostPosted: Wed May 24, 2017 9:44 pm 
Offline

Joined: Mon Nov 10, 2008 3:09 pm
Posts: 431
pubby wrote:
Maybe a KISS RNG would be worth using for high quality numbers. Some variants don't use multiplication.

Code:
/* Implementation of a 32-bit KISS generator which uses no multiply instructions */
static unsigned int x=123456789,y=234567891,z=345678912,w=456789123,c=0;
unsigned int JKISS32()
{
 int t;
 y ^= (y<<5); y ^= (y>>7); y ^= (y<<22);
 t = z+w+c; z = w; c = t < 0; w = t&2147483647;
 x += 1411392427;
 return x + y + w;
}


Bit shifts by anything other than 1 or 8 are very expensive on the 6502. You might as well be multiplying as doing all those shifts.

The KISS family of generators essentially consists of combinations of the output of several generators whose periods have a large lowest-common-multiple. On the 6502, you could XOR or add the output of a LFSR (period 2^bits - 1) and the output of a LCG (period 2^bits) to achieve such a result. In fact that's almost what the Dragon Quest games do; they add the output of an 8-bit counter (period 256) to the output of the CRC16 (period 65536 - 2).


Top
 Profile  
 
PostPosted: Wed May 24, 2017 10:10 pm 
Offline
User avatar

Joined: Fri Nov 19, 2004 7:35 pm
Posts: 3968
I wrote this down in the Goomba Color source code somewhere:
DW3 GBC:
seed = ((seed << 17) >> 15) + ((seed & 0xFF000000) << 1) + ((seed & 0xFF0000) + ((seed << 1) & 0xFF0000));

_________________
Here come the fortune cookies! Here come the fortune cookies! They're wearing paper hats!


Top
 Profile  
 
PostPosted: Wed May 24, 2017 10:19 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5896
Location: Canada
pubby wrote:
Maybe a KISS RNG would be worth using for high quality numbers. Some variants don't use multiplication.

I think that PRNG is written for a system that has 32 bit words, lacks an efficient multiply instruction, but has an efficient multi-shift. Maybe some earlier 32-bit x86? I'm not sure.

Do you have a source for that code? When was it written? Why is it called "KISS"... it doesn't seem simple to me.)

On the 6502 though, all those 32-bit adds, and especially those 32-bit multi-shifts are really going to hurt!

The CC65 PRNG tepples linked is a simple multiply-and-add LCG at heart, but its multiply is very efficient (via a carefully chosen multiplicand), there's no need to avoid multiplication here for its own sake.


Top
 Profile  
 
PostPosted: Wed May 24, 2017 10:45 pm 
Offline

Joined: Mon Nov 10, 2008 3:09 pm
Posts: 431
In an action game where cycles are precious, an advantage of a LFSR over a LCG is that all the bits in a LFSR are equally random. If you only ever need one or two bits at a time, you can get many random numbers out of each clocking of a LFSR.

Super Mario Bros. and Kung Fu Heroes use a 64-bit LFSR which is clocked once per frame, and every piece of code that needs random numbers (e.g. every enemy type that has a random behaviour) samples a different bit or range of bits from the LFSR. If two of the same enemy are in the same RNG-tapping state on the same frame then they'll always both choose the same behaviour, but that doesn't happen often enough to be particularly noticeable to the player.

(Interestingly, the two games use quite different code, but functionally both are the same LFSR with the same taps)


Top
 Profile  
 
PostPosted: Wed May 24, 2017 10:54 pm 
Offline
Formerly 43110
User avatar

Joined: Wed Feb 05, 2014 7:01 am
Posts: 313
Location: us-east
In regards to the original post:

Interestingly I also used the $a3/$c5 polynomial for my RNG a couple years back. That post also has a 8 bit shifted version as well.

Currently I use the right shifted version, so that the least significant byte is automatically in accumulator, just like in the wiki page rainwarrior linked.

This is my 8 bit shifted routine of that:
Code:
rng_step_byte:
  lda rng_seed+3        ; Dealing with the low byte.
  lsr                   ; start with x^30 term.
  lsr
  lsr
  lsr
  eor rng_seed+3        ; x^26 term
  lsr
  eor rng_seed+3        ; x^25 term
  lsr
  eor rng_seed+0        ; XOR with original low byte.
  sta rng_seed+0
  lda rng_seed+3        ; Again for the high byte.
  asl                   ; Start with x^25 term
  eor rng_seed+3        ; x^26 term
  asl
  asl
  asl
  asl
  eor rng_seed+3        ; x^30 term
  asl
  asl
  eor rng_seed+3        ; x^32 term (or original high byte)
  ldy rng_seed+2        ; finaly, rotate the bytes.
  sty rng_seed+3
  ldy rng_seed+1
  sty rng_seed+2
  ldy rng_seed+0
  sty rng_seed+1
  sta rng_seed+0
rts


Last edited by JRoatch on Wed May 24, 2017 11:07 pm, edited 2 times in total.

Top
 Profile  
 
PostPosted: Wed May 24, 2017 11:02 pm 
Offline
User avatar

Joined: Thu Mar 31, 2016 11:15 am
Posts: 219
rainwarrior wrote:
Do you have a source for that code? When was it written? Why is it called "KISS"... it doesn't seem simple to me.)

On the 6502 though, all those 32-bit adds, and especially those 32-bit multi-shifts are really going to hurt!

Source: http://www0.cs.ucl.ac.uk/staff/d.jones/ ... iceRNG.pdf
Wikipedia: https://en.wikipedia.org/wiki/KISS_(algorithm)

It's a combination of LCG and LFSR engines to improve the randomness/period. I don't understand it beyond that.


Top
 Profile  
 
PostPosted: Wed May 24, 2017 11:27 pm 
Offline
User avatar

Joined: Sun Jan 22, 2012 12:03 pm
Posts: 5896
Location: Canada
AWJ wrote:
In an action game where cycles are precious, an advantage of a LFSR over a LCG is that all the bits in a LFSR are equally random. If you only ever need one or two bits at a time, you can get many random numbers out of each clocking of a LFSR.
...
clocked once per frame

While individual bits of an LFSR are equally random, they're not very independent of each other. If you sampled & 1 for one enemy and & 2 for another enemy, you may see correlated behaviours on successive frames as a "1" bit passes through one then the other. A lot of LFSR implementations are going to do the XOR only on one byte too, so the upper bytes become very correlated.

Similarly, if you're doing & 3 (two bits) you should clock it twice, or if & 7 (three bits) you should clock it three times. Failing to do so leads to very strong *2, *4, etc. correlations on successive tests.

For complete entropy you should clock it as many times as you read bits, but its somewhat asymptotic, 2 clocks is a lot better than 1 clock, but 7 clocks isn't that much better than 6. You can save time by doing less, if the quality isn't critical. (Also, there's no reason to clock it more times than bits read. It's as random as it's going to get at that point.)

Similarly, if you don't do PRNG tests every frame and try to vary their timing (and still clock it every frame), it breaks up a lot of the correlations that would otherwise appear. That can help too. If it's been 8 frames since you last did a PRNG test but you clocked it every frame, you've got a nice quality 8-bit number ready to use.

AWJ wrote:
If two of the same enemy are in the same RNG-tapping state on the same frame then they'll always both choose the same behaviour

I've seen that in several games, and I've always thought of it as a serious flaw, not something people hardly notice. :P I've often been annoyed at how unintentionally predictable RPG battles can become.

Anyhow, in a lot of cases, very poor quality PRNGs will seem sufficient. Clocking an LSFR once per frame may indeed be good enough sometimes, but it's often not immediately obvious how a poorly used PRNG is hurting things. Sometimes it takes a while to discover that in testing, but once discovered it becomes an easy to use exploit. Sometimes you don't realize why something plays out kinda crummy, and it's really just from poor PRNG use, but you can't tell because you're not comparing against a good PRNG.


I do like the LFSR generators, but my recommendation is to clock it once for every bit you need to read from it. Don't "optimize" before you begin by starting from the poorest entropy. Start with the good stuff, and optimize it only IF you need the extra cycles. Usually games do not need to make PRNG tests very frequently, and in such cases you'll gain almost nothing by using a poorer PRNG but there's a lot to lose. It's very likely efficient enough just to do it properly! You don't have to use some super slow cryptographically secure PRNG, there are plenty of simple ones that are reasonably fast and good quality.

In Zutano's example (OP) the entropy is controlled directly by the input parameter in X. If reading 3 bits, call it with X=3 and use the result & 7. That's basically perfect usage. Only use the lowest bits, and don't "share" the LFSR. Don't think about doing less work until PRNGs are your performance problem.


Top
 Profile  
 
PostPosted: Thu May 25, 2017 12:08 am 
Offline
User avatar

Joined: Mon Feb 07, 2011 12:46 pm
Posts: 941
A few additional sources of entropy which may be available (but cannot be relied on, and may not be suitable with all algorithms) can be the microphone input and uninitialized RAM.

Assuming the algorithm produces uniform output, what you can do if you need a uniform range that isn't in a power of 2: Discard all bits that are too high (using AND), and then if the number is too high, try again until it isn't too high.

You can execute the random number generator a lot during the title screen, and maybe even during play if possible, even if discarding the results in that case, in order to not get the same numbers each time. (In the cases where you do want same numbers each time, you can avoid this things.)

Another algorithm (I have used) is ARCFOUR. However, it requires 258 bytes of RAM.

_________________
.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 23 posts ]  Go to page 1, 2  Next

All times are UTC - 7 hours


Who is online

Users browsing this forum: Yahoo [Bot] and 5 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Search for:
Jump to:  
Powered by phpBB® Forum Software © phpBB Group