FW: bad rand generator: distribution/prevalences/periodicity

Discussion specific to NXT-G, NXC, NBC, RobotC, Lejos, and more.
afanofosc
Site Admin
Posts: 1256
Joined: 26 Sep 2010, 19:36
Location: Nashville, TN
Contact:

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by afanofosc »

Just put

Code: Select all

#ifdef _a_real_c_compiler
#include <cstdlib>
#define Random(_n) rand()%(_n)
#endif
at the top of your code and then use Random(n) where you would otherwise use rand()%n and enjoy a fast extremely good equal distribution of random prevalence.

How fast is your code compared to Random(16)?

John Hansen
Multi-platform LEGO MINDSTORMS programming
http://bricxcc.sourceforge.net/
HaWe
Posts: 2500
Joined: 04 Nov 2014, 19:00

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by HaWe »

I will test the speed asap!
For dev C++ rand()%(x) is no problem - so why use this macro?

But Random(_n) and rand()%(_n) seem to be different in NXC.
why?
afanofosc
Site Admin
Posts: 1256
Joined: 26 Sep 2010, 19:36
Location: Nashville, TN
Contact:

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by afanofosc »

muntoo wrote:Or maybe it was my code

(I might have done something dumb like:

Code: Select all

int rand4()
{
    return(Random() >= 4 ? 3 : Random());
}
)
I would say it was your code if that is what you used to generate a random number between 0 and 3. Seriously? You can't be serious.

John Hansen
Multi-platform LEGO MINDSTORMS programming
http://bricxcc.sourceforge.net/
HaWe
Posts: 2500
Joined: 04 Nov 2014, 19:00

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by HaWe »

I don't understand what you mean - btw, as mentioned before, it was not me who wrote this RNG, it have been probably Kerninghan and Ritchie.
I'm only using this code because it proved to generate equally distributed random numbers (even the differences between two consecutive drawn numbers are equaly distributed considering the degrees of freedom) - opposite to rand()%x .

But what about my question - why are the rand() - generated random series so awfully bad as I discovered by my test program (actually it was the fact, that the drawn numbers were so obviously un-randomly distributed that I wrote this test program to see if my suspect was right or not - and I was right...), and what is the reason that Random() is doing a better job than rand()%x ?
afanofosc
Site Admin
Posts: 1256
Joined: 26 Sep 2010, 19:36
Location: Nashville, TN
Contact:

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by afanofosc »

doc-helmut wrote:I will test the speed asap!
For dev C++ rand()%(x) is no problem - so why use this macro?

But Random(_n) and rand()%(_n) seem to be different in NXC.
why?
From what I can gather, people generally recommend not using rand()%n. See, for example, here:

http://www.c-faq.com/lib/randrange.html

John Hansen
Multi-platform LEGO MINDSTORMS programming
http://bricxcc.sourceforge.net/
HaWe
Posts: 2500
Joined: 04 Nov 2014, 19:00

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by HaWe »

well, I already read about this low-order bit stuff, but I'm not mathematician enough to understand why.
And my doubt is not easily for me to describe in English, so with a little help of my school mathematics:
If I have a big random sample survey of "real randomly drawn numbers" then all numbers n ∈ G ={0, 1, 2,...RAND-MAX } have the same chance P to be drawn
P(n)=P(m) ∀ n,m ∈ G
=> all fractions of any n devided by any x ∈ Z, x << RAND-MAX ("much less than" RAND-MAX), are equally distributed:
e.g. all odd and even numbers (nx, nx+1) ∈ G ={0, 1, 2,...RAND-MAX } are equally distributed ( <=> n%2),
all numbers (nx, nx+1, nx+2) ∈ G ={0, 1, 2,...RAND-MAX } are equally distributed( <=> n%3)
and so on.
(edited)
Only if the random sample survey was not randomly taken (e.g. by a bad RNG) then also the modulo function has no randomly distribution, too.

So have a good RNG and you will get a good rand()%x subset.

By the way, as I have been told, the Kernighan and Ritchie RNG is called a "linear congruence generator" which indeed seems to have issues
http://de.wikipedia.org/wiki/Kongruenzg ... zgenerator
http://en.wikipedia.org/wiki/Linear_con ... _generator.
Wikipedia wrote:LCGs are fast and require minimal memory (typically 32 or 64 bits) to retain state. This makes them valuable for simulating multiple independent streams.
(picture: Hyperplanes of a linear congruential generator in three dimensions)

LCGs should not be used for applications where high-quality randomness is critical. For example, it is not suitable for a Monte Carlo simulation because of the serial correlation (among other things). They should also not be used for cryptographic applications; see cryptographically secure pseudo-random number generator for more suitable generators. If a linear congruential generator is seeded with a character and then iterated once, the result is a simple classical cipher called an affine cipher; this cipher is easily broken by standard frequency analysis.

LCGs tend to exhibit some severe defects. For instance, if an LCG is used to choose points in an n-dimensional space, the points will lie on, at most, m1/n hyperplanes (Marsaglia's Theorem, developed by George Marsaglia). This is due to serial correlation between successive values of the sequence Xn. The spectral test, which is a simple test of an LCG's quality, is based on this fact.

A further problem of LCGs is that the lower-order bits of the generated sequence have a far shorter period than the sequence as a whole if m is set to a power of 2. In general, the nth least significant digit in the base b representation of the output sequence, where bk = m for some integer k, repeats with at most period bn.

Nevertheless, LCGs may be a good option. For instance, in an embedded system, the amount of memory available is often severely limited. Similarly, in an environment such as a video game console taking a small number of high-order bits of an LCG may well suffice. The low-order bits of LCGs when m is a power of 2 should never be relied on for any degree of randomness whatsoever. Indeed, simply substituting 2n for the modulus term reveals that the low order bits go through very short cycles. In particular, any full-cycle LCG when m is a power of 2 will produce alternately odd and even results.
But anyway, it obviously seems to be relatively quite better than Lego rand(). My program only should show the Lego rand() issue, it should not claim that the Kernighan and Ritchie thing was the "best RNG ever".

But: If I once will manage to write a Monte Carlo simulation I will need to have a better RNG and will surely substitute the Kernighan and Ritchie thing by this even better one.
Last edited by HaWe on 22 Jun 2011, 06:17, edited 2 times in total.
afanofosc
Site Admin
Posts: 1256
Joined: 26 Sep 2010, 19:36
Location: Nashville, TN
Contact:

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by afanofosc »

In my experience, if people generally recommend not using a particular algorithm because it is a poor algorithm I stop using it. I don't care whether I understand why it is bad or whether it is only bad in certain implementations of C libraries or hardware platforms. If you use

Code: Select all

    j = rand() / ((RAND_MAX-1)/16+1);
instead of

Code: Select all

  j = rand()%16;
you will get nearly perfect results. This suggests that the actual problem, as people generally warn, is with using modulus and rand to get a range of equally distributed random numbers. NOTE that I am using RAND_MAX-1 since at present RAND_MAX is incorrectly defined in NBCCommon.h. RAND_MAX should be 32767 rather than 32768 and the rand() macro should be Random(RAND_MAX+1) rather than Random(RAND_MAX). The range of values returned by Random is not inclusive of the max value passed into the function whereas rand() should return a value from 0 through RAND_MAX (inclusive). Also note that ((RAND_MAX-1)/16+1) is optimized at compile-time to be 2048 since there are no spaces in the expression and all the values are constants.

These changes, which I have already made to the header files, will bring NXC closer in line with other standard C libraries.

P.S., when I was quoting muntoo earlier I was asking him a question - not you. Sorry for the confusion.

John Hansen
Multi-platform LEGO MINDSTORMS programming
http://bricxcc.sourceforge.net/
HaWe
Posts: 2500
Joined: 04 Nov 2014, 19:00

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by HaWe »

sorry,
(RAND_MAX-1)/16+1) is optimized at compile-time to be 2048
?
RAND_MAX-1=32767 // wrong implementation - IMO you really should fix that!
(RAND_MAX-1)/16+1) = 32767/17= 1927.47 ~ 1927 to my opinion...(?!?)
afanofosc
Site Admin
Posts: 1256
Joined: 26 Sep 2010, 19:36
Location: Nashville, TN
Contact:

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by afanofosc »

You have an order of operation problem in your evaluation of this expression. Please don't try to fight this. Using %n is a bad idea. Divisions always come before additions unless you use parenthesis to change the OOO.

John Hansen
Multi-platform LEGO MINDSTORMS programming
http://bricxcc.sourceforge.net/
HaWe
Posts: 2500
Joined: 04 Nov 2014, 19:00

Re: Lego FW:extrem poor equal distribution of random prevalences

Post by HaWe »

:oops:
(32767/16)+1 = (int) (2047.93 +1 ) = 2048.
yes, sure!
:oops:
thx for your explanations!
Post Reply

Who is online

Users browsing this forum: No registered users and 6 guests