Random number

I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100

@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100Sweet! Obviously modulus is one of those funny mathematical concepts that doesn't really work for the big jobs.

@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
Does this particular implementation of rand() happen to return a number between 0 and 1?randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100

@PJH said:
@axarydax said:
That would be RND.I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
Does this particular implementation of rand() happen to return a number between 0 and 1?randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100

Looks like C which afaik returns a number in between 0 and 2^32  1 (or 64 on 64 bit CPUs?).

@Lingerance said:
Looks like C which afaik returns a number in between 0 and 2^32  1 (or 64 on 64 bit CPUs?).
Between 0 and RAND_MAX. Which is usualy 0x7FFF. Nowhere near the 2^321. More like 2^151.

I've seen worse.
do{
random = rand();
}
while(random < 100);

@Kain0_0 said:
@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100Sweet! Obviously modulus is one of those funny mathematical concepts that doesn't really work for the big jobs.Using modulus (or this subtraction nonsense) for this purpose is wrong: it introduces a systematic bias towards low values into the results. The correct solution is to divide by (RAND_MAX / 100).

@asuffield said:
@Kain0_0 said:
@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100Sweet! Obviously modulus is one of those funny mathematical concepts that doesn't really work for the big jobs.Using modulus (or this subtraction nonsense) for this purpose is wrong: it introduces a systematic bias towards low values into the results. The correct solution is to divide by (RAND_MAX / 100).
Sorta right reasoning, slightly wrong formula: http://cfaq.com/lib/randrange.html
This ignores the fact that in the OP, the code and the comment don't match (the code gives a number from [0,100), the comment indicates [0,100] is wanted)

@vt_mruhlin said:
I've seen worse.
do{
random = rand();
}
while(random < 100);
did you get that the wrong way about, or are you trying to get a random number that's *not* less than 100?

int getRandomNumber()
{
return 4; // chosen by fair dice roll.
// guaranteed to be random.
}

@asuffield said:
@Kain0_0 said:
@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100Sweet! Obviously modulus is one of those funny mathematical concepts that doesn't really work for the big jobs.Using modulus (or this subtraction nonsense) for this purpose is wrong: it introduces a systematic bias towards low values into the results. The correct solution is to divide by (RAND_MAX / 100).
No, that's not true, except for the trivial fact that the range of this PRNG won't include 100 in spite of the comment's spec (which can easily be fixed by changing the >= in the conditional to just a >). A good PRNG produces an equidistributed sequence. If a sequence {x_n} is equidistributed, the sequence {x_n mod y} is trivially equidistributed on [0,y).Proof: Consider the interval [ny, (n+1)y) with (n+1)y < MAX_RAND. Let z_i be a point in this interval. We can write it as z_i = ny + x_i, where x_i < y. Since the sequence is equidistributed on [ny, (n+1)y), the sequence {x_i} is equidistributed on [0, y). Note that z_i mod y = x_i.In fact, your solution is worse than using the modulus in the general case. Dividing by RAND_MAX will produce a float. There are two problems with using a float in this context. First, there's the problem of limited precision  distinct x's and y's might collide under the division. Second, you will need to round x/RAND_MAX, which is difficult to do without introducing systematic bias.

Gotta love XKCD

I think that the argument here is that PRNGs often aren't as random as we'd like, particularly in the loworder bits. As I understand it, under the right (wrong) circumstances, linear congruential generators can return low order bits that repeat with a period of 2^n where n us the number of bits you are looking for. In this case, with a max of 100, having a RNG that repeats after 2^7 requests is, let's say, suboptimal.
But that makes a lot of assumptions about how the PRNG is coded and how you're using it.
cw

@CodeWhisperer said:
I hear this problem is demonstrated in the original Quake  the nailgun's ("randomly selected") ricochet sounds repeat in a cycle of 4, IIRC; thwipthwipthwipTWANGthwipthwipthwipTWANG...I think that the argument here is that PRNGs often aren't as random as we'd like, particularly in the loworder bits. As I understand it, under the right (wrong) circumstances, linear congruential generators can return low order bits that repeat with a period of 2^n where n us the number of bits you are looking for. In this case, with a max of 100, having a RNG that repeats after 2^7 requests is, let's say, suboptimal.
But that makes a lot of assumptions about how the PRNG is coded and how you're using it.
cw

Getting [0,n]
@poopdeville said:
If a sequence {x_n} is equidistributed, the sequence {x_n mod y} is trivially equidistributed on [0,y).I don't think so. If MAX_RAND mod y isn't zero then you will get a bias toward low numbers.
Suppose MAX_RAND = 255 and y = 100. Three values of x will give x mod y = 1: 1, 101, and 201. But only two values of x will give x mod y = 99: 99 and 199. So a result of 1 will be 50% more likely than a result of 99.
The proper thing to do is first throw out high bits so that x is in the range [0,127]. Then reject any values of x > 100 and keep drawing until you get a value in [0,100].

@PJH said:
@asuffield said:
@Kain0_0 said:
@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100Sweet! Obviously modulus is one of those funny mathematical concepts that doesn't really work for the big jobs.Using modulus (or this subtraction nonsense) for this purpose is wrong: it introduces a systematic bias towards low values into the results. The correct solution is to divide by (RAND_MAX / 100).
Sorta right reasoning, slightly wrong formula: http://cfaq.com/lib/randrange.htmlThe formula on that page is independent of rounding modes. I was assuming C where the rule is roundtowardszero for integer division, in which case, RAND_MAX/N will work (IIRC). If your language uses roundtonearest or roundtoeven instead, you may need (RAND_MAX/N) + 1 (which should still work for roundtowardszero). That page probably uses the roundingindependent version because some old C implementations deviated from the spec and used the wrong rounding mode  not a big deal nowadays, they should all have gone away.

@poopdeville said:
@asuffield said:
@Kain0_0 said:
@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100Sweet! Obviously modulus is one of those funny mathematical concepts that doesn't really work for the big jobs.Using modulus (or this subtraction nonsense) for this purpose is wrong: it introduces a systematic bias towards low values into the results. The correct solution is to divide by (RAND_MAX / 100).
No, that's not true, except for the trivial fact that the range of this PRNG won't include 100 in spite of the comment's spec (which can easily be fixed by changing the >= in the conditional to just a >). A good PRNG produces an equidistributed sequence. If a sequence {x_n} is equidistributed, the sequence {x_n mod y} is trivially equidistributed on [0,y).Proof: Consider the interval [ny, (n+1)y) with (n+1)y < MAX_RAND. Let z_i be a point in this interval. We can write it as z_i = ny + x_i, where x_i < y. Since the sequence is equidistributed on [ny, (n+1)y), the sequence {x_i} is equidistributed on [0, y). Note that z_i mod y = x_i.In fact, your solution is worse than using the modulus in the general case. Dividing by RAND_MAX will produce a float. There are two problems with using a float in this context. First, there's the problem of limited precision  distinct x's and y's might collide under the division. Second, you will need to round x/RAND_MAX, which is difficult to do without introducing systematic bias.Let's suppose MAX_RAND is 256. So you map 0..256 to 0..99, 0..99 and 0..56,  i.e. 0..56 have a 0.01171875 each, 57..99 have a 0.0078125 chance each. 0..49 has a 58.6% probability and 50..99 is 41.4%.
If you divide, you'll have the same problem but it'll be more like 0 and 1 have the .007 chance, 2 has the .011 chance, 3, 4, have the .007 chance, 5 has .011, 6 has .007, 7 has .011, etc. There's no way to get it perfect, but at least this way you're not consistently more likely to roll low than high (00..49 and 50..99 are going to have as close to exactly 50% chance as you can get it.)
Obviously if MAX_RAND is close to a multiple of 100, it's less of a problem, and the bigger it is the less of a problem it is anyway.

@poopdeville said:
Proof: Consider the interval [ny, (n+1)y) with (n+1)y < MAX_RAND. Let z_i be a point in this interval. We can write it as z_i = ny + x_i, where x_i < y. Since the sequence is equidistributed on [ny, (n+1)y), the sequence {x_i} is equidistributed on [0, y). Note that z_i mod y = x_i.In order to prove your claim, you would need to show that for all p, q, y, the interval [p, q) is mapped evenly onto [0, y) by modulus y. What you have shown here is that for all y, there exists p, q such that the interval [p, q) that can be mapped evenly onto [0, y) by modulus y.Your claim was that any PRNG can generate any range correctly after modulus, and here's a disproof by showing that at least one exists which does not:Let RAND_MAX be 4, and target be the range 0..3.The PRNG generates values from the set {0, 1, 2, 3, 4}. The PRNG generates evenly distributed values, so for each value, the probability of it occurring is 1/5.Computing those values mod 4 gives us {0, 1, 2, 3, 0}. The probability of each value occurring here is:P(0) == 1/5 + 1/5 == 2/5P(1) == 1/5P(2) == 1/5P(3) == 1/5It is immediately apparent that this is not evenly distributed.In fact, your solution is worse than using the modulus in the general case. Dividing by RAND_MAX will produce a float.No it doesn't. That is the C integer division operator. It produces an integer.Second, you will need to round x/RAND_MAX, which is difficult to do without introducing systematic bias.Ironically enough, the equation I gave is the 'difficult' solution for rounding without systematic bias under C's rounding rules; somebody else gave the variant for doing it under any rounding rules. They work by placing two divide operations together such that their errors cancel rather than reinforce. Proving that they are correct solutions is quite difficult (you have to analyse the possible errors in the two divisions and determine that they do cancel each other out to within the precision of integers).

@Irrelevant said:
I hear this problem is demonstrated in the original Quake  the nailgun's ("randomly selected") ricochet sounds repeat in a cycle of 4, IIRC; thwipthwipthwipTWANGthwipthwipthwipTWANG...
Heh, I knew that sounded somewhat familiar, so I did a Google search...and found a hit on the WTF forums. (Definitely explains where I'd read about it before...) Not quite as you describe, but the way it actually was implemented makes even less sense.

@axarydax said:
I was browsing company's CAD/CAM software source code and I found this ellegant way to get random number in desired range :
randnum=rand();
while(randnum>=100) randnum=100; //need number from 0 to 100
while(randnum>100) randnum=100; //need number from 0 to 100or
while(randnum>=100) randnum=100; //need number from 0 to 99Fixed...