Your browser does not seem to support JavaScript. As a result, your viewing experience will be diminished, and you may not be able to execute some actions.

Please download a browser that supports JavaScript, or enable it if it's disabled (i.e. NoScript).

V8's implementation of Math.random() was fucked

J
jmp last edited by

There’s Math.random(), and then there’s Math.random() · V8

TL;DR: Google's JavaScript engine used an old, not-particularly-good PRNG to implement Math.random(), and they used an old version of it that was known-buggy when they integrated it. The result was random output that looks like this:
http://1.bp.blogspot.com/-92e22bUMwwk/VnKQn2VT3EI/AAAAAAAAB-E/8XD12xuGTXc/s1600/Untitled%2Bdrawing.png
Reply Quote 11
1 Reply Last reply
FrostCat Discourse touched me in a no-no place last edited by

I remember bolting the Mersenne Twister into a roguelike about 15 years ago for basically that reason.
Reply Quote 3
1 Reply Last reply
blakeyrat last edited by

Man that explains why that WebGL shader I wrote ages ago couldn't produce any decent noise.
Reply Quote 4
N 1 Reply Last reply
A
anonymous234 last edited by

With the Mersenne Twister being so good, well tested and widely available, why would anyone bother using anything else?

Heck, V8 is written in C++, why not just call rand() and let the standard library do it for you?
(edit: yes, yes, I get it, rand() sucks)

Well, at least it's not as bad as Java's "SecureRandom".
Reply Quote 2
L J 8 Replies Last reply
L
LB_ last edited by @anonymous234

@anonymous234 said:
why not just call rand() and let the standard library do it for you?

Because rand() is notoriously bad (far worse than the broken algorithm that was in use) and is going to be deprecated soon if it isn't already. Instead, we now use the random number generators which were introduced in C++11:

Standard library header (C++11) - cppreference.com
Reply Quote 1
1 Reply Last reply
J
jmp last edited by @anonymous234

Mersenne Twister is relatively slow compared to other PRNG implementations, and it has a relatively large amount of state (~2 kilobytes).

But yes, it would have been a way better decision.

(in addition to rand() often being famously terrible like LB_ pointed out, IIRC the algorithm isn't specified, and the V8 developers probably don't want to rely on the standard library they end up linking with having a sane implementation)
Reply Quote 1
1 Reply Last reply
Deadfast last edited by @jmp

@jmp said:
V8 developers probably don't want to rely on the standard library they end up linking with having a sane implementation

So they decided to be sure and wrote their own insane implementation.
Reply Quote 17
1 Reply Last reply
J
jmp last edited by @anonymous234

Oh, also RAND_MAX is only guaranteed to be >= 32767, whereas JS rand() is required to return values up to 2**32. Expanding the former to the latter could result in some distribution and periodicity issues.
Reply Quote 5
1 Reply Last reply
FrostCat Discourse touched me in a no-no place last edited by @anonymous234

@anonymous234 said:
With the Mersenne Twister being so good, well tested and widely available, why would anyone bother using anything else?

Generally because you're talking about software written before MT existed, or that was written by people who didn't know about it, or because it wasn't necessarily a good fit for the application.

@anonymous234 said:
Heck, V8 is written in C++, why not just call rand() and let the standard library do it for you?

It's well-known that lots of C and C++ implementations historically have had bad PRNGs behind rand() et all.
Reply Quote 1
1 Reply Last reply
FrostCat Discourse touched me in a no-no place last edited by @jmp

@jmp said:
RAND_MAX is only guaranteed to be >= 32767

Galactic Bloodshed was written with the expectation of a 48-bit RNG being available and (for example) universe generation failed horribly with a 16-bit RNG.
Reply Quote 0
1 Reply Last reply
rc4 last edited by @anonymous234

Because it's slow. I would recommend an LFSR or xorshift.
Reply Quote 0
1 Reply Last reply
ben_lubar last edited by @anonymous234

@anonymous234 said:
Heck, V8 is written in C++

@anonymous234 said:
standard library

http://www.cplusplus.com/reference/random/mersenne_twister_engine
Reply Quote 1
1 Reply Last reply
JBert Fake News last edited by @ben_lubar

Hanzo'd by LB_, it got only introduced in C++11.
Reply Quote 1
2 Replies Last reply
ben_lubar last edited by @JBert

C++0x is ancient. It's like a quarter of my age. That's pretty old for software.
Reply Quote 2
1 Reply Last reply
cvi last edited by @JBert

@JBert said:
Hanzo'd by LB_, it got only introduced in C++11.

The RNGs were already included in the TR1.

@ben_lubar said:
~~C++0x~~TR1 is ancient. It's like a ~~quarter~~half of my age.

FTFY.
Reply Quote 1
1 Reply Last reply
blakeyrat last edited by @FrostCat

@FrostCat said:
Galactic Bloodshed

Yes and obviously this site (game?) nobody's ever heard of should be the driving force behind ALL technology decisions everywhere.
Reply Quote 1
1 Reply Last reply
FrostCat Discourse touched me in a no-no place last edited by @blakeyrat

@blakeyrat said:
Yes and obviously this site (game?) nobody's ever heard of should be the driving force behind ALL technology decisions everywhere.

In a way similar to the way everyone's heard of "retsupurae", right?

You don't need to be familiar with it for the purpose of the reason I mentioned it.
Reply Quote 0
1 Reply Last reply
boomzilla ♿ (Parody) last edited by @FrostCat

@FrostCat said:
In a way similar to the way everyone's heard of "retsupurae", right?

That always sounds like some kind of gross medical condition.
Reply Quote 4
1 Reply Last reply
David_C last edited by @anonymous234

@anonymous234 said:
why not just call rand() and let the standard library do it for you?

As far as the C/C++ standard is concerned that would be fine. It doesn't specify any algorithm. Unfortunately, POSIX specifies an algorithm, and it's not a very good one. So people on POSIX-compliant systems are SOL if they use rand().

POSIX long ago introduced random() which is a much better algorithm. Why they didn't simply change the algorithm used by rand() (as many non-POSIX platforms did) instead is a good question. I assume they were concerned that some apps might be dependent on the specific algorithm used and didn't want to risk breaking them.

The C++11 approach is an interesting solution to this problem. Instead of providing a single implementation-defined algorithm, there is a basket full of different well-defined algorithms to choose from. Great if you're doing something where it's important, but a bit confusing if you don't care and just want to get a reasonably-random sequence of numbers.
Reply Quote 1
3 Replies Last reply
NedFodder last edited by @David_C

@David_C said:
you don't care and just want to get a reasonably-random sequence of numbers.

That's what std::default_random_engine is for. There's no excuse to call rand() in modern C++.
Reply Quote 2
1 Reply Last reply
cartman82 last edited by

Just call it 2-3 times in a loop, what's the big deal, eh?
Reply Quote 3
1 Reply Last reply
NeighborhoodButcher area_pol last edited by @anonymous234

@anonymous234 said:
Heck, V8 is written in C++, why not just call rand() and let the standard library do it for you?

Because rand() is not a C++ way to generate random numbers. rand() is C and it sucks balls.
Reply Quote 1
1 Reply Last reply
FrostCat Discourse touched me in a no-no place last edited by @David_C

@David_C said:
I assume they were concerned that some apps might be dependent on the specific algorithm used and didn't want to risk breaking them.

Historically, that seems to be the reason for a lot of seemingly strange decisions by both the C and C++ committees.
Reply Quote 1
1 Reply Last reply
David_C last edited by @FrostCat
@FrostCat said:
@David_C said:
I assume they were concerned that some apps might be dependent on the specific algorithm used and didn't want to risk breaking them.

Historically, that seems to be the reason for a lot of seemingly strange decisions by both the C and C++ committees.

Looking a bit further, this situation (rand) appears to not be the fault of any standards body but a de-facto standard by C library implementers who decided for some reason (probably backward compatibility) that it would be better to leave the broken alorithm in-place and tell developers to use a new function for new applications.

ISO-9899:2011 (C language) doesn't mandate any particular algorithm for rand() - it only says (section 7.22.2.1):
- There is no requirement to avoid race conditions with other pseudo-random functions
- The implementation shall behave as if no library function calls rand
- RAND_MAX shall be at least 32767
C++'s ISO standard doesn't mention rand() at all. Its support is only implied by the fact that <cstdlib> is meant to wrap the entire contents of C's <stdlib.h>.

I had thought POSIX specified an implementation, but when I looked up [URL=http://pubs.opengroup.org/onlinepubs/9699919799/functions/rand.html]that standard[/URL], the definition is a clone of the C language definition, and explicitly states that any differences are unintentional.

So there really is no excuse for a compiler/library to ship with the well-known-broken version of rand() that is commonly (but not universally) shipped. Fortunately, it appears that those implementations that use the broken algorithm also include the set of POSIX random number functions as well, so portable code written in C (or that needs to be portable to a pre-11 C++ compiler) can work around the problem without providing a private implementation. (For example, write a wrapper function that either calls random() or rand() depending on what's available) But there is no good reason why developers should have to do this.

Looking a bit further, it appears that the Linux community did decide to fix this at long last. The [URL=http://linux.die.net/man/3/rand]man page[/URL] states that rand() and random() use the same generator these days and that only older implementations use the broken algorithm.

Of course, neither of these algorithms is cryptographically secure. But that's not a problem if the number are not being used for security purposes.
Reply Quote 1
1 Reply Last reply
Gurth last edited by

That “before” image looks a lot better to replicate snow on a TV screen than pure random noise does.
Reply Quote 2
1 Reply Last reply
FrostCat Discourse touched me in a no-no place last edited by @David_C

@David_C said:
Looking a bit further, this situation (rand) appears to not be the fault of any standards body but a de-facto standard by C library implementers who decided for some reason (probably backward compatibility) that it would be better to leave the broken alorithm in-place and tell developers to use a new function for new applications.

Yeah. I'm just saying, whenever you see something weird like that ("why wasn't this fixed?") in C, it probably was due to a major customer of a major compiler vendor who "needed" it not to change after it was discovered to be broken. Because from what I've heard over the years, that happens a lot, and therefore much of the UB in C is left in because someone's application depended on it decades ago.
Reply Quote 2
1 Reply Last reply
N
nexekho last edited by @blakeyrat

If you were using noise1() or something in the fragment shader, that wouldn't have been using this; instead, it's up to the driver vendor for your GPU. It's so inconsistent and frequently flat out bad that most shaders use a texture when they need noise.
Reply Quote 1
1 Reply Last reply
cvi last edited by @nexekho

@nexekho said:
frequently flat out bad

To translate "flat out bad" into something more tangible: most GL implementations have noise*() functions that just always return 0.
Reply Quote 5
1 Reply Last reply
dkf Discourse touched me in a no-no place last edited by @David_C

@David_C said:
I assume they were concerned that some apps might be dependent on the specific algorithm used and didn't want to risk breaking them.

Some people doing Monte Carlo simulation techniques want random numbers, yet exactly reproducible results so that others can redo their work and get the same answers out, down to the bit.

I do not think the word “random” means what they think it means.
Reply Quote 1
1 Reply Last reply
cvi last edited by @dkf

@dkf said:
Some people doing Monte Carlo simulation techniques want random numbers, yet exactly reproducible results so that others can redo their work and get the same answers out, down to the bit.

This is also true of most procedural methods (procedural terrain, textures, geometry, whatever). For the same input sequence, you want to get the same outputs.

The above-mentioned noise*() was clearly intended for this purpose. It doesn't have any internal state, rather the user passes some value to it and noise*() returns a some other value with some properties.

As for Monte Carlo simulations ... you should kinda get the same answers regardless of the sequence of random numbers, right? Because otherwise it's sort-of bad (like, not properly converged).
Reply Quote 0
1 Reply Last reply
dkf Discourse touched me in a no-no place last edited by @cvi

@cvi said:
As for Monte Carlo simulations ... you should kinda get the same answers regardless of the sequence of random numbers, right? Because otherwise it's sort-of bad (like, not properly converged).

Yes. But a lot of people who use MC techniques also want exact reproducibility (as opposed to just to some probably-physically-reasonable number of significant figures). That the concepts are opposed to each other in a deep way doesn't seem to impinge upon their thought processes.
Reply Quote 0
A 2 Replies Last reply
boomzilla ♿ (Parody) last edited by @dkf

@dkf said:
That the concepts are opposed to each other in a deep way doesn't seem to impinge upon their thought processes.

Meh...if you're using pseudo random numbers, you're using pseudo random numbers.
Reply Quote 0
1 Reply Last reply
A
anonymous234 last edited by @dkf

The concepts are not opposed, if you define the algorithm and the starting state, you can get the same sequence of (for all intents and purposes) random numbers.

But you have to make sure the library actually guarantees the same output for the same seed, even across platforms. 90% of the time it does not, but since it happens to work on the computer they test it on, people just assume it will work everywhere.
Reply Quote 0
1 Reply Last reply
Ashley_Sheridan last edited by

Well, looking at those images actually illustrates nothing. Both are completely valid examples of a random output, it's just that people tend to spot patterns in anything and deem anything with a pattern as not random.

Now I'm not saying that the original Math.random() implementation was fine, just that the example image that's meant to show the problem isn't really showing anything.
Reply Quote 1
J 1 Reply Last reply
J
jmp last edited by @Ashley_Sheridan

From the link:
"The more significant upper half of the result is almost entirely dependent on the value of state0. The period length would be at most 232, but instead of few large permutation cycles, there are many short ones. With a badly chosen initial state, the cycle length could be less than 40 million." (my bold)

"It fails many statistical tests in the TestU01 suite."

There is an actual problem here.

If your randomness troll just says "9" all the time, well maybe it's using a random process with uniform probability to generate its results, but it probably isn't. If your random numbers have structure when viewed through some lense, well there goes your uniform probability.

Looking at the code for the visualization: http://bl.ocks.org/mmalone/bf59aa2e44c44dde78ac it looks like it's doing something vaguely similar to the spectral test: https://en.wikipedia.org/wiki/Spectral_test . Looks like there's some correlation between two random draws one after the other; that causes the 'lines' in the image.

tl;dr if you have a = random(); b = random() it's desirable that knowing a shouldn't let you infer much about what b is; the lines in the image represent information you can infer about what b is.
Reply Quote 0
1 Reply Last reply
Ashley_Sheridan last edited by @jmp

If I roll a die 100 times in a row and get 6's each time, that is still random and doesn't mean it's weighted. I'm saying that you can't tell if something is truly random from the results only. You have to look at the underlying implementation of the random engine to see if it is giving random results (to within acceptable parameters, these are computers after all).
Reply Quote 0
J 3 Replies Last reply
Nocha last edited by @Ashley_Sheridan

@Ashley_Sheridan said:
If I roll a die 100 times in a row and get 6's each time, that is still random and doesn't mean it's weighted.

that might be the case, but I can pretty much guarantee that if you manage it in just about any casino you won't be welcome back. An outcome like that could be random, but is far more likely to be foul play.
Reply Quote 0
1 Reply Last reply
J
jmp last edited by @Ashley_Sheridan

The probability of that outcome is 1/699 (assuming it's any particular number 100 times in a row, not just 6), so a Bayes factor of ~1078 towards a perfectly-weighted die that always rolls a particular number. I'm not saying that's /certain/, I'm just saying if you had rolled a d6 a hundred times every second since the start of the universe you still wouldn't expect it to have happened.

If you can compress a long sequence of numbers, it was almost certainly not generated by a random process.

Yes this is statistical, but it's a pretty good bet.
Reply Quote 1
2 Replies Last reply
Ashley_Sheridan last edited by @Nocha

That's the thing! It is a perfectly random possibility, but people have a bad habit of seeing patterns where none exist!
Reply Quote 0
1 Reply Last reply
Ashley_Sheridan last edited by @jmp

@jmp sorry, but you just don't understand probability then. Any series of 100 dice rolls, there's the same chance of 100 6's as any other combination. Any combination has the same 1/6**99 (assuming your math is correct). It's a fallacy that random numbers ought to be random by a general persons observation. There's a big difference between actual random and perceived random.
Reply Quote 0
J A 4 Replies Last reply
J
jmp last edited by @Ashley_Sheridan

The probability I gave was for "every number the same", not for "the specific sequence 6, 6, 6...6". The distinction is important; it dodges the issue where any given sequence has a very low likelihood.

Thinking about this in terms of 'compressibility' is a better way; a sequence of 100 6s is very compressible, an actually random sequence is much less so.

The image on the left is more compressible than the image on the right, because you can exploit the lines.

Random number generators that produce output like that do not pass statistical tests for randomness. That's what the statistical tests /look for/.
Reply Quote 0
1 Reply Last reply
cvi last edited by @Ashley_Sheridan

No, I'd still say that @jmp is correct. Given any 100-long sequence of numbers ∈ {1..6}, the probability of observing that sequence is around 1e-78 with unbiased dice.

Essentially, if I give you such a sequence and you then roll your dice and repeat that sequence, it would be a good assumption that your dice are biased because it's very unlikely (prob = 1e-78) that you would roll that sequence with unbiased dice.

The fact that the example sequence is a repetition of a single number doesn't matter.
Reply Quote 0
1 Reply Last reply
A
anonymous234 last edited by @Ashley_Sheridan

@Ashley_Sheridan said:
@jmp sorry, but you just don't understand probability then. Any series of 100 dice rolls, there's the same chance of 100 6's as any other combination. Any combination has the same 1/6**99 (assuming your math is correct). It's a fallacy that random numbers ought to be random by a general persons observation. There's a big difference between actual random and perceived random.

Here:

Kolmogorov complexity - Wikipedia

Algorithmic probability - Wikipedia

Randomness is a mathematically well-defined concept, even if hard to actually make deductions about it in practice.

And deductions about an algorithm from its output can be done, the same way we make deductions about the underlying "algorithm" of the universe by looking at its output, with bayesian inference.

If there's a simple algorithm A that produces an output with relatively high probability, and another simple algorithm B that produces the same output with very low probability, and we observe that output somewhere, we can say that algorithm A is much more likely to be the one behind it.

For example, since the probability of a fair die giving the 100 sixes in a row is 0.0000000000000000000000000000000000000000000000000000000000000000000000000000015, and let's say the probability of that random die you bought is somewhere being weighted is in the order of 0.01%, then we can say with 99.999999999999999999999999..9% confidence that the second hypothesis is the true one. That level of confidence is much higher than many things we already assume to be true (like your RAM not spontaneously flipping some bits and returning entirely different results when you run the function) so it's entirely fine to say we "know" the die to be loaded.

So TL;DR: we can see patterns in the first picture because there are patterns, and we can deduce the algorithm is not random from that.
Reply Quote 2
1 Reply Last reply
xaade last edited by

If you squint your eyes at the first one, you'll see Abraham Lincoln.
Reply Quote 7
1 Reply Last reply
dkf Discourse touched me in a no-no place last edited by @anonymous234

@anonymous234 said:
That level of confidence is much higher than many things we already assume to be true (like your RAM not spontaneously flipping some bits and returning entirely different results when you run the function) so it's entirely fine to say we "know" the die to be loaded.

No, we just have to consider whether we believe that the dice are loaded or that we've seen a very rare event. In the space of all possible universe configurations, both hypotheses have non-zero probability. One way to determine this is to observe even more rolls; if the sixes keep coming, the confidence in the “loaded” possibility grows, whereas if there is a range of numbers being generated then it looks more like we just observed something rare.

Or that someone's doctored the dice so that the loading of them can be disabled remotely.
Reply Quote 0
1 Reply Last reply
Gurth last edited by @Ashley_Sheridan

@Ashley_Sheridan said:
If I roll a die 100 times in a row and get 6's each time, that is still random and doesn't mean it's weighted.

It might instead mean that reality is defective:
Reply Quote 2
1 Reply Last reply
xaade last edited by @Gurth

what the...?

binary die?
Reply Quote 0
A 2 Replies Last reply
A
aliceif last edited by @xaade

d4
Reply Quote 0
1 Reply Last reply
xaade last edited by @aliceif

oh, so the side facing down is the number

This one is more intuitive
Reply Quote 1
2 Replies Last reply
Maciejasjmj last edited by @xaade

@xaade said:
binary die?

Also known as "a coin".
Reply Quote 9
1 Reply Last reply

55
Posts

3266
Views

Log in to reply

1
2

1 / 2