Random Stupidity



  • I found this deep in the bowels of something written by our offshore counterparts, in its entirety:

    /**
     * We all know that random number generators aren't truly random.
     * This class generates random numbers a bit more randomly.
     */
    public final class Randomizer {
        // OP Note: not sure why matching square brackets don't show up in post...
        private static final char [] digits = { '0','1','2','3','4','5','6','7','8','9' };
    
        // Allow this to be instantiated in parallel in multiple threads
        public Randomizer() {
        }
    
        public final double random() {
            final StringBuilder sb = new StringBuilder();
            sb.append(getRandomDigitString());
            sb.append(".");
            sb.append(getRandomDigitString());
            return Double.parseDouble(sb.toString());
        }
    
        private String getRandomDigitString() {
            final StringBuilder sb = new StringBuilder();
            int n = getRandomNumberOfDigits();
            for (int i=0; i<n; i++) {
                sb.append(getRandomDigit());
            }
            if (n==0) {
                sb.append("0");
            }
            return sb.toString();
        }
    
        private char getRandomDigit() {
            return digits[(int)(Math.random() * 10)];
        }
    
        private int getRandomNumberOfDigits() {
            return (int) ((Math.random() * 100) * Math.random());
        }
    }
    


  • Ah, India....

    Reinventing methods poorly since 1999.

    I like how it doesn't even get a seed.

     



  •  You know, I remember discovering QuickBASIC's pseudo-randomness when I was 10 or 11, and even then I didn't think that pseudo-randomizing a pseudo-random random would be randomer.

    I could see this providing a somewhat longer *visible* cycle time between obvious repetitions with a really, really crappy generator, but... yeah. WTF indeed.

    If you really need a true random, just find a modern 200+gb hard drive more than a year old and try to read from a file... 



  • @PeriSoft said:

    I could see this providing a somewhat longer *visible* cycle time between obvious repetitions with a really, really crappy generator, but... yeah. WTF indeed.

     

     

    Well, getRandomNumberOfDigits is more likely to give a small number than a larger one, so the code is horribly broken. I'm not sure fixing that would make the class a fair randomness generator.
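
    A minimal sketch of how one might check that skew empirically. The class DigitCountHistogram and its methods are hypothetical, not part of the original Randomizer, and it assumes a seeded java.util.Random in place of Math.random() so that runs are repeatable:

```java
import java.util.Random;

// Hypothetical harness, not part of the original Randomizer class.
final class DigitCountHistogram {
    // Same formula as getRandomNumberOfDigits(), but drawing from a seeded
    // java.util.Random instead of Math.random() so the run is repeatable.
    static int randomNumberOfDigits(Random rng) {
        return (int) ((rng.nextDouble() * 100) * rng.nextDouble());
    }

    // Counts how many of `samples` draws land in each band of ten
    // (0-9, 10-19, ..., 90-99).
    static int[] histogram(long seed, int samples) {
        Random rng = new Random(seed);
        int[] buckets = new int[10];
        for (int i = 0; i < samples; i++) {
            buckets[randomNumberOfDigits(rng) / 10]++;
        }
        return buckets;
    }

    public static void main(String[] args) {
        int[] buckets = histogram(42L, 100_000);
        for (int b = 0; b < buckets.length; b++) {
            System.out.printf("%2d-%2d: %d%n", b * 10, b * 10 + 9, buckets[b]);
        }
    }
}
```

    If the digit count were uniform, every band would hold roughly the same number of samples; with the product of two random values, the low bands should dominate.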



  • @PeriSoft said:

    ... randomer...

     

     Heh.  New tech slang to use at our next dev meeting.  Randomer.



  • @Jonathan Holland said:

    Ah, India....

    Reinventing methods poorly since 1999.

    I like how it doesn't even get a seed.

     

    Hey, it's their fault. They should've asked for good codes.



  • WTF. getRandomNumberOfDigits does not even produce a uniform distribution. Knuth would be proud.
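
    A short derivation of how far from uniform it is, assuming the two Math.random() calls behave like independent uniform values U1 and U2 on [0, 1) (an idealization, not a property of any particular generator):

```latex
% N = \lfloor 100\, U_1 U_2 \rfloor with U_1, U_2 independent and uniform on [0,1)
\[
  F(z) = P(U_1 U_2 \le z)
       = \int_0^1 \min\!\left(1, \frac{z}{u}\right) du
       = z + z \int_z^1 \frac{du}{u}
       = z\,(1 - \ln z), \qquad 0 < z \le 1 .
\]
\[
  P(N = k) = F\!\left(\tfrac{k+1}{100}\right) - F\!\left(\tfrac{k}{100}\right),
  \qquad k = 0, 1, \dots, 99 .
\]
\[
  P(N = 0) = 0.01\,(1 + \ln 100) \approx 0.056,
  \qquad
  P(N = 99) = 1 - 0.99\,(1 - \ln 0.99) \approx 5.0 \times 10^{-5} .
\]
```

    A uniform digit count would give every k a probability of 0.01, so under this idealization zero digits is over five times too likely and 99 digits is about two hundred times too rare.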

     



  • // Allow this to be instantiated in parallel in multiple threads

    At the risk of introducing even more stupidity, this brings up something I've wondered for some time already: Couldn't you create a pretty good "true" random number generator using a kind of "controlled race condition"?

    Imagine something like this:

    // Thread 1:
    forever {
        number++; // adjust actual formulas for uniform distribution
        sleep(1);
    }

    // Thread 2:
    forever {
        number--;
        sleep(1);
    }

    // Thread 3:
    int getRandomNumber() {
        return number;

    Of course this is still technically deterministic, but would actually depend on so many different variables - many of them changing over time even - that it didn't matter. Or would it?

    Where is the flaw? 



  • @PSWorx said:

    // Allow this to be instantiated in parallel in multiple threads

    At the risk of introducing even more stupidity, this brings up something I've wondered for some time already: Couldn't you create a pretty good "true" random number generator using a kind of "controlled race condition"?

    Imagine something like this:

    // Thread 1:
    forever {
        number++; // adjust actual formulas for uniform distribution
        sleep(1);
    }

    // Thread 2:
    forever {
        number--;
        sleep(1);
    }

    // Thread 3:
    int getRandomNumber() {
        return number;

    Of course this is still technically deterministic, but would actually depend on so many different variables - many of them changing over time even - that it didn't matter. Or would it?

    Where is the flaw? 

     

     

    Well, this would almost always return the same number.

    Better would be taking a cryptographically secure hash of a pseudo random number and the system time.



  • @PSWorx said:

    // Allow this to be instantiated in parallel in multiple threads

    At the risk of introducing even more stupidity, this brings up something I've wondered for some time already: Couldn't you create a pretty good "true" random number generator using a kind of "controlled race condition"?

    Imagine something like this:

    // Thread 1:
    forever {
        number++; // adjust actual formulas for uniform distribution
        sleep(1);
    }

    // Thread 2:
    forever {
        number--;
        sleep(1);
    }

    // Thread 3:
    int getRandomNumber() {
        return number;

    Of course this is still technically deterministic, but would actually depend on so many different variables - many of them changing over time even - that it didn't matter. Or would it?

    Where is the flaw? 

     

     

    So thread 3 returns the number of times A has been scheduled since T minus the number of times B has been scheduled since T? It really would not be very random, and certainly not uniform, since it is extremely biased to 0 on both sides of the number line. It would also depend on the process scheduling algorithm which could vary widely between operating systems, even different versions of the same operating system. I think if you wanted to use the system environment as the input to your RNG there are better properties to use. 
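
    A minimal sketch of the proposed scheme, for anyone who wants to watch it hover around zero. RaceCounter and its methods are hypothetical names; the "forever" loops are replaced by bounded loops so the program terminates, and an AtomicInteger is assumed so that no updates are lost:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the proposed scheme; the loops are bounded
// (instead of "forever") so that the program terminates.
final class RaceCounter {
    static final AtomicInteger number = new AtomicInteger();

    // Starts a thread that adds `delta` to the shared counter `steps` times,
    // sleeping 1 ms after each update as in the pseudocode.
    static Thread worker(int delta, int steps) {
        Thread t = new Thread(() -> {
            for (int i = 0; i < steps; i++) {
                number.addAndGet(delta);
                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });
        t.start();
        return t;
    }

    // Runs both workers and samples the counter while they are alive.
    static List<Integer> sample(int steps) {
        number.set(0);
        Thread up = worker(+1, steps);
        Thread down = worker(-1, steps);
        List<Integer> samples = new ArrayList<>();
        try {
            while (up.isAlive() || down.isAlive()) {
                samples.add(number.get()); // "Thread 3": getRandomNumber()
                Thread.sleep(1);
            }
            up.join();
            down.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sampling", e);
        }
        samples.add(number.get()); // final value after both workers finished
        return samples;
    }

    public static void main(String[] args) {
        System.out.println(sample(100));
    }
}
```

    Every sample is the number of increments so far minus the number of decrements so far, so it can never leave the range [-steps, steps], and once both workers finish it is exactly 0. How far it drifts in between depends on the scheduler.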

     

     

    And I'm not sure what a "crypto-graphically secure" number really has to do with anything. The way standard RNGs work currently, as I understand it, is that they present very, very long series of numbers, indexed by the seed you supply. The values in the series are not significantly correlated to the value of the seed, and the values of numbers in the same series are not significantly correlated to each other. Most libraries do an excellent job of this, and given those properties, you only need ONE value that will not likely have the exact same value every time it is needed. The system time is an excellent candidate. There's no need to take numbers from multiple sources and marry them with any kind of convoluted mathematical strategy.
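
    A small sketch of the "series indexed by the seed" idea, assuming java.util.Random as a stand-in for a standard library RNG. SeedDemo and firstInts are hypothetical names:

```java
import java.util.Arrays;
import java.util.Random;

final class SeedDemo {
    // Returns the first `n` ints produced by a generator built from `seed`.
    static int[] firstInts(long seed, int n) {
        Random rng = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextInt();
        }
        return out;
    }

    public static void main(String[] args) {
        // The same seed always selects the same series.
        System.out.println(Arrays.equals(firstInts(1234L, 5), firstInts(1234L, 5)));

        // One value that changes between runs, such as the system time,
        // is enough to select a different series on each run.
        long timeSeed = System.currentTimeMillis();
        System.out.println(Arrays.toString(firstInts(timeSeed, 5)));
    }
}
```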

     



  • public int random() {

        return 5; //chosen by a fair roll of the dice


     



  • More random randoms? Surely that's what uninitialized globals are for?



  • @vrykoul said:

    public int random() {

        return 5; //chosen by a fair roll of the dice


     



    If you're going to steal a joke, at least do it properly.


  • @PSWorx said:

    At the risk of introducing even more stupidity, this brings up something I've wondered for some time already: Couldn't you create a pretty good "true" random number generator using a kind of "controlled race condition"?
     

    Hey, stupidity is fun! Use the FM synth on the soundcard (if you can) to output white noise, sample that, and use the least significant bits. That is, provided the white noise is created by an analog circuit and not simulated digitally by an RNG or by looping the same white noise sample over and over...

    Of course, if you want quantum theory proof randomness delivered via Internet, there is nothing better than John Walker's hotbits at http://www.fourmilab.ch/hotbits/ .  



  • @KNY said:

    @vrykoul said:

    public int random() {

        return 5; //chosen by a fair roll of the dice




    If you're going to steal a joke, at least do it properly.
     

    And if you can't do that, at least make it a bizarre juxtaposition of in-references:

    public int random() {

        return 9; // that's the problem with randomness: you can never be sure.

    }

     

     



  • @PSWorx said:

    At the risk of introducing even more stupidity, this brings up something I've wondered for some time already: Couldn't you create a pretty good "true" random number generator using a kind of "controlled race condition"?

    Aside from the fact that this doesn't work, you're falling into the classic programmer trap of worrying about a specific solution instead of the general problem.

    Why do you need a "true" RNG?

    - For encryption?  Any decent crypto library can give you cryptographically secure randomness.
    - For statistical simulations?  No need, the pseudo-random sequences will give the same distribution as "true" randomness, that's the whole point.
    - For fudge-factoring?  Using the system time as your random seed is fine for this.

    I won't even get into the debate about "true" randomness being merely the result of complex chaos systems - suffice it to say that the question is not *is it* random, but *how* random is it.  If it's not possible for an outside observer to predict the next number in the sequence simply by studying the past numbers, then it's random enough for just about any practical purpose.

    Of course the "threads" version doesn't fit this criterion at all - it uses some more-or-less random factors in a way that will produce a highly predictable and uniform distribution around the mean of zero.  Plot millions of points from even the most highly uncorrelated data and you'll always end up with a Gaussian distribution.  That's why you're only supposed to randomize *once* - keep doing it and you'll actually end up with data that's *more* correlated than what you started with.



  • Sorry to revive this thread, but here's a nice WTF: 

    @Aaron said:

    Why do you need a "true" RNG?

    - For encryption?  Any decent crypto library can give you cryptographically secure randomness.
    - For statistical simulations?  No need, the pseudo-random sequences will give the same distribution as "true" randomness, that's the whole point.
    - For fudge-factoring?  Using the system time as your random seed is fine for this.

     

    Have you spotted the WTF?

    Using the system time as random seed means that if you know the day the random generator was run and the system time is accurate to 1/100 of a second,  an attacker only has to try at most 8 million possibilities before he finds the correct sequence. For security, this is worse than a 4-letter password.
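
    A minimal sketch of that search, under loudly stated assumptions: the victim seeds java.util.Random with the number of centiseconds since midnight, and the attacker has seen the first few outputs. TimeSeedSearch and its methods are hypothetical names, and the exact count is 24 * 60 * 60 * 100 = 8,640,000 candidates:

```java
import java.util.Random;

// Hypothetical scenario: the seed is the number of centiseconds since
// midnight, and the attacker knows the first few generated values.
final class TimeSeedSearch {
    static final int CENTISECONDS_PER_DAY = 24 * 60 * 60 * 100; // 8,640,000

    // The values a victim would produce from the given seed.
    static int[] outputs(long seed, int n) {
        Random rng = new Random(seed);
        int[] out = new int[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextInt();
        }
        return out;
    }

    // Tries every centisecond of the day and returns the first seed whose
    // outputs match the observed values, or -1 if none does.
    static long recoverSeed(int[] observed) {
        for (long candidate = 0; candidate < CENTISECONDS_PER_DAY; candidate++) {
            Random rng = new Random(candidate);
            boolean match = true;
            for (int i = 0; i < observed.length && match; i++) {
                match = rng.nextInt() == observed[i];
            }
            if (match) {
                return candidate;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long secret = 5_025_000L; // 13:57:30.00 as centiseconds since midnight
        int[] observed = outputs(secret, 4);
        System.out.println("recovered seed: " + recoverSeed(observed));
    }
}
```

    The whole search is a single loop over a few million candidates, which is the point: the seed is the only secret, and there are not many possible seeds.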



  •  So you modify it with something else, like say... the output of  /dev/random.



  • @mendel said:

    an attacker only has to try at most 8 million possibilities before he finds the correct sequence. For security, this is worse than a 4-letter password.
     

    bzzt... 4 letter password loses, by ever so slight a margin, assuming you're allowing mixed case. 26**4 = 456,976.   52**4 = 7,311,616. Close, but still not as good as 1:8,000,000



  • @mendel said:

    Have you spotted the WTF?
     

    That's why it's listed as a "fudge-factor" in the choices. If you're dealing with a system/process where there is "an attacker", malicious or not, then you use the crypto library option as the only possible choice.

    @mendel said:

    an attacker only has to try at most 8 million possibilities before he finds the correct sequence. For security, this is worse than a 4-letter password.
     

    bzzt. 4-letter password loses, by ever so slight a margin, if you're allowing mixed-case passwords. 52**4 = 7,311,616. Close, but not quite 1:8,000,000 



  • // Allow this to be instantiated in parallel in multiple threads
    public Randomizer() {
    }
     
    WTF does that comment even mean?  What does a constructor have to do with multiple threads?  Especially since the only thing this class uses is a static field and local variables.

    At least the class was declared as final.  That makes it extra secure!


  • @MarcB said:

    That's why it's listed as a "fudge-factor" in the choices. If you're dealing with a system/process where there is "an attacker", malicious or not, then you use the crypto library option as the only possible choice.
     

    I had mentioned Hot Bits earlier. Where does the crypto library get its seed from?

     @MarcB said:

    @mendel said:
    an attacker only has to try at most 8 million possibilities before he finds the correct sequence. For security, this is worse than a 4-letter password.
    bzzt. 4-letter password loses, by ever so slight a margin, if you're allowing mixed-case passwords. 52**4 = 7,311,616. Close, but not quite 1:8,000,000 

     

    I was allowing numbers as well, and some symbols people like to use (maybe - and .) for an even 64, which nets me 16M. It's a seat-of-pants estimate anyway (as if being less secure than a 5-letter password was better....), given that often you can determine the system time closer than that and that I don't really know how long a clock tick is. 

    For your continued entertainment, one could derive a random seed from the Time Stamp Counter. (Virtualdub.org : "The time stamp counter is a 64-bit counter that was added to most x86 CPUs starting around the Pentium era, and which counts up at the clock rate of the CPU. The TSC is generally readable via the RDTSC instruction from user mode, making it the fastest, easiest, and most precise time base available on modern machine.") Assuming that the TSC counts at 2 GHz and also assuming that many users start the program within 20 minutes of booting their computer, we're getting 7-letter-password security here (41 bit or thereabouts).

    If you do this on a server, its uptime becomes a security-critical piece of data.... 
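
    For anyone who wants to check the arithmetic in this exchange, a small sketch. KeyspaceMath and its methods are hypothetical names, and the 2 GHz and 20 minute figures are the assumptions stated above:

```java
final class KeyspaceMath {
    // Number of passwords of `length` characters over an alphabet of
    // `alphabetSize` symbols; throws ArithmeticException on overflow.
    static long keyspace(int alphabetSize, int length) {
        long total = 1;
        for (int i = 0; i < length; i++) {
            total = Math.multiplyExact(total, (long) alphabetSize);
        }
        return total;
    }

    // Equivalent strength in bits: log2(count).
    static double bits(double count) {
        return Math.log(count) / Math.log(2);
    }

    public static void main(String[] args) {
        System.out.println("centiseconds per day: " + 24L * 60 * 60 * 100);
        System.out.println("26^4 = " + keyspace(26, 4));
        System.out.println("52^4 = " + keyspace(52, 4));
        System.out.println("64^4 = " + keyspace(64, 4));

        // Assumed: a counter ticking at 2 GHz, read within 20 minutes of boot.
        double tscStates = 2e9 * 20 * 60;
        System.out.printf("TSC seed: %.1f bits%n", bits(tscStates));
        System.out.printf("52^7: %.1f bits%n", bits(keyspace(52, 7)));
        System.out.printf("64^7: %.1f bits%n", bits(keyspace(64, 7)));
    }
}
```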



  • @mendel said:

    For your continued entertainment, one could derive a random seed from the Time Stamp Counter. (Virtualdub.org : "The time stamp counter is a 64-bit counter that was added to most x86 CPUs starting around the Pentium era, and which counts up at the clock rate of the CPU. The TSC is generally readable via the RDTSC instruction from user mode, making it the fastest, easiest, and most precise time base available on modern machine.") Assuming that the TSC counts at 2 GHz and also assuming that many users start the program within 20 minutes of booting their computer, we're getting 7-letter-password security here (41 bit or thereabouts).

    If you do this on a server, its uptime becomes a security-critical piece of data....

    Well, not necessarily. Certain multi-core CPUs will keep that value in sync, but others won't. I don't think Intel has had this issue, but AMD has, which has caused a lot of slightly older games to require being locked to a specific core.

    That, as it happens, is also why using that value for timing purposes is discouraged these days.



  • @mendel said:

    Have you spotted the WTF?

    Using the system time as random seed means that if you know the day the random generator was run and the system time is accurate to 1/100 of a second,  an attacker only has to try at most 8 million possibilities before he finds the correct sequence. For security, this is worse than a 4-letter password.


    That's why he said you'd only use that for something that didn't have to be secure. Please read what he said.



  • @mendel said:

    Have you spotted the WTF?

    Using the system time as random seed means that if you know the day the random generator was run and the system time is accurate to 1/100 of a second,  an attacker only has to try at most 8 million possibilities before he finds the correct sequence. For security, this is worse than a 4-letter password.

    So I guess the answer is to improve security by having all of your computer clocks set to random wildly incorrect times. Either that, or encrypt the clock, I think we can do that. Actually, let's do both.


  • I really like how they are using a StringBuilder for adding the three string bits together. At least this code won't cause a performance or memory issue...



  • @mendel said:

    I was allowing numbers as well, and some symbols
     

    Well, then it's not a "4 letter" pw anymore. More like "4 character". But that's just quibbling. 

    @mendel said:

    Where does the crypto library get its seed from?
     

    Any crypto library that's not some CS101 class project toy will have its own method of slurping some bits from a large number of different sources, hashing them together somehow, and serving them up as a cryptographically secure source. And yes, the system time could very well be one of those sources. All depends on the library and how it's implemented.

    Quick and dirty solution is just to read the required number of bits from /dev/urandom and run with it. BSD and Linux (particularly OpenBSD) go to great pains to ensure that urandom is properly seeded and as truly random as can be made with a purely deterministic system. 
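
    In Java, the usual way to get the same thing without opening the device yourself is java.security.SecureRandom. A minimal sketch; CryptoSeed and randomBytes are hypothetical names, and which entropy source the default provider actually uses depends on the platform and JVM configuration, so treat any link to /dev/urandom as an assumption:

```java
import java.security.SecureRandom;

final class CryptoSeed {
    // Asks the platform's default SecureRandom for `n` random bytes.
    static byte[] randomBytes(int n) {
        byte[] out = new byte[n];
        new SecureRandom().nextBytes(out);
        return out;
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : randomBytes(16)) {
            hex.append(String.format("%02x", b));
        }
        System.out.println(hex);
    }
}
```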

     



  • @pscs said:

    @mendel said:
    Using the system time as random seed (...) For security, this is worse than a 4-letter password.
    That's why he said you'd only use that for something that didn't have to be secure. Please read what he said.
     

    Aaron mentioned fudge-factoring, and I honestly can't say (even after looking up "fudge-factor" and "fudge-factoring" - it doesn't help that this thread is the second google result for that) that I understand exactly what he means by that. The system time "solution" only works when you need a different random number run each time your program is run and you're 100% certain there exists no step two in the sequence of

    1. determine/predict random number sequence
    2. ????
    3. Profit!
    If there is, Murphy's Law assures someone else will find it.


  • @Aaron said:

    If it's not possible for an outside observer to predict the next number in the sequence simply by studying the past numbers, then it's random enough for just about any practical purpose.

    Do you mean a specific outside observer, or a random outside observer? Or how random should the outside observer be?



  • @swordfishBob said:

    @Aaron said:
    If it's not possible for an outside observer to predict the next number in the sequence simply by studying the past numbers, then it's random enough for just about any practical purpose.
    Do you mean a specific outside observer, or a random outside observer? Or how random should the outside observer be?
     

    Well, obviously you need to do a test run with a ROG. The ROG output should be uniformly distributed across all possible birthdates, Gaussian with regard to IQ and statistically representative regarding education. You offer each random observer a prize (one cool million dollars should suffice) for predicting the next number in the sequence. This is called the Monte Carlo method. If you can predict which random observer is going to win the prize, obviously your ROG is flawed. (That's the case in James Bond movies). If you can prove that no RO will win the prize, you might think that then of course the ROG would not be needed; but philosophically you have merely substituted the ROG test run with a test of one nonrandom observer, namely yourself; and since you didn't get one cool million for your effort, your result is going to be seriously in doubt. Some people think that India is stepping up its RO output, with its population growing at twice the rate of the US and most European countries, taking the lead in ROG technology; however, looking at the figures, Liberia's RO output seems even higher, leading outside observers to speculate that terrorists are preparing an attack on Western RNGs from Liberian soil. The UN should immediately send inspectors to Liberia to determine if WMOs are being produced in that country, and destroy any Weapons of Mass Observation it can find, preferably by blowing them sky-high.

    This random post was brought to you by RANDOM: Republicans Against National Defense Observation Mathematics. 



  • @Jonathan Holland said:

    Ah, India....

    Reinventing methods poorly since 1999.

    I like how it doesn't even get a seed.

    What's "seed", precious? 



  • @mendel said:

     For security, this is worse than a 4-letter password.

     

     

     @mendel said:

    I was allowing numbers as well, and some symbols people like to use (maybe - and .)

     

     

    That wouldn't be a 4-letter password then, would it?



  • @MoeDrippins said:

    @mendel said:

     For security, this is worse than a 4-letter password.

     

     

     @mendel said:

    I was allowing numbers as well, and some symbols people like to use (maybe - and .)

     

    That wouldn't be a 4-letter password then, would it?

     

    Right, thanks for reading the other post that said the same thing already.

    @MarcB said:

    Well, then it's not a "4 letter" pw anymore. More like "4 character". But that's just quibbling. 




  • If you want random data, just have the user mash the keyboard -- entropy guaranteed. 



  • @DigitalXeron said:

    If you want random data, just have the user mash the keyboard -- entropy guaranteed. 

     Or just send a random number of character codes to a hidden window for x number of characters, add up the sum, and mod your range...



  • @DigitalXeron said:

    If you want random data, just have the user mash the keyboard -- entropy guaranteed.

    Or slurp some data from forums, they're full of users mashing their keyboards.

    Here, have some entropy:
    055a90ubgm8oaräzxdddpaf8b9a4u0zvmlädy s7967qq7q



  • @DigitalXeron said:

    If you want random data, just have the user mash the keyboard -- entropy guaranteed. 

     

    I've seen crypto apps (e.g. PuTTY) that ask you to move your mouse "randomly" to generate entropy, when they need to create a key.



  • @CodeSimian said:

    I've seen crypto apps (e.g. PuTTY) that ask you to move your mouse "randomly" to generate entropy, when they need to create a key.

    Truecrypt does that, too (along with other sources). 

    This is the reason why Michael J Fox is recognised as being a security guru.

