Windows event log viewer



  • @Sutherlands said:

    @TheCPUWizard said:

    My intent was to indicate that it applied when the set of numbers starts at 1 (or 0), increments by 1, and has a uniform distribution.

    That's still not what it applies to.  The distribution is nothing alike.

    Of course it is. If you think I am wrong, explain how "1" could possibly be more common in numbers, if all of the numbers in question are:

        * Multiples of 5
    or
        * Non-Prime

    The increased probability of "1" (and other low digits relative to higher digits) applies when the distribution is even, but the range of numbers may be limited such that not all possible n-digit numbers are in the set, but a contiguous range is.



  • @TheCPUWizard said:

    @Sutherlands said:

    @TheCPUWizard said:

    My intent was to indicate that it applied when the set of numbers starts at 1 (or 0), increments by 1, and has a uniform distribution.

    That's still not what it applies to.  The distribution is nothing alike.

    Of course it is. If you think I am wrong, explain how "1" could possibly be more common in numbers, if all of the numbers in question are:

        * Multiples of 5
    or
        * Non-Prime

    The increased probability of "1" (and other low digits relative to higher digits) applies when the distribution is even, but the range of numbers *may* be limited such that not all possible n-digit numbers are in the set, but a contiguous range is.

    learn2read?  First off, I'm not arguing what you're trying to pin me on.  Second, {10, 15, 20} (just for a quick disproof, although this really doesn't have to do with anything).

    Have you looked at the distribution curve for Benford's law?  It's 30%, 18%, 13%, 10%, 8%... etc for the respective numbers.  Find me one counting sequence that follows that distribution.  Benford's law is for natural sequences, and is best expressed over multiple orders of magnitude.  Things like... finding possible fraud on income statements.  It has much more to do with exponential sequences such as growth due to interest than any incremental distribution.
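
    To put numbers on the comparison above, here is a minimal sketch. It assumes the standard first-digit form of Benford's law, P(d) = log10(1 + 1/d), and compares it with two illustrative data sets that are not from the thread: the counting sequence 1..99 and a hypothetical 5% compound-growth series. The function names are made up for this example.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1..9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Leading decimal digit of a positive number."""
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def leading_digit_shares(values):
    """Observed share of each leading digit 1..9 in an iterable of positive numbers."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


expected = benford_expected()
# Counting sequence: every leading digit is equally common.
counting = leading_digit_shares(range(1, 100))
# Hypothetical compound growth at 5% per step, spanning many orders of magnitude.
growth = leading_digit_shares(1.05 ** n for n in range(1000))

print("digit  benford  counting  growth")
for d in range(1, 10):
    print(f"{d:>5}  {expected[d]:7.1%}  {counting[d]:8.1%}  {growth[d]:6.1%}")
```

    The counting column comes out flat, while the growth column should track the Benford column closely, which is the distinction being argued here.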



  • @Sutherlands said:

    learn2read? First off, I'm not arguing what you're trying to pin me on. Second, {10, 15, 20} (just for a quick disproof, although this really doesn't have to do with anything).

    Have you looked at the distribution curve for Benford's law? It's 30%, 18%, 13%, 10%, 8%... etc for the respective numbers. Find me one counting sequence that follows that distribution. Benford's law is for natural sequences, and is best expressed over multiple orders of magnitude. Things like... finding possible fraud on income statements. It has much more to do with exponential sequences such as growth due to interest than any incremental distribution.

    I am not trying to pin anything on you, and I have already admitted the use of the mathematical term "counting sequence" was incorrect on my part. MY point was that one must be careful what it is applied to (for example, with the number set I provided, you get a distribution of

    50.00% 14.29% 14.49% 14.49% 14.49% 50.00% 14.49% 14.49% 0.00% 0.00%



  • @TheCPUWizard said:

    In the USA these range from 5mph to 75mph...

    Texas has a few roads with 80mph speed limits. And they have the statutory ability to set 85mph limits, the highway department just has yet to do so.



  • @TheCPUWizard said:

    (for example, with the number set I provided, you get a distribution of

    50.00% 14.29% 14.49% 14.49% 14.49% 50.00% 14.49% 14.49% 0.00% 0.00%

    I'm confused... you're saying this DOES follow Benford's law, which has a distribution of 30%, 18%, 13%, 10%, 8%...?


  • @Sutherlands said:

    @TheCPUWizard said:

    (for example, with the number set I provided, you get a distribution of

    50.00% 14.29% 14.49% 14.49% 14.49% 50.00% 14.49% 14.49% 0.00% 0.00%

    I'm confused... you're saying this DOES follow Benford's law, which has a distribution of 30%, 18%, 13%, 10%, 8%...?

    I am saying that they do NOT follow Benford's law (because it is an attempt to inappropriately apply the law to this set of numbers). 50% of all numbers in this set will contain a "0". 50% will contain a "5". 14.49% will contain (each) a "1,2,3,4,5,6,7" and none (Morbius's Texas comment being ignored) will contain an "8" or "9".

    Therefore if one analyzed the data, found that it did not follow the distribution of Benford's law, and therefore treated it as "fraudulent data" they would be totally off-base.



  • @morbiuswilters said:

    Texas has a few roads with 80mph speed limits. And they have the statutory ability to set 85mph limits, the highway department just has yet to do so.

    I think they just got rid of the night time speed limits too.



  • @TheCPUWizard said:

    I am not trying to pin anything on you, and I have already admitted the use of the mathematical term "counting sequence" was incorrect on my part. MY point was that one must be careful what it is applied to (for example, with the number set I provided, you get a distribution of

    50.00% 14.29% 14.49% 14.49% 14.49% 50.00% 14.49% 14.49% 0.00% 0.00%

    That's not even the correct distribution (assuming you are applying Benford's law to your speed limit example). You've shown the probability that a number contains a certain digit anywhere in the number, while Benford's law is (primarily) concerned with the leading digit.



  • @barrabus said:

    @TheCPUWizard said:

    I am not trying to pin anything on you, and I have already admitted the use of the mathematical term "counting sequence" was incorrect on my part. MY point was that one must be careful what it is applied to (for example, with the number set I provided, you get a distribution of

    50.00% 14.29% 14.49% 14.49% 14.49% 50.00% 14.49% 14.49% 0.00% 0.00%

    That's not even the correct distribution (assuming you are applying Benford's law to your speed limit example). You've shown the probability that a number contains a certain digit anywhere in the number, while Benford's law is (primarily) concerned with the leading digit.

    What was previously discussed was that Benford's law applies to all digits, but in the general case to a diminishing degree as one gets further away from the leading digit. What I have been illustrating is that application of this law may not be appropriate if there are constraints (known or unknown) on the underlying data. In this particular case, the fact that all numbers are multiples of 5, and that there is an upper bound, skews the probabilities for all digit positions, and even eliminates any differentiation for the first digit provided it is between 1 and 7 (8 and 9 will never occur).

    The application of any statistical (using the term loosely) analysis in a manner that does not take into account the underlying data is probably (grin) the single largest abuse, and responsible for much of the "numerical garbage" that floats around society.
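
    The two measures being argued over (leading digit versus digit anywhere in the number) can be tabulated directly. This is a rough sketch that assumes the speed limit set is every multiple of 5 from 5 to 75 mph, as described earlier in the thread; the exact percentages depend on which set is assumed, so they need not match the figures quoted above. The helper names are invented for this example.

```python
from collections import Counter

# Assumption: posted limits are the multiples of 5 from 5 to 75 mph inclusive.
speed_limits = list(range(5, 80, 5))


def leading_digit_share(values):
    """Share of values whose first digit is d, for d in 1..9."""
    counts = Counter(str(v)[0] for v in values)
    return {d: counts.get(str(d), 0) / len(values) for d in range(1, 10)}


def contains_digit_share(values):
    """Share of values that contain digit d in any position, for d in 0..9."""
    return {
        d: sum(str(d) in str(v) for v in values) / len(values)
        for d in range(10)
    }


leading = leading_digit_share(speed_limits)
anywhere = contains_digit_share(speed_limits)

print("digit  leading  anywhere")
for d in range(10):
    # A leading zero cannot occur, so that cell is marked n/a.
    lead = f"{leading[d]:7.1%}" if d else "    n/a"
    print(f"{d:>5}  {lead}  {anywhere[d]:8.1%}")
```

    Under that assumption the leading digits 1 through 7 come out nearly flat and 8 and 9 never appear, so neither column resembles the Benford curve.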

