Math formula wizardry



  • I need some math-related help with a converter function.

    Input is an integer on a scale from 0 to infinity. Output is a float in range 0 to 10.

    I'm provided a map, similar to this:

    0-100 -> 0.x
    100-500 -> 1.x
    ...
    800000-999999 -> 9.x
    1000000+ -> 10
    

    Is there some mathematical methodology I can use to turn this into a concise formula?

    Without guidance, I'll just create a series of if conditions and do the linear scale in between these fixed points (or generalize in some way, using an array with ranges, for example).

    But that stinks. There's got to be a better way to solve this.

    Come on academia math nerds. This is your moment to shine! :-)



  • Where exactly does the "x" come from? Also, the steps in-between 1.x and 9.x are not exactly unimportant to know.



  • That's what he's asking!



  • Nope, not what he's asking. He wants a formula but omits some data points which would make it clear how the intervals are scaled.

    Also, if the calculation method for the part after the decimal separator changes depending on the interval he has no choice other than using a series of if conditions.


  • sockdevs

    @cartman82 said:

    0-100 -> 0.x
    100-500 -> 1.x
    ...
    800000-999999 -> 9.x
    1000000+ -> 10

    def Mathersize(value):
        return value / 100000 if value < 1100000 else 10
    

    EDIT: waaaaaaait.......

    no that missies 100-500....

    i think i need the whole sequence for this one.....



  • @accalia said:

    ```
    def Mathersize(value):
    return value / 100000 if value < 1100000 else 10

    
    Value 499 would yield 0.0049 using that formula. Should be 1.9975 assuming linear scaling for the "modulo-alike"


  • Unless there's a fairly simple equation f(x) that describes y = f(x) (which it doesn't look like), you're probably better off just using a series of ifs.

    If I remember correctly (and it's been a long time since I've done even a small amount of work like this), any set of points (x,y) can be approximately described by a combination of sine functions (see https://en.wikipedia.org/wiki/Fourier_transform). The "noisier" the data is, though, the more complicated the function becomes. My guess is that for this simple case, the transform would be far more complicated than you'd want to deal with.


  • sockdevs

    @Rhywden said:

    Value 499 would yield 0.004

    yes i noticed that.....

    too late.... hmmm



  • @Dragnslcr said:

    Unless there's a fairly simple equation f(x) that describes y = f(x) (which it doesn't look like), you're probably better off just using a series of ifs.

    If I remember correctly (and it's been a long time since I've done even a small amount of work like this), any set of points (x,y) can be approximately described by a combination of sine functions (see https://en.wikipedia.org/wiki/Fourier_transform). The "noisier" the data is, though, the more complicated the function becomes. My guess is that for this simple case, the transform would be far more complicated than you'd want to deal with.

    Also, the performance would probably suck - a bunch of sine calculations versus a simple elsif?



  • Jesus you're dense. He doesn't know what the scale should be, let alone a formula for it. Which is why he's asking.



  • @Dragnslcr said:

    If I remember correctly (and it's been a long time since I've done even a small amount of work like this), any set of points (x,y) can be approximately described by a combination of sine functions (see https://en.wikipedia.org/wiki/Fourier_transform).

    Or just interpolate through the points, but I'm not sure there's a point (heh, heh) to that.

    The reasonable thing would probably be to take an index in the map as the integral part, and get the fractional part via linear scale for the given part. Which, yeah, boils down to range checks, unless there's some smarter way of finding which range the point lies in that I'm forgetting.



  • @Captain said:

    Jesus you're dense. He doesn't know what the scale should be, let alone a formula for it. Which is why he's asking.

    If he doesn't know the scale how in the hell are we supposed to make a formula out of it? :expressionless:



  • I think I'd probably use a linear fractional transformation to do what you want. I'll post details later if you can't get it. It takes two divisions and some addition.



  • :facepalm:
    That's what he's asking



  • Y'know, sometimes people are so full of themselves that they don't notice that they're full of shit.



  • I'm confused, is there actually a formula for these, or just each range has its own format?

    If the former, you need to go to whoever delivered these examples and ask what formula was used to derive them. If the latter, I don't think you can avoid an if() or switch() statement. (Well, in a non JS language, you could create like a Dictionary<Tuple<min_int, max_int>, format_string> and loop through that. Probably more effort than it's worth in JS, though.)

    EDIT: awesome Discourse-ism: I can't type a < without typing &<test>lt;, so I do, but then when I hit the "preformatted text" suddenly all my < turn back into &<test>lt; in the output.

    It also appears to be impossible to escape < without the escaping slash appear in the output. Oh wait I could put a HTML tag in it: &<test>lt; awesome. I love Markdown.



  • What I got out of the question is that he doesn't have a formula, and he doesn't care about the scale too much, as long as the various domains get mapped into the various ranges monotonically, and the new scale doesn't go bigger than 10.

    @cartman82: am I wrong here? People are being poopieheads and saying I'm wrong.



  • @Captain said:

    That's what he's asking

    Assuming you're correct, at the very least we still need to know what "x" is in the top three example rows. Because I can't work out what 1.x would end up being if the value is, say, 350. Is it supposed to display 1.350?



  • Not if he doesn't care (too much).



  • @Captain said:

    What I got out of the question is that he doesn't have a formula,

    Right; but someone does (this mysterious person provided cartman82 with the mapping examples, remember? Go read the OP again) so I don't know why we're being asked to find the formula instead of the one person in this scenario who already knows the formula. That's my confusion.

    "Hey there's this guy Bob, he asked me to do a thing. I don't know how to do it, but I know Bob knows how to do it. So I'm going to ask everybody except Bob how to do it."

    Which means I think my second interpretation of the question is the more correct one. In which case, I'd just write a series of if()s.


  • mod

    @Captain said:

    he doesn't care about the scale too much, as long as the various domains get mapped into the various ranges monotonically

    To evenly transform the domain 0-1000000 into 0-10, you just do:

    f(x) {
        return x/100000
    }
    

    To handle numbers greater than 1000000, you add to that an if beforehand, saying if the number is greater than the range, cap it to 1000000.

    But that doesn't match the map given in the OP. The map given in the OP is an uneven range. So you'd need a set of formula specific to the buckets you want to trasform. Something like:

    f(x) {
        if (x < 100) return x/100; //transform 0-100 into 0-1
        if (100 < x < 500) return 1+(x/500) //transform 100-500 into 1-2
        [...]
        if (800000 < x < 999999) return 9+(x/999999)
    }
    

    (Math might not be correct)



  • @Yamikuronue said:

    if (100 < x < 500) return 1+(x/500) //transform 100-500 into 1-2

    Which is a perfect demonstration as to why this problem is unsolvable without defining "x" in the example. I assumed it for a value of 350, the output should be 1.350. Like in a version number. But you assume it should be 1.7.

    Another possibility would be to use this formula:

    1 + ( (x - 100) / 400 )

    Which would be 1.625.



  • @Yamikuronue said:

    if (100 < x < 500) return 1+(x/500) //transform 100-500 into 1-2

    f(98) = 0.98
    f(99) = 0.99
    f(100) = undefined
    f(101) = 1.202
    f(102) = 1.204

    Yeah, that doesn't look right at all.


  • sockdevs

    @Yamikuronue said:

    @Captain said:
    he doesn't care about the scale too much, as long as the various domains get mapped into the various ranges monotonically

    To evenly transform the domain 0-1000000 into 0-10, you just do:

    f(x) {
        return x/100000
    }
    

    To handle numbers greater than 1000000, you add to that an if beforehand, saying if the number is greater than the range, cap it to 1000000.

    But that doesn't match the map given in the OP. The map given in the OP is an uneven range. So you'd need a set of formula specific to the buckets you want to trasform. Something like:

    f(x) {
        if (x < 100) return x/100; //transform 0-100 into 0-1
        if (100 < x < 500) return 1+(x/500) //transform 100-500 into 1-2
        [...]
        if (800000 < x < 999999) return 9+(x/999999)
    }
    

    (Math might not be correct)

    wolfram alpha is of limited help. but given more sample stat points i'm sure i can craft a forumla to calculate this....

    i eventually got to plot (log10(x/111111 + 1)*10) for x=0 through x=100000 which matches the end points but is still wildly off in the middle bits


  • mod

    @Maciejasjmj said:

    that doesn't look right at all.

    But it matches his criteria:

    @cartman82 said:

    0-100 -> 0.x
    100-500 -> 1.x

    E_NEEDS_MORE_DEFINITION





  • @Maciejasjmj said:

    Yeah, that doesn't look right at all.

    As long as we don't know what "x" means, the only value that doesn't match the spec is f(100). And that's just because Yami accidentally used < 100 instead of <= 100, a simple typo.

    So, it looks as "right" as anything else.


  • mod

    @blakeyrat said:

    I assumed it for a value of 350, the output should be 1.350. Like in a version number

    If that happens, you get a gap between 1.500 and... well, there's no next category shown, but assuming it starts with 2.0, you lose half your 1.x decimals.



  • @Yamikuronue said:

    But it matches his criteria:

    Technically correct is the best kind of correct?

    So does

    f(x)
    {
       if (x <= 100) return 0.42
       if (x <= 500) return 1.42
       //etc
    }
    

    but I assume he'd rather have a more monotonous relation.



  • I guess he does specifically say it's a "linear scale". Whatever. I still move the problem's unsolvable as presented.





  • Yeah, but that's just a series of ifs (hardcoded or looped over, doesn't matter), and @cartman82 doesn't want that for... reasons.

    I guess you could try finding an interpolating polynomial for your set, but that's probably much more hassle than it's worth.



  • For a function like that (linear interpolations), no, there is no way to express it that's simpler than a bunch of ifs.

    If any interpolation is OK, then there are ways, for example polynomial interpolation. However those sometimes have wildly unexpected variations. You'd probably something else that's guaranteed to be monotonically increasing.

    I have no idea how to do that, but I can tell you that it will be 100 times faster and easier to understand to use ifs.



  • Wow, this exploded while I was driving home.

    @Rhywden said:

    Nope, not what he's asking. He wants a formula but omits some data points which would make it clear how the intervals are scaled.

    Also, if the calculation method for the part after the decimal separator changes depending on the interval he has no choice other than using a series of if conditions.

    @accalia said:

    i think i need the whole sequence for this one.....

    The ranges are provided by the client. As far as I can tell, he pulled the numbers out of his ass.

    So, I'm sort of hoping for a methodology for finding the formula, more than the formula itself.
    Ideally, an app where I can plug in those ranges and get the formula out.

    @Captain said:

    Jesus you're dense. He doesn't know what the scale should be, let alone a formula for it. Which is why he's asking.

    This.

    @Captain said:

    I think I'd probably use a linear fractional transformation to do what you want. I'll post details later if you can't get it. It takes two divisions and some addition.

    This was the sort of answer I was hoping for. Then I clicked on the link and didn't understand a single word in that article.

    if-s it is.

    @blakeyrat said:

    If the former, you need to go to whoever delivered these examples and ask what formula was used to derive them. If the latter, I don't think you can avoid an if() or switch() statement. (Well, in a non JS language, you could create like a Dictionary<Tuple<min_int, max_int>, format_string> and loop through that. Probably more effort than it's worth in JS, though.)

    PHP. But yeah, that's my fallback solution. Might even be better than some magical formula.

    I was just wondering what would the math guys do in this situation.

    @Captain said:

    What I got out of the question is that he doesn't have a formula, and he doesn't care about the scale too much, as long as the various domains get mapped into the various ranges monotonically, and the new scale doesn't go bigger than 10.

    @cartman82: am I wrong here? People are being poopieheads and saying I'm wrong.

    You are correct.

    @blakeyrat said:

    Right; but someone does (this mysterious person provided cartman82 with the mapping examples, remember? Go read the OP again) so I don't know why we're being asked to find the formula instead of the one person in this scenario who already knows the formula. That's my confusion.

    He doesn't know the formula. He just gave me a scale like I posted in the OP.

    I can tell it's some kind of logarithmic graph. But I don't think the client really sweated out figuring out a formula, then generated 10 significant points and gave me those. There's no "original formula", is what I'm saying.



  • any math tool can get you an interpolating polynomial from a bunch of points.
    the problem it's not the hassle, it is the perfromance of the calculation



  • That actually seems like the most straightforward solution.
    I should probably just do that and forget any kind of fancy formula.



  • @cartman82 said:

    He doesn't know the formula. He just gave me a scale like I posted in the OP.

    You still didn't explain what goes where x is!

    Feh.


  • sockdevs

    @cartman82 said:

    The ranges are provided by the client. As far as I can tell, he pulled the numbers out of his ass.

    That's fine, but i assume he gave you more data points than that. if i had more data points i could do a better formula fit



  • @blakeyrat said:

    You still didn't explain what goes where x is!

    Feh.

    x is the number of events of certain kind. Result is a score on a 0-10 scale describing the frequency of those events.

    @accalia said:

    That's fine, but i assume he gave you more data points than that. if i had more data points i could do a better formula fit

    I intentionally didn't include the real scale.

    As I said, I'm looking for methodology, not a spoon-fed solution.

    I mean, I can't open up a new thread every time they want to change the numbers or add a new scale.


  • sockdevs

    @cartman82 said:

    As I said, I'm looking for methodology, not a spoon-fed solution.

    plug the scale into wolfram alpha and try some fits on for size?

    linear fit {{1,1}{2,4}{3,9}{4,16}}
    quadratic fit {{1,1}{2,4}{3,9}{4,16}}
    cubic fit {{1,1}{2,4}{3,9}{4,16}}
    exponential fit {{1,1}{2,4}{3,9}{4,16}}
    log fit {{1,1}{2,4}{3,9}{4,16}}


  • sockdevs

    @accalia said:

    quadratic fit {{1,1}{2,4}{3,9}{4,16}}

    though to be fair it did come up with x^2-3.35333×10^-15 x+6.62894×10^-15 as the best quadratic best fit for that sequence instead of the exactly correct x^2



  • @cartman82 said:

    x is the number of events of certain kind. Result is a score on a 0-10 scale describing the frequency of those events.

    ?

    So when your map says:

    800000-999999 -> 9.x

    And the value is 850000 (assuming that is the number of events), the result should be 9.850000? That would mean my first guess about treating "x" like a version number was correct, even though I thought that was the least likely to be correct.

    I'm sure at this point Boomzilla is going to burst in here and tell me what an idiot I am for not using my telepathic powers to instantly understand this.



  • x is the number of events of certain kind. Result is a score on a 0-10 scale describing the frequency of those events.

    That's a weird data structure to encode with a single real number, since it's a pair of numbers.

    More on Mobius transformations, if you want something fancy. Do an interpolation on a few of the points. The algorithm will yield a formula that uses two additions and a division to transform the input. Use that formula in your program.


  • Winner of the 2016 Presidential Election

    @cartman82 said:

    As I said, I'm looking for methodology, not a spoon-fed solution.

    You can perform a linear interpolation or a cubic spline interpolation, depending on what kind of properties the resulting function should have. Or do something way more fancy, but I assume you want something simple.



  • @accalia said:

    plug the scale into wolfram alpha and try some fits on for size?

    linear fit {{1,1}{2,4}{3,9}{4,16}}quadratic fit {{1,1}{2,4}{3,9}{4,16}}cubic fit {{1,1}{2,4}{3,9}{4,16}}exponential fit {{1,1}{2,4}{3,9}{4,16}}log fit {{1,1}{2,4}{3,9}{4,16}}

    That's interesting. Didn't occur me to try Wolfram Alpha.

    Unfortunately, it breaks when I enter all my points. But if I pick just a few of them, I can get a simple formula that roughly matches my ranges.


  • Winner of the 2016 Presidential Election

    Try cubic spline interpolation, it might be what you want. (I'm pretty sure Wolfram Alpha can do that for you.)

    In case you've never heard of splines: You basically find cubic functions (instead of linear functions) for each interval, so you get a "smooth" piecewise-defined function.


  • sockdevs

    @cartman82 said:

    Unfortunately, it breaks when I enter all my points.

    a common symptom of ass pull ranges. ;-)


  • Discourse touched me in a no-no place

    @blakeyrat said:

    Because I can't work out what 1.x would end up being if the value is, say, 350. Is it supposed to display 1.350?

    My guess would be it's a linear scale within each range: 100 -> 1.0, 300 -> 1.5, 500 -> 1.99999 or 2.0, depending. We'll have to wait to see if he has an answer (assuming I haven't been :hanzo:d, of course).



  • @cartman82 said:

    The ranges are provided by the client. As far as I can tell, he pulled the numbers out of his ass.

    So, I'm sort of hoping for a methodology for finding the formula, more than the formula itself.Ideally, an app where I can plug in those ranges and get the formula out.

    If linear interpolation is okay, then you probably want something like this. Assuming that input_map and output_map are provided by the client somewhere, and you don't know what they are at compile time...

    var input_map = [0, 100, 500, 800000, 1000000],
       output_map = [0,   1,   2,      9,      10];
    
    // 300 is halfway between 100 and 500; the map puts that halfway between 1 and 2
    map_value(300, input_map, output_map); // returns 1.5
    
    
    function map_value(i, input_map, output_map) {
      if (input_map.length != output_map.length) {
        throw "map lengths must be equal";
      }
    
      var r, j = 0;
      while (j + 1 < input_map.length && input_map[j + 1] <= i) {
        if (input_map[j] >= input_map[j + 1]) {
          throw "input map must be strictly increasing";
        }
        ++ j;
      }
    
      if (!j && i <= input_map[j] || j + 1 >= input_map.length) {
        // i is out of range; return the nearest endpoint from output_map
        r = output_map[j];
      } else {
        // i is within range; apply the standard formula for linear interpolation
        r = output_map[j] + (i - input_map[j])
                              / (input_map[j + 1] - input_map[j])
                              * (output_map[j + 1] - output_map[j]);
      }
    
      return r;
    }
    


  • It seems a bunch of people here are wearing complicator's gloves. Piece-wise linear interpolation using ifs is simple, easy to understand, easy to maintain, and performant. Mobius transformations appear to be performant, but the intent of the code is probably much less apparent to the reader, thus less maintainable. Splines may give a nice, smooth curve, but that is not a requirement of the original spec as presented in the OP, nor any subsequent posts. Also, depending on the type of spline, they may have nasty behavior of not actually passing through the defining control points, or of not being monotonic between them.


Log in to reply
 

Looks like your connection to What the Daily WTF? was lost, please wait while we try to reconnect.