Distance



  • Some of parts of our system have to determine the distance between two points based on the curvature of the Earth. Here is the representation of a lat/long point on the surface of the Earth and the algorithm (as it is) for calculating that distance.

    public class Point extends Object implements Serializable {
    

    private float latitude = 0;
    private float longitude = 0;

    // omitted obvious getters and setters

    public float distance(Point other) {
    float latStart = 0f;
    float latEnd = 0f;
    float lonStart = 0f;
    float lonEnd = 0f;
    double distance = 0f;

    if ( other != null ) {
      latStart = (float) java.lang.StrictMath.toRadians(getLatitude());
      latEnd = (float) java.lang.StrictMath.toRadians(other.getLatitude());
      lonStart = (float) java.lang.StrictMath.toRadians(getLongitude());
      lonEnd = (float) java.lang.StrictMath.toRadians(other.getLongitude());
      // value of 3440 is used as the radius of the Earth in nautical miles
      distance = 3440f * java.lang.StrictMath.acos(( java.lang.StrictMath.sin(latStart) * java.lang.StrictMath.sin(latEnd)) +
                         (java.lang.StrictMath.cos(latStart) * java.lang.StrictMath.cos(latEnd) *
                         java.lang.StrictMath.cos(lonEnd - lonStart)));
    
    }
    return (float) distance;
    

    }
    }

    The WTFs I see:

    • Downcasting to float at every calculation losing precision everywhere.
    • Checking for other's null-ness after initializing variables (old C-coders built this system.)
    • Why the hell didn't you just import java.lang.StrictMath for clarity?
    • Minor, 3440f as a magic number, explained in comments, but not a 'named constant'

    I don't know about the trig, but it seems to me that downcasting doubles to floats throughout this calculation will lead to spurious results, eh? As a matter of fact, our software creates new records in the database when the distance is greater than some threshold. Apparently we have some 3,000,000 extra records because this routine returns NaN (yay, strict math?) when the points are REALLY close to one another and, of course, the programmer didn't expect that and therefore the program doesn't know how to deal with NaN ... Suffice it to say, the comparison fails, and new records are created.

    I won't share the 'parseLatitude(String)' and 'parseLongitude(String)' methods as they are hideously long a frought with peril.

    (Note: I had to transcribe this by hand, so errors and/or typos may have been introduced not related to the overall WTF-iness.)



  • @zelmak said:

    Why the hell didn't you just import java.lang.StrictMath for clarity?

    If it's in the java.lang package, you don't even need to import it — you can just use the name StrictMath as is, as long as there's no other class with that name floating around.


  • Discourse touched me in a no-no place

    Almost but not quite entirely unlike Haversine then?



  • @zelmak said:

    As a matter of fact, our software creates new records in the database when the distance is greater than some threshold.

    Why don't you just put the two geo-points in the DB and have it do the distance calculation?



  • @blakeyrat said:

    @zelmak said:
    As a matter of fact, our software creates new records in the database when the distance is greater than some threshold.

    Why don't you just put the two geo-points in the DB and have it do the distance calculation?

     

    Because some people think that spatial is special.



  • @blakeyrat said:

    @zelmak said:
    As a matter of fact, our software creates new records in the database when the distance is greater than some threshold.
    Why don't you just put the two geo-points in the DB and have it do the distance calculation?

    If the DB is doing the distance calculations then you need to store all of the point pairs rather than just the ones over a threshold.  I'd guess that most of the pairs are under the threshold based on how zelmak was describing it, but you may be right about that being the right way to do it.



  • @locallunatic said:

    If the DB is doing the distance calculations then you need to store all of the point pairs rather than just the ones over a threshold.  I'd guess that most of the pairs are under the threshold based on how zelmak was describing it, but you may be right about that being the right way to do it.

    Well you don't necessarily have to store the points to have the DB calculate the distance, although it is admittedly slightly WTF-y to compose a SQL query purely to run some math.

    Then again, since this team demonstrably can't do the math themselves, outsource it to someone who can.



  • @PJH said:

    Almost but not quite entirely unlike Haversine then?

    Indeed, this is the spherical law of cosines which is known to have computational issues when the distance is small and the computational precision is limited.  Haversine would be much better...



  • Here, just in case some moron googling for this, tries to use that filthy code

    public static Double getDistance(
        		final Double l1, final Double g1, 
        		final Double l2, final Double g2 ) {
    
        final Double theta =  g1 - g2;
        final Double d1 = l1 * Math.PI / 180.0;
        final Double d2 = l2 * Math.PI / 180.0;
        final Double d3 = theta * Math.PI / 180.0;
        Double dist = Math.sin(d1) * Math.sin(d2) + Math.cos(d1) * Math.cos(d2) * Math.cos(d3);
        // distance in km
        dist = (Math.acos(dist) * 6356.752);
        
        if(Double.isNaN(dist)){
            dist = 0.0;
        }
        return dist;
    }
    



  • @blakeyrat said:

    @locallunatic said:
    If the DB is doing the distance calculations then you need to store all of the point pairs rather than just the ones over a threshold.  I'd guess that most of the pairs are under the threshold based on how zelmak was describing it, but you may be right about that being the right way to do it.
    Well you don't necessarily have to store the points to have the DB calculate the distance, although it is admittedly slightly WTF-y to compose a SQL query purely to run some math.

    Then again, since this team demonstrably can't do the math themselves, outsource it to someone who can.

    You are kidding, right? The original design of this system (from the bits and bobs I can extract from code comments and sporadic CVS check in logs - editorial note: "check-in to fix ticket #12345" is NOT an adequate commit log entry - the documentation of design is long-lost) used flat files then moved to Oracle. As you can guess, Oracle just stores stuff; it may as well be a file system. The code queries the database for stuff in huge swaths of SELECTs and JOINs and sub-SELECTs and WHERE-IN clauses. The results of these queries are iterated by our Java code only to discard records which don't meet further criteria with filters and what they call 'rules.'

    I would have to estimate that 90-95% of the work is done in the Java code; there are some stored procedures, but mostly those are auditing TRIGGERs which timestamp records with date/time changed and by username.

    Foreign keys are often ignored; referential integrity is near zero because of what constraints are available, most are disabled due to performance conditions (who knew it would add 5 seconds to an insert query when you enabled a UNIQUE constraint on the 'natural' but not primary key? Without it, we were storing the same record numerous times.) The list goes on. Surprisingly, the app works pretty well, even with all the WTFness in/around the system.

    I should get permission to write the whole debacle up as a case-study in how things can go wrong ...

     



  • Yeah, Math.acos tends to return NaN for values resulting from rounding errors like 1.000001 (points next to each other) or -1.000001 (points on different sides of the earth). The cleanest way to fix that formula is to save the expression inside the acos into a separate variable and clamp it to -1 .. 1 interval before applying acos on it. And it does not matter if you use StrictMath or Math - both will lose precision, it is just that StrictMath's loss of precision is platform independent, at the expense of being less precise and slower.



  • @zelmak said:

    The WTFs I see:

    • Downcasting to float at every calculation losing precision everywhere.
    Using StrictMath even though the result is then cast to float. If you're throwing away 29 bits of mantissa then you really don't care whether the error is guaranteed to be no more than 1ulp.

    @ubersoldat said:

    Here, just in case some moron googling for this, tries to use that filthy code

    public static Double getDistance(
        		final Double l1, final Double g1, 
        		final Double l2, final Double g2 ) {
        	
            final Double theta =  g1 - g2;
            final Double d1 = l1 * Math.PI / 180.0;
            final Double d2 = l2 * Math.PI / 180.0;
            final Double d3 = theta * Math.PI / 180.0;
            Double dist = Math.sin(d1) * Math.sin(d2) + Math.cos(d1) * Math.cos(d2) * Math.cos(d3);
            // distance in km
            dist = (Math.acos(dist) * 6356.752);
            
            if(Double.isNaN(dist)){
                dist = 0.0;
            }
            return dist;
        }
    
    Yours isn't much better. You've swapped narrow primitives for wide boxed objects rather than wide primitives; you've replaced a library call which does one multiplication with a hand-rolled degree-to-radian conversion which does a multiplication and a division; you've changed from returning a value based on the mean Earth radius in nautical miles to one based on the polar Earth radius in km with only a comment indicating that you use kilometres; and you've handled out-of-range input to acos by assuming that it's always more than one, so that when the points are almost antipodal they may sometimes claim to be next to each other.


  • Quick question for the GIS guys: how much does the fact that the Earth is not perfectly spherical throw off these calculations?



  • @Nexzus said:

    Quick question for the GIS guys: how much does the fact that the Earth is not perfectly spherical throw off these calculations?

    Spherical?  Wait!  I thought it was flat!

     


  • ♿ (Parody)

    @da Doctah said:

    @Nexzus said:

    Quick question for the GIS guys: how much does the fact that the Earth is not perfectly spherical throw off these calculations?

    Spherical?  Wait!  I thought it was flat!

     

    Flat is consistent with "not perfectly spherical." What's the problem?


  • BINNED

    @mihi said:

    And it does not matter if you use StrictMath or Math - both will lose precision, it is just that StrictMath's loss of precision is platform independent, at the expense of being less precise and slower.

    Well, at least that explains something. Not knowing what it does I assumed it was for high precision math. Which would be another WTF considering that afaict the values aren't anywhere near exact anyway when you don't use the (stupidly complicated) formula for earth's geoid shape.



  • @Nexzus said:

    Quick question for the GIS guys: how much does the fact that the Earth is not perfectly spherical throw off these calculations?

    Go read the page I linked to above if you want "confirmation" but anywhere up to 0.55% or so (but usually less than 0.3%)


  • Discourse touched me in a no-no place

    @Nexzus said:

    Quick question for the GIS guys: how much does the fact that the Earth is not perfectly spherical throw off these calculations?

    Are you talking about the fact that the earth is an oblate spheroid, or the fact that it's not smooth? If the former, see the FAQ on the same site as Hmmmm's link - see 5.1b.



    If the latter then you need to start looking into the geoid of the earth.



  • @boomzilla said:

    @da Doctah said:
    @Nexzus said:
    Quick question for the GIS guys: how much does the fact that the Earth is not perfectly spherical throw off these calculations?
    Spherical? Wait! I thought it was flat!
    Flat is consistent with "not perfectly spherical." What's the problem?
    Well, it depends on where your parser assumes the implicit parentheses to belong...

    (not perfectly) spherical

    versus

    not (perfectly spherical)

    makes quite a difference in meaning.


  • ♿ (Parody)

    @Anonymouse said:

    @boomzilla said:

    @da Doctah said:
    @Nexzus said:
    Quick question for the GIS guys: how much does the fact that the Earth is not perfectly spherical throw off these calculations?
    Spherical? Wait! I thought it was flat!
    Flat is consistent with "not perfectly spherical." What's the problem?
    Well, it depends on where your parser assumes the implicit parentheses to belong...

    (not perfectly) spherical

    versus

    not (perfectly spherical)

    makes quite a difference in meaning.

    Yes, but both meanings are still not inconsistent with flat. It's like how cold winters and mild winters are not inconsistent with Global Warming.



  • @boomzilla said:

    Anti-no, although not all meanings except these two are non-discontinually not inconsistent with unround.
     

    Am I doing it right?



  • @boomzilla said:

    @Anonymouse said:
    "(not perfectly) spherical" versus "not (perfectly spherical)" makes quite a difference in meaning.
    Yes, but both meanings are still not inconsistent with flat.
    I would agree on the second meaning. That would match any object that is not a perfect sphere - like your head, a potato, my desk lamp or indeed something flat. But - and that is the point I was trying to make - the first meaning would only match something that is still roughly (yet not perfectly) spherical - like your head or a potato, but not my desk lamp, and definitely not anything that a sane person might reasonably consider "flat".


  • ♿ (Parody)

    @dhromed said:

    @boomzilla said:
    Anti-no, although not all meanings except these two are non-discontinually not inconsistent with unround.

    Am I doing it right?

    Uh...probably? I haven't had coffee yet, so I'm having trouble unwinding your words.


  • ♿ (Parody)

    @Anonymouse said:

    @boomzilla said:
    @Anonymouse said:
    "(not perfectly) spherical" versus "not (perfectly spherical)" makes quite a difference in meaning.

    Yes, but both meanings are still not inconsistent with flat.

    I would agree on the second meaning. That would match any object that is not a perfect sphere - like your head, a potato, my desk lamp or indeed something flat. But - and that is the point I was trying to make - the first meaning would only match something that is still roughly (yet not perfectly) spherical - like your head or a potato, but not my desk lamp, and definitely not anything that a sane person might reasonably consider "flat".

    I'm not sure what shape is "less perfectly" spherical than something that is flat. Sorry, I was just making fun of the weasel wording that is "consistent with." I'll stop now. This has already gone several posts too far.


  • BINNED

    @boomzilla said:

    I'm having trouble unwinding your words.
    I thought that was the point.


Log in to reply