71
votes

I need to implement a Geo proximity search in my application but I'm very confused regarding the correct formula to use. After some searches in the Web and in StackOverflow I found that the solutions are:

  1. Use the Haversine Formula
  2. Use the Great-Circle Distance Formula
  3. Use a Spatial Search Engine in the Database

Option #3 is really not an option for me ATM. Now I'm a little confused since I always though that the Great-Circle Distance Formula and Haversine Formula were synonymous but apparently I was wrong?

Haversine Formula

The above screen shot was taken from the awesome Geo (proximity) Search with MySQL paper, and uses the following functions:

ASIN, SQRT, POWER, SIN, PI, COS

I've also seen variations from the same formula (Spherical Law of Cosines), like this one:

(3956 * ACOS(COS(RADIANS(o_lat)) * COS(RADIANS(d_lat)) * COS(RADIANS(d_lon) - RADIANS(o_lon)) + SIN(RADIANS(o_lat)) * SIN(RADIANS(d_lat))))

That uses the following functions:

ACOS, COS, RADIANS, SIN

I am not a math expert, but are these formulas the same? I've come across some more variations, and formulas (such as the Spherical Law of Cosines and the Vincenty's formulae - which seems to be the most accurate) and that makes me even more confused...

I need to choose a good general purpose formula to implement in PHP / MySQL. Can anyone explain me the differences between the formulas I mentioned above?

  • Which one is the fastest to compute?
  • Which one provides the most accurate results?
  • Which one is the best in terms of speed / accuracy of results?

I appreciate your insight on these questions.


Based on theonlytheory answer I tested the following Great-Circle Distance Formulas:

  • Vincenty Formula
  • Haversine Formula
  • Spherical Law of Cosines

The Vincenty Formula is dead slow, however it's pretty accurate (down to 0.5 mm).

The Haversine Formula is way faster than the Vincenty Formula, I was able to run 1 million calculations in about 6 seconds which is pretty much acceptable for my needs.

The Spherical Law of Cosines Formula revealed to be almost twice as fast as the Haversine Formula, and the precision difference is neglectfulness for most usage cases.


Here are some test locations:

  • Google HQ (37.422045, -122.084347)
  • San Francisco, CA (37.77493, -122.419416)
  • Eiffel Tower, France (48.8582, 2.294407)
  • Opera House, Sydney (-33.856553, 151.214696)

Google HQ - San Francisco, CA:

  • Vincenty Formula: 49 087.066 meters
  • Haversine Formula: 49 103.006 meters
  • Spherical Law of Cosines: 49 103.006 meters

Google HQ - Eiffel Tower, France:

  • Vincenty Formula: 8 989 724.399 meters
  • Haversine Formula: 8 967 042.917 meters
  • Spherical Law of Cosines: 8 967 042.917 meters

Google HQ - Opera House, Sydney:

  • Vincenty Formula: 11 939 773.640 meters
  • Haversine Formula: 11 952 717.240 meters
  • Spherical Law of Cosines: 11 952 717.240 meters

As you can see there is no noticeable difference between the Haversine Formula and the Spherical Law of Cosines, however both have distance offsets as high as 22 kilometers compared to the Vincenty Formula because it uses an ellipsoidal approximation of the earth instead of a spherical one.

2
I got a similar issue long ago, in a project that i never started.. in my notes i found this formula: AB=sqrt(pow(($Xb-$Xa),2)+pow(($Yb-$Ya),2)));, i never understood exactly what it do.. hope can help you ;)Strae
@DaNieL: Check the theonlytheory answer, he explains when you should use the formula you provided. =)Alix Axel
FWIW: for those who are confused by "3956" that is the radius of earth in miles, however according to wikipedia this should be approximated to 3959 en.wikipedia.org/wiki/Earth_radius should you wish to work with distances in KM that would be 6371 instead. I'm not sure if this impacts your answer @Alix but depending which # you used your units might be off and instead should be in miles.Chris Marisic
If your 'search range' is small enough (< 500 km) you can just use Pythagoras to order the results. The distances won't be acurate (the world isn't flat and that's too bad) but the order of distances will be.Rudie
Btw: isn't orig.lat - dest.lat faulty with coordinates [-180, 180]? What happens if orig.lat = -170 and dest.lat = 170? The distance is 340 deg? No, it's actually just 20. How do you solve that if you're working with actual earth (atlas) coordinates?Rudie

2 Answers

36
votes

The Law of Cosines and the Haversine Formula will give identical results assuming a machine with infinite precision. The Haversine formula is more robust to floating point errors. However, today's machines have double precision of the order of 15 significant figures, and the law of cosines may work just fine for you. Both these formulas assume spherical earth, whereas Vicenty's iterative solution (most accurate) assumes ellipsoidal earth (in reality the earth is not even an ellipsoid - it is a geoid). Some references: http://www.movable-type.co.uk/scripts/gis-faq-5.1.html

It gets better: note the latitude to be used in the law of cosines as well as the Haversine is the geocentric latitude, which is different from geodetic latitude. For a sphere, these two are the same.

Which one is fastest to compute?

In order from fastest to slowest are: law of cosines (5 trig. calls) -> haversine (involves sqrt) -> Vicenty (have to solve this iteratively in a for loop)

Which one is most accurate?

Vicenty.

Which one is best when speed and accuracy are both considered?

If your problem domain is such that for the distances you are trying to calculate, the earth can be considered as flat, then you can work out (I am not going to give details) a formula of the form x = kx * difference in longitude, y = ky * difference in latitude. Then distance = sqrt(dxdx + dydy). If your problem domain is such that it can be solved with distance squared, then you won't have to take sqrt, and this formula will be as fast as you get possibly get. It has the added advantage that you can calculate the vector distance - x is distance in east direction, and y is distance in the north direction. Otherwise, experiment with the 3 and choose what works best in your situation.

12
votes

So you want to:

  • sort records by distance from p0
  • select only records whose distance from p0 is less than r

The trick is that you don't exactly need to compute the great circle distance for that! You can do with any function from a pair of points to a real value that strictly grows with the great circle distance between the points. There are many such functions and some are much faster to compute than the various formulas for the exact great circle distance. One such function is the Euclidean distance in 3D. Converting latitude and longitude to a 3D point on the sphere doesn't involve inverse trigonometric functions.

Once you have x,Y,Z, you can realize that you don't actually need the distance from p0 to your point, because you can as well use the distance from the tangent plane at p0. That distance also strictly grows with the great circle distance, and is computed from X,Y,Z as a linear combination - not even a square root is needed. You just need to precompute the coefficients and the cutoff distance that corresponds to the desired great circle distance.