235
votes

I currently have just under a million locations in a mysql database all with longitude and latitude information.

I am trying to find the distance between one point and many other points via a query. It's not as fast as I want it to be especially with 100+ hits a second.

Is there a faster query or possibly a faster system other than mysql for this? I'm using this query:

SELECT 
  name, 
   ( 3959 * acos( cos( radians(42.290763) ) * cos( radians( locations.lat ) ) 
   * cos( radians(locations.lng) - radians(-71.35368)) + sin(radians(42.290763)) 
   * sin( radians(locations.lat)))) AS distance 
FROM locations 
WHERE active = 1 
HAVING distance < 10 
ORDER BY distance;

Note: The provided distance is in Miles. If you need Kilometers, use 6371 instead of 3959.

15
The formula you give seems to have a lot of elements that are constant. Is it possible to pre-compute data and store those values as well in your DB? For example 3959 * acos( cos( radians(42.290763) ) is a constant but has 4 major computations in it. Instead could you just store 6696.7837?Peter M
Or at least pre-compute constants outside of the query? That will cut down on the work that has to be done.Peter M
@Peter M It seems likely that any decent SQL database would optimize so that was computed only once.mhenry1384
For those wondering, 42.290763 is the latitude and -71.35368 is the longitude of the point from which to compute the distances.user276648
Just for info ,Distance caluclated by this formula is in miles ,not in kilometers.Please Replace 3959 to 6371 to get results in kilometersSahil

15 Answers

119
votes
  • Create your points using Point values of Geometry data types in MyISAM table. As of Mysql 5.7.5, InnoDB tables now also support SPATIAL indices.

  • Create a SPATIAL index on these points

  • Use MBRContains() to find the values:

    SELECT  *
    FROM    table
    WHERE   MBRContains(LineFromText(CONCAT(
            '('
            , @lon + 10 / ( 111.1 / cos(RADIANS(@lon)))
            , ' '
            , @lat + 10 / 111.1
            , ','
            , @lon - 10 / ( 111.1 / cos(RADIANS(@lat)))
            , ' '
            , @lat - 10 / 111.1 
            , ')' )
            ,mypoint)
    

, or, in MySQL 5.1 and above:

    SELECT  *
    FROM    table
    WHERE   MBRContains
                    (
                    LineString
                            (
                            Point (
                                    @lon + 10 / ( 111.1 / COS(RADIANS(@lat))),
                                    @lat + 10 / 111.1
                                  ),
                            Point (
                                    @lon - 10 / ( 111.1 / COS(RADIANS(@lat))),
                                    @lat - 10 / 111.1
                                  ) 
                            ),
                    mypoint
                    )

This will select all points approximately within the box (@lat +/- 10 km, @lon +/- 10km).

This actually is not a box, but a spherical rectangle: latitude and longitude bound segment of the sphere. This may differ from a plain rectangle on the Franz Joseph Land, but quite close to it on most inhabited places.

  • Apply additional filtering to select everything inside the circle (not the square)

  • Possibly apply additional fine filtering to account for the big circle distance (for large distances)

102
votes

Not a MySql specific answer, but it'll improve the performance of your sql statement.

What you're effectively doing is calculating the distance to every point in the table, to see if it's within 10 units of a given point.

What you can do before you run this sql, is create four points that draw a box 20 units on a side, with your point in the center i.e.. (x1,y1 ) . . . (x4, y4), where (x1,y1) is (givenlong + 10 units, givenLat + 10units) . . . (givenLong - 10units, givenLat -10 units). Actually, you only need two points, top left and bottom right call them (X1, Y1) and (X2, Y2)

Now your SQL statement use these points to exclude rows that definitely are more than 10u from your given point, it can use indexes on the latitudes & longitudes, so will be orders of magnitude faster than what you currently have.

e.g.

select . . . 
where locations.lat between X1 and X2 
and   locations.Long between y1 and y2;

The box approach can return false positives (you can pick up points in the corners of the box that are > 10u from the given point), so you still need to calculate the distance of each point. However this again will be much faster because you have drastically limited the number of points to test to the points within the box.

I call this technique "Thinking inside the box" :)

EDIT: Can this be put into one SQL statement?

I have no idea what mySql or Php is capable of, sorry. I don't know where the best place is to build the four points, or how they could be passed to a mySql query in Php. However, once you have the four points, there's nothing stopping you combining your own SQL statement with mine.

select name, 
       ( 3959 * acos( cos( radians(42.290763) ) 
              * cos( radians( locations.lat ) ) 
              * cos( radians( locations.lng ) - radians(-71.35368) ) 
              + sin( radians(42.290763) ) 
              * sin( radians( locations.lat ) ) ) ) AS distance 
from locations 
where active = 1 
and locations.lat between X1 and X2 
and locations.Long between y1 and y2
having distance < 10 ORDER BY distance;

I know with MS SQL I can build a SQL statement that declares four floats (X1, Y1, X2, Y2) and calculates them before the "main" select statement, like I said, I've no idea if this can be done with MySql. However I'd still be inclined to build the four points in C# and pass them as parameters to the SQL query.

Sorry I can't be more help, if anyone can answer the MySQL & Php specific portions of this, feel free to edit this answer to do so.

22
votes

I needed to solve similar problem (filtering rows by distance from single point) and by combining original question with answers and comments, I came up with solution which perfectly works for me on both MySQL 5.6 and 5.7.

SELECT 
    *,
    (6371 * ACOS(COS(RADIANS(56.946285)) * COS(RADIANS(Y(coordinates))) 
    * COS(RADIANS(X(coordinates)) - RADIANS(24.105078)) + SIN(RADIANS(56.946285))
    * SIN(RADIANS(Y(coordinates))))) AS distance
FROM places
WHERE MBRContains
    (
    LineString
        (
        Point (
            24.105078 + 15 / (111.320 * COS(RADIANS(56.946285))),
            56.946285 + 15 / 111.133
        ),
        Point (
            24.105078 - 15 / (111.320 * COS(RADIANS(56.946285))),
            56.946285 - 15 / 111.133
        )
    ),
    coordinates
    )
HAVING distance < 15
ORDER By distance

coordinates is field with type POINT and has SPATIAL index
6371 is for calculating distance in kilometres
56.946285 is latitude for central point
24.105078 is longitude for central point
15 is maximum distance in kilometers

In my tests, MySQL uses SPATIAL index on coordinates field to quickly select all rows which are within rectangle and then calculates actual distance for all filtered places to exclude places from rectangles corners and leave only places inside circle.

This is visualisation of my result:

map

Gray stars visualise all points on map, yellow stars are ones returned by MySQL query. Gray stars inside corners of rectangle (but outside circle) were selected by MBRContains() and then deselected by HAVING clause.

14
votes

The following MySQL function was posted on this blog post. I haven't tested it much, but from what I gathered from the post, if your latitude and longitude fields are indexed, this may work well for you:

DELIMITER $$

DROP FUNCTION IF EXISTS `get_distance_in_miles_between_geo_locations` $$
CREATE FUNCTION get_distance_in_miles_between_geo_locations(
  geo1_latitude decimal(10,6), geo1_longitude decimal(10,6), 
  geo2_latitude decimal(10,6), geo2_longitude decimal(10,6)) 
returns decimal(10,3) DETERMINISTIC
BEGIN
  return ((ACOS(SIN(geo1_latitude * PI() / 180) * SIN(geo2_latitude * PI() / 180) 
    + COS(geo1_latitude * PI() / 180) * COS(geo2_latitude * PI() / 180) 
    * COS((geo1_longitude - geo2_longitude) * PI() / 180)) * 180 / PI()) 
    * 60 * 1.1515);
END $$

DELIMITER ;

Sample usage:

Assuming a table called places with fields latitude & longitude:

SELECT get_distance_in_miles_between_geo_locations(-34.017330, 22.809500,
latitude, longitude) AS distance_from_input FROM places;
12
votes

if you are using MySQL 5.7.*, then you can use st_distance_sphere(POINT, POINT).

Select st_distance_sphere(POINT(-2.997065, 53.404146 ), POINT(58.615349, 23.56676 ))/1000  as distcance
9
votes
SELECT * FROM (SELECT *,(((acos(sin((43.6980168*pi()/180)) * 
sin((latitude*pi()/180))+cos((43.6980168*pi()/180)) * 
cos((latitude*pi()/180)) * cos(((7.266903899999988- longitude)* 
pi()/180))))*180/pi())*60*1.1515 ) as distance 
FROM wp_users WHERE 1 GROUP BY ID limit 0,10) as X 
ORDER BY ID DESC

This is the distance calculation query between to points in MySQL, I have used it in a long database, it it working perfect! Note: do the changes (database name, table name, column etc) as per your requirements.

8
votes
set @latitude=53.754842;
set @longitude=-2.708077;
set @radius=20;

set @lng_min = @longitude - @radius/abs(cos(radians(@latitude))*69);
set @lng_max = @longitude + @radius/abs(cos(radians(@latitude))*69);
set @lat_min = @latitude - (@radius/69);
set @lat_max = @latitude + (@radius/69);

SELECT * FROM postcode
WHERE (longitude BETWEEN @lng_min AND @lng_max)
AND (latitude BETWEEN @lat_min and @lat_max);

source

7
votes
   select
   (((acos(sin(('$latitude'*pi()/180)) * sin((`lat`*pi()/180))+cos(('$latitude'*pi()/180)) 
    * cos((`lat`*pi()/180)) * cos((('$longitude'- `lng`)*pi()/180))))*180/pi())*60*1.1515) 
    AS distance
    from table having distance<22;
5
votes

A MySQL function which returns the number of metres between the two coordinates:

CREATE FUNCTION DISTANCE_BETWEEN (lat1 DOUBLE, lon1 DOUBLE, lat2 DOUBLE, lon2 DOUBLE)
RETURNS DOUBLE DETERMINISTIC
RETURN ACOS( SIN(lat1*PI()/180)*SIN(lat2*PI()/180) + COS(lat1*PI()/180)*COS(lat2*PI()/180)*COS(lon2*PI()/180-lon1*PI()/180) ) * 6371000

To return the value in a different format, replace the 6371000 in the function with the radius of Earth in your choice of unit. For example, kilometres would be 6371 and miles would be 3959.

To use the function, just call it as you would any other function in MySQL. For example, if you had a table city, you could find the distance between every city to every other city:

SELECT
    `city1`.`name`,
    `city2`.`name`,
    ROUND(DISTANCE_BETWEEN(`city1`.`latitude`, `city1`.`longitude`, `city2`.`latitude`, `city2`.`longitude`)) AS `distance`
FROM
    `city` AS `city1`
JOIN
    `city` AS `city2`
4
votes

The full code with details about how to install as MySQL plugin are here: https://github.com/lucasepe/lib_mysqludf_haversine

I posted this last year as comment. Since kindly @TylerCollier suggested me to post as answer, here it is.

Another way is to write a custom UDF function that returns the haversine distance from two points. This function can take in input:

lat1 (real), lng1 (real), lat2 (real), lng2 (real), type (string - optinal - 'km', 'ft', 'mi')

So we can write something like this:

SELECT id, name FROM MY_PLACES WHERE haversine_distance(lat1, lng1, lat2, lng2) < 40;

to fetch all records with a distance less then 40 kilometers. Or:

SELECT id, name FROM MY_PLACES WHERE haversine_distance(lat1, lng1, lat2, lng2, 'ft') < 25;

to fetch all records with a distance less then 25 feet.

The core function is:

double
haversine_distance( UDF_INIT* initid, UDF_ARGS* args, char* is_null, char *error ) {
    double result = *(double*) initid->ptr;
    /*Earth Radius in Kilometers.*/ 
    double R = 6372.797560856;
    double DEG_TO_RAD = M_PI/180.0;
    double RAD_TO_DEG = 180.0/M_PI;
    double lat1 = *(double*) args->args[0];
    double lon1 = *(double*) args->args[1];
    double lat2 = *(double*) args->args[2];
    double lon2 = *(double*) args->args[3];
    double dlon = (lon2 - lon1) * DEG_TO_RAD;
    double dlat = (lat2 - lat1) * DEG_TO_RAD;
    double a = pow(sin(dlat * 0.5),2) + 
        cos(lat1*DEG_TO_RAD) * cos(lat2*DEG_TO_RAD) * pow(sin(dlon * 0.5),2);
    double c = 2.0 * atan2(sqrt(a), sqrt(1-a));
    result = ( R * c );
    /*
     * If we have a 5th distance type argument...
     */
    if (args->arg_count == 5) {
        str_to_lowercase(args->args[4]);
        if (strcmp(args->args[4], "ft") == 0) result *= 3280.8399;
        if (strcmp(args->args[4], "mi") == 0) result *= 0.621371192;
    }

    return result;
}
3
votes

A fast, simple and accurate (for smaller distances) approximation can be done with a spherical projection. At least in my routing algorithm I get a 20% boost compared to the correct calculation. In Java code it looks like:

public double approxDistKm(double fromLat, double fromLon, double toLat, double toLon) {
    double dLat = Math.toRadians(toLat - fromLat);
    double dLon = Math.toRadians(toLon - fromLon);
    double tmp = Math.cos(Math.toRadians((fromLat + toLat) / 2)) * dLon;
    double d = dLat * dLat + tmp * tmp;
    return R * Math.sqrt(d);
}

Not sure about MySQL (sorry!).

Be sure you know about the limitation (the third param of assertEquals means the accuracy in kilometers):

    float lat = 24.235f;
    float lon = 47.234f;
    CalcDistance dist = new CalcDistance();
    double res = 15.051;
    assertEquals(res, dist.calcDistKm(lat, lon, lat - 0.1, lon + 0.1), 1e-3);
    assertEquals(res, dist.approxDistKm(lat, lon, lat - 0.1, lon + 0.1), 1e-3);

    res = 150.748;
    assertEquals(res, dist.calcDistKm(lat, lon, lat - 1, lon + 1), 1e-3);
    assertEquals(res, dist.approxDistKm(lat, lon, lat - 1, lon + 1), 1e-2);

    res = 1527.919;
    assertEquals(res, dist.calcDistKm(lat, lon, lat - 10, lon + 10), 1e-3);
    assertEquals(res, dist.approxDistKm(lat, lon, lat - 10, lon + 10), 10);
3
votes

Here is a very detailed description of Geo Distance Search with MySQL a solution based on implementation of Haversine Formula to mysql. The complete solution description with theory, implementation and further performance optimization. Although the spatial optimization part didn't work correct in my case. http://www.scribd.com/doc/2569355/Geo-Distance-Search-with-MySQL

3
votes

Have a read of Geo Distance Search with MySQL, a solution based on implementation of Haversine Formula to MySQL. This is a complete solution description with theory, implementation and further performance optimization. Although the spatial optimization part didn't work correctly in my case.

I noticed two mistakes in this:

  1. the use of abs in the select statement on p8. I just omitted abs and it worked.

  2. the spatial search distance function on p27 does not convert to radians or multiply longitude by cos(latitude), unless his spatial data is loaded with this in consideration (cannot tell from context of article), but his example on p26 indicates that his spatial data POINT is not loaded with radians or degrees.

0
votes
$objectQuery = "SELECT table_master.*, ((acos(sin((" . $latitude . "*pi()/180)) * sin((`latitude`*pi()/180))+cos((" . $latitude . "*pi()/180)) * cos((`latitude`*pi()/180)) * cos(((" . $longitude . "- `longtude`)* pi()/180))))*180/pi())*60*1.1515  as distance FROM `table_post_broadcasts` JOIN table_master ON table_post_broadcasts.master_id = table_master.id WHERE table_master.type_of_post ='type' HAVING distance <='" . $Radius . "' ORDER BY distance asc";
0
votes

Using mysql

SET @orig_lon = 1.027125;
SET @dest_lon = 1.027125;

SET @orig_lat = 2.398441;
SET @dest_lat = 2.398441;

SET @kmormiles = 6371;-- for distance in miles set to : 3956

SELECT @kmormiles * ACOS(LEAST(COS(RADIANS(@orig_lat)) * 
 COS(RADIANS(@dest_lat)) * COS(RADIANS(@orig_lon - @dest_lon)) + 
 SIN(RADIANS(@orig_lat)) * SIN(RADIANS(@dest_lat)),1.0)) as distance;

See: https://andrew.hedges.name/experiments/haversine/

See: https://stackoverflow.com/a/24372831/5155484

See: http://www.plumislandmedia.net/mysql/haversine-mysql-nearest-loc/

NOTE: LEAST is used to avoid null values as a comment suggested on https://stackoverflow.com/a/24372831/5155484