2
votes

I have a known point (centre) with known latitude and longitude and I have some coordinates (in lat/lon) in a df and I want to see which ones are within a radius of 5km or less from the centre point.

Centre

mylon <- c(-2.106472)
mylat <- c(57.14455)

Coordinates

longitude latitude
 [1,] -2.141177 57.16278
 [2,] -2.090960 57.18079
 [3,] -2.118894 57.12292
 [4,] -2.140090 57.13763
 [5,] -2.113988 57.13855
 [6,] -2.123892 57.13585
 [7,] -2.144685 57.17207
 [8,] -2.220046 57.19150
 [9,] -2.114343 57.09301
[10,] -2.285314 57.20138
[11,] -2.092354 57.14279
[12,] -2.149571 57.15334
[13,] -2.126605 57.15615
[14,] -2.097045 57.10443
[15,] -2.183441 57.15051
[16,] -2.166915 57.15089
[17,] -2.133863 57.15201
[18,] -2.100909 57.18968
[19,] -2.106770 57.15670
[20,] -2.251495 57.19315
[21,] -2.118894 57.12292
[22,] -2.140090 57.13763
[23,] -2.123201 57.12686
[24,] -2.114343 57.09301
[25,] -2.140327 57.15676
[26,] -2.148826 57.17355
[27,] -2.120553 57.12507
[28,] -2.133902 57.16279
[29,] -2.094246 57.17180
[30,] -2.113170 57.14125
[31,] -2.251495 57.19315
[32,] -2.090960 57.18079
[33,] -2.212955 57.10941
[34,] -2.118894 57.12292
[35,] -2.183501 57.19596
[36,] -2.140090 57.13763
[37,] -2.249217 57.10063
[38,] -2.123201 57.12686
[39,] -2.114343 57.09301
[40,] -2.092354 57.14279
[41,] -2.148826 57.17355
[42,] -2.120553 57.12507
[43,] -2.117338 57.15301
[44,] -2.116486 57.14484
[45,] -2.094981 57.13614
[46,] -2.232998 57.14629
[47,] -2.118894 57.12292
[48,] -2.140090 57.13763
[49,] -2.123201 57.12686
[50,] -2.104092 57.14485
[51,] -2.114343 57.09301
[52,] -2.148826 57.17355
[53,] -2.175179 57.15079
[54,] -2.090713 57.14755
[55,] -2.090960 57.18079
[56,] -2.118894 57.12292
[57,] -2.140090 57.13763

I would appreciate any help. Thanks a lot

3
geosphere::distHaversine(coord, c(-2.106472, 57.14455)) / 1000 < 5alistaire
I think these points are in degrees, and the Haversine formula takes points in radians as an argument. So don't we need to first convert from degrees to radians?Rich Pauloo
@RichPauloo No, it takes longitudes and latitudes, which are pretty much always expressed in degrees. Map projections are an issue, but that's beyond the scope of the question.alistaire
BOth answers are nice in their own way. I will go with the shortest which is one line of code. CheersV.MAT

3 Answers

4
votes

This can be done with just the distance formula and conversion, but once the distances start to get bigger, accuracy will get very bad. It's possible to code your own more rigorous approach, but it's much easier to use the geosphere package, which offers various functions for determining geospatial distance with varying degrees of accuracy and computational power required. distHaversine is a good starting point:

coord <- cbind("longitude" = c(-2.141177, -2.09096, -2.118894, -2.14009, -2.113988, -2.123892, -2.144685, -2.220046, -2.114343, -2.285314, -2.092354, -2.149571, -2.126605, -2.097045, -2.183441, -2.166915, -2.133863, -2.100909, -2.10677, -2.251495, -2.118894, -2.14009, -2.123201, -2.114343, -2.140327, -2.148826, -2.120553, -2.133902, -2.094246, -2.11317, -2.251495, -2.09096, -2.212955, -2.118894, -2.183501, -2.14009, -2.249217, -2.123201, -2.114343, -2.092354, -2.148826, -2.120553, -2.117338, -2.116486, -2.094981, -2.232998, -2.118894, -2.14009, -2.123201, -2.104092, -2.114343, -2.148826, -2.175179, -2.090713, -2.09096, -2.118894, -2.14009), 
               "latitude" = c(57.16278, 57.18079, 57.12292, 57.13763, 57.13855, 57.13585, 57.17207, 57.1915, 57.09301, 57.20138, 57.14279, 57.15334, 57.15615, 57.10443, 57.15051, 57.15089, 57.15201, 57.18968, 57.1567, 57.19315, 57.12292, 57.13763, 57.12686, 57.09301, 57.15676, 57.17355, 57.12507, 57.16279, 57.1718, 57.14125, 57.19315, 57.18079, 57.10941, 57.12292, 57.19596, 57.13763, 57.10063, 57.12686, 57.09301, 57.14279, 57.17355, 57.12507, 57.15301, 57.14484, 57.13614, 57.14629, 57.12292, 57.13763, 57.12686, 57.14485, 57.09301, 57.17355, 57.15079, 57.14755, 57.18079, 57.12292, 57.13763))

str(coord)    # like data above, a matrix, not a data.frame
#>  num [1:57, 1:2] -2.14 -2.09 -2.12 -2.14 -2.11 ...
#>  - attr(*, "dimnames")=List of 2
#>   ..$ : NULL
#>   ..$ : chr [1:2] "longitude" "latitude"

# make a data.frame to hold both numeric and logical values
coord_df <- data.frame(coord, 
                       within_5km = geosphere::distHaversine(
                           coord, 
                           c(-2.106472, 57.14455)
                       ) / 1000 < 5)    # convert m to km, check < 5
str(coord_df)
#> 'data.frame':    57 obs. of  3 variables:
#>  $ longitude : num  -2.14 -2.09 -2.12 -2.14 -2.11 ...
#>  $ latitude  : num  57.2 57.2 57.1 57.1 57.1 ...
#>  $ within_5km: logi  TRUE TRUE TRUE TRUE TRUE TRUE ...

table(coord_df$within_5km)
#> 
#> FALSE  TRUE 
#>    13    44
6
votes

We can use the st_distance function from the sf package to achieve this task. sf is the next generation data type in R to handle spatial data. The following example shows how to create two sf object, point_sf: the reference, and target_sf: the target points for examination. After that, it is easy to apply the st_distance function. target_sf2 is the final output, which contains 44 records that are within 5 km of the reference point.

library(tidyverse)
library(sf)

## Create reference point data
point <- data_frame(mylon = -2.106472, mylat = 57.14455) 
# Specify the source of X and Y coordinates
point_sf <- st_as_sf(point, coords = c("mylon", "mylat"))
# Set the projection to EPSG 4326 (long-lat)
st_crs(point_sf) <- 4326

## Create target point data
target <- read.table(text = "longitude latitude
-2.141177 57.16278
-2.090960 57.18079
-2.118894 57.12292
-2.140090 57.13763
-2.113988 57.13855
-2.123892 57.13585
-2.144685 57.17207
-2.220046 57.19150
-2.114343 57.09301
-2.285314 57.20138
-2.092354 57.14279
-2.149571 57.15334
-2.126605 57.15615
-2.097045 57.10443
-2.183441 57.15051
-2.166915 57.15089
-2.133863 57.15201
-2.100909 57.18968
-2.106770 57.15670
-2.251495 57.19315
-2.118894 57.12292
-2.140090 57.13763
-2.123201 57.12686
-2.114343 57.09301
-2.140327 57.15676
-2.148826 57.17355
-2.120553 57.12507
-2.133902 57.16279
-2.094246 57.17180
-2.113170 57.14125
-2.251495 57.19315
-2.090960 57.18079
-2.212955 57.10941
-2.118894 57.12292
-2.183501 57.19596
-2.140090 57.13763
-2.249217 57.10063
-2.123201 57.12686
-2.114343 57.09301
-2.092354 57.14279
-2.148826 57.17355
-2.120553 57.12507
-2.117338 57.15301
-2.116486 57.14484
-2.094981 57.13614
-2.232998 57.14629
-2.118894 57.12292
-2.140090 57.13763
-2.123201 57.12686
-2.104092 57.14485
-2.114343 57.09301
-2.148826 57.17355
-2.175179 57.15079
-2.090713 57.14755
-2.090960 57.18079
-2.118894 57.12292
-2.140090 57.13763",
                     header = TRUE, stringsAsFactors = FALSE)

# Specify the source of X and Y coordinates
target_sf <- st_as_sf(target, coords = c("longitude", "latitude"))
# Set the projection to EPSG 4326 (long-lat)
st_crs(target_sf) <- 4326

# Calculate the distance
target_sf2 <- target_sf %>%
  mutate(Dist = as.numeric(st_distance(point_sf, target_sf, by_element = TRUE))) %>%
  # Filter the records with Dist <= 5000 (m)
  filter(Dist <= 5000)
0
votes

On this link on the Google Maps API help you can find some explanation and some SQL code to look up something within a range of n miles.

All you have to do to use it within 5 km is to convert km to miles. So you will end up with:

SELECT id, 
    ( 3959 * acos( cos( radians(37) ) * cos( radians( lat ) ) * cos( radians( lng ) - radians(-122) ) + sin( radians(37) ) * sin( radians( lat ) ) ) ) AS distance 
FROM markers 
HAVING distance < (5 /*km*/ / 0.621371192 /*convert to miles*/) 
ORDER BY distance