3
votes

I need to calculate the shortest distance between two point matrices. I am new to R and have no clue how to do this. This is the code that I used to read in the data and convert it to points:

library(dismo)
laurus <- gbif("Laurus", "nobilis")
locs <- subset(laurus, select = c("country", "lat", "lon"))
# UK observations
locs.uk <- subset(locs, country == "United Kingdom")
# Ireland observations
locs.ire <- subset(locs, country == "Ireland")

uk_coord <- SpatialPoints(locs.uk[, c("lon", "lat")])
ire_coord <- SpatialPoints(locs.ire[, c("lon", "lat")])
crs.geo <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84")  # geographical, datum WGS84
proj4string(uk_coord) <- crs.geo   # define projection
proj4string(ire_coord) <- crs.geo  # define projection

I need to calculate the shortest (Euclidean) distance from points in Ireland to points in the UK. In other words, I need to calculate the distance from each point in Ireland to its closest point in the UK points layer. Can someone tell me what function or package I need to use in order to do this? I looked at gdistance and could not find a function that calculates the shortest distance.

3
Distance along a road and path network, straight line Euclidean distance (okay for small areas of the globe), or geodesic great-circle distance (for anything bigger than an average country). Sounds like an urban setting, so I guess either network or straight line... Also, you should make a reproducible example by giving us some data or generating it with random numbers. - Spacedman
@Spacedman. All the points are in the city of New York, so I think Euclidean distance would be fine. Would you want me to upload the dataset so you guys can take a look? - rrodrigorn0

3 Answers

5
votes

You can use the FNN package, which uses spatial trees to make the search efficient. It works with Euclidean geometry, so you should first transform your points to a planar coordinate system. I'll use the rgdal package to convert to the UK grid reference (stretching it a bit to use it over Ireland here, but your original data was for New York and you should use a New York planar coordinate system for that):

> require(rgdal)
> uk_coord = spTransform(uk_coord, CRS("+init=epsg:27700"))
> ire_coord = spTransform(ire_coord, CRS("+init=epsg:27700"))

Now we can use FNN:

> require(FNN)
> g = get.knnx(coordinates(uk_coord), coordinates(ire_coord),k=1)
> str(g)
List of 2
 $ nn.index: int [1:69, 1] 202 488 202 488 253 253 488 253 253 253 ...
 $ nn.dist : num [1:69, 1] 232352 325375 87325 251770 203863 ...

g is a list holding, for each of the 69 Irish points, the index of the nearest UK point (nn.index) and the distance to it (nn.dist). The distances are in metres because the coordinate system is in metres.
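
To make the structure of that result concrete, here is a minimal base-R sketch of what a k = 1 nearest-neighbour search computes, done by brute force rather than with FNN's trees. The coordinates and the names uk_xy, ire_xy and nearest are made up for illustration; they are not the GBIF data.

```r
# Made-up planar coordinates in metres (hypothetical, not the GBIF data)
uk_xy  <- cbind(x = c(0, 1000, 5000), y = c(0, 0, 0))
ire_xy <- cbind(x = c(900, 4000), y = c(100, 0))

# Brute force: for each query point, compute the distance to every
# reference point, then keep the position and value of the smallest one
nearest <- t(apply(ire_xy, 1, function(p) {
  d <- sqrt((uk_xy[, 1] - p[1])^2 + (uk_xy[, 2] - p[2])^2)
  c(index = which.min(d), dist = min(d))
}))

nearest  # one row per query point: index of nearest reference point, distance
```

The two columns play the same role as nn.index and nn.dist. This approach compares every pair of points, so it is only sensible for small data sets; the tree-based search is the better choice for large ones.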

You can illustrate this by plotting the points, then joining Irish point 1 to UK point 202, Irish point 2 to UK point 488, Irish point 3 to UK point 202, etc. In code:

> plot(uk_coord, col=2, xlim=c(-1e5,6e5))
> plot(ire_coord, add=TRUE)
> ire_xy = coordinates(ire_coord)
> uk_xy = coordinates(uk_coord)[g$nn.index[,1], , drop=FALSE]
> segments(ire_xy[,1], ire_xy[,2], uk_xy[,1], uk_xy[,2])

[Figure: nearest neighbours]

1
votes

gDistance() from the rgeos package will give you the full distance matrix between the two sets of points:

library(rgeos)
gDistance(uk_coord, ire_coord, byid = TRUE)
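
A distance matrix still has to be reduced to one nearest distance per point. A minimal sketch of that step, using a small made-up matrix as a stand-in for the gDistance() result. The values, the dimnames and the assumption that rows are Irish points and columns are UK points are all illustrative; check dim() on the real result to see which way round it is.

```r
# Stand-in for the matrix returned by gDistance(..., byid = TRUE).
# Values are made up; rows are ASSUMED to be Irish points, columns UK points.
d <- matrix(c(5, 2, 9,
              7, 8, 3),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("ire1", "ire2"), c("uk1", "uk2", "uk3")))

nearest_dist <- apply(d, 1, min)        # smallest distance in each row
nearest_uk   <- apply(d, 1, which.min)  # column position of that minimum
```

If the real matrix turns out to have the Irish points in the columns, use margin 2 instead of 1 in both apply() calls.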

Another option is nncross() from the spatstat package. Pro: it gives the distance to the nearest neighbour directly. Con: you'll need to convert the SpatialPoints objects to spatstat point patterns (class ppp; see ?as.ppp in spatstat).

library(spatstat)
nncross(ire.ppp, uk.ppp)  # for each Irish point, the nearest UK point

0
votes

The geosphere package offers a lot of dist* functions to compute distances between lat/lon points. In your example, you could try:

 require(geosphere)
 # get the lon/lat coordinates of the UK and Ireland points
 pointuk <- uk_coord@coords
 pointire <- ire_coord@coords
 # prepare a vector which will contain the minimum distance for each Ireland point
 res <- numeric(nrow(pointire))
 # for each Ireland point, keep the smallest great-circle distance to any UK point
 for (i in seq_along(res)) res[i] <- min(distHaversine(pointire[i, , drop = FALSE], pointuk))

The distances you'll obtain are in metres (you can change the unit by setting the radius of the earth, the r argument, in the call to distHaversine).
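
For readers who want to see what the great-circle calculation involves, here is a minimal base-R sketch of the haversine formula. The function name haversine and the sample points are made up for illustration, and the default radius of 6378137 m is what I understand distHaversine to use by default; treat this as a sketch of the technique, not as geosphere's implementation.

```r
# Haversine great-circle distance in base R.
# p is c(lon, lat) in degrees; pts is a two-column matrix (lon, lat);
# r is the sphere radius, so the result is in the same unit as r.
haversine <- function(p, pts, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (pts[, 2] - p[2]) * to_rad
  dlon <- (pts[, 1] - p[1]) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(p[2] * to_rad) * cos(pts[, 2] * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))  # pmin guards against rounding above 1
}

# Made-up lon/lat points: roughly Dublin, then roughly London and Edinburgh
ire <- rbind(c(-6.26, 53.35))
uk  <- rbind(c(-0.13, 51.51), c(-3.19, 55.95))

# Minimum great-circle distance from each Irish point to any UK point
min_dist <- apply(ire, 1, function(p) min(haversine(p, uk)))
```

Passing r = 6378.137 would return kilometres instead of metres, which is the same idea as changing the radius in distHaversine.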

The problem with gDistance and other rgeos functions is that they evaluate the distance as if the coordinates were planar. With unprojected lon/lat coordinates, the number you obtain is not very useful.