I have two data frames, each simply a two-column matrix of coordinates (latitude/longitude). Both look like the input below:

latitude  longitude
27.78833  -82.28197
27.79667  -82.29294

Let's call them "dfref" and "dfnew". I would like to find the nearest point in dfnew for each point in dfref and the distance between the 2 points in meters.

The output would look like this:

dr.latitude  dr.longitude  dn.latitude  dn.longitude  dist
27.78833     -82.28197     27.54345     -82.33233     162.34
27.79667     -82.29294     27.56543     -82.12323     232.23

I have tried using the knn function in the class package and the SearchTrees package, but my script only found the nearest points in the dfref matrix, and I am not sure how to add the distance measurement.

knn1(train=cbind(dfref), test=cbind(dfnew), cl=seq_len(nrow(dfnew))) 

Is there a function that does both efficiently and how can I get this into one script?

1 Answer

I am not an expert in geodesy, but it seems you could start with something like this:

library(data.table)

dfref <- read.table(text = 
"latitude  longitude
27.78833  -82.28197
27.79667  -82.29294", header = TRUE)
dtref <- data.table(dfref)

dfnew <- read.table(text = 
"latitude  longitude
27.54345     -82.33233", header = TRUE)
dtnew <- data.table(dfnew)

# Make the Cartesian product of the two tables.
dtref$fake <- 1
dtnew$fake <- 1
dtall <- merge(dtref, dtnew, by = "fake", allow.cartesian = TRUE)

# Calculate distance.
library(geosphere)
dtall[, distance := distVincentyEllipsoid(c(longitude.x, latitude.x), c(longitude.y, latitude.y)), by = 1:nrow(dtall)]

# Print results.
dtall[, .(latitude.x, longitude.x, latitude.y, longitude.y, distance)]

#      latitude.x longitude.x latitude.y longitude.y distance
# 1:   27.78833   -82.28197   27.54345   -82.33233 27587.29
# 2:   27.79667   -82.29294   27.54345   -82.33233 28328.19
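
The snippet above computes every pairwise distance but does not yet pick the nearest dfnew point for each dfref point. A minimal sketch of that last step follows, assuming both tables are small enough for a full distance matrix to fit in memory. It uses distm from geosphere to build the matrix and base R to select the minimum per row. The second dfnew point is taken from the example output in the question purely for illustration, and the names d, idx and result are mine.

```r
library(geosphere)

# Sample data: dfref from the question; dfnew reuses the two "dn" points
# shown in the question's example output (an assumption for illustration).
dfref <- data.frame(latitude  = c(27.78833, 27.79667),
                    longitude = c(-82.28197, -82.29294))
dfnew <- data.frame(latitude  = c(27.54345, 27.56543),
                    longitude = c(-82.33233, -82.12323))

# Distance matrix in meters: one row per dfref point, one column per dfnew point.
# geosphere expects coordinates in (longitude, latitude) order.
d <- distm(as.matrix(dfref[, c("longitude", "latitude")]),
           as.matrix(dfnew[, c("longitude", "latitude")]),
           fun = distVincentyEllipsoid)

# For each dfref row, the column index of the closest dfnew point.
idx <- apply(d, 1, which.min)

# Assemble the layout requested in the question.
result <- data.frame(
  dr.latitude  = dfref$latitude,
  dr.longitude = dfref$longitude,
  dn.latitude  = dfnew$latitude[idx],
  dn.longitude = dfnew$longitude[idx],
  dist         = d[cbind(seq_len(nrow(d)), idx)]
)
result
```

Note that the matrix has nrow(dfref) * nrow(dfnew) entries, so for large inputs this brute-force approach will likely become too slow or too memory-hungry, and some form of spatial index would be needed instead.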