2
votes

I have a data set that includes the latitude and longitude for each individual property address. As well, I have created two new columns (icelat, icelog) that include the latitude and longitude for one specific building in the state of Utah.

The data looks like this:

                               RowID PropertyAddressLatitude PropertyAddressLongitude  icelat    icelog
1: 000D655E-1AEA-E811-80C3-3863BB430E3F                38.65195                -109.4085 40.2351 -111.6384
2: 000F655E-1AEA-E811-80C3-3863BB430E3F                38.50952                -109.4763 40.2351 -111.6384
3: 0012CB31-D004-E911-80C7-3863BB43E813                      NA                       NA 40.2351 -111.6384
4: 0013655E-1AEA-E811-80C3-3863BB430E3F                38.54184                -109.5031 40.2351 -111.6384
5: 0014655E-1AEA-E811-80C3-3863BB430E3F                      NA                       NA 40.2351 -111.6384
6: 0015655E-1AEA-E811-80C3-3863BB430E3F                      NA                       NA 40.2351 -111.6384

I would like to create a new column, called 'distance' that is the distance, in miles, from each property's latitude and longitude to the specific building in Utah.

I have tried several different methods of using the Geosphere package but am unable to get it to run through all of the 'PropertyAddressLatitude' and 'PropertyAddressLongitude' observations and automatically do the math against 'icelat' and 'icelog.'

1

1 Answers

2
votes

The default units will be in meters, so I'll convert in-place.

meter2mile <- 0.000621371
dat[, distance := meter2mile * geosphere::distVincentyEllipsoid(
        cbind(PropertyAddressLongitude, PropertyAddressLatitude),
        cbind(icelog, icelat)) ]
dat
#                                   RowID PropertyAddressLatitude PropertyAddressLongitude  icelat    icelog distance
# 1: 000D655E-1AEA-E811-80C3-3863BB430E3F                38.65195                -109.4085 40.2351 -111.6384 161.7148
# 2: 000F655E-1AEA-E811-80C3-3863BB430E3F                38.50952                -109.4763 40.2351 -111.6384 166.0397
# 3: 0012CB31-D004-E911-80C7-3863BB43E813                      NA                       NA 40.2351 -111.6384       NA
# 4: 0013655E-1AEA-E811-80C3-3863BB430E3F                38.54184                -109.5031 40.2351 -111.6384 163.4240
# 5: 0014655E-1AEA-E811-80C3-3863BB430E3F                      NA                       NA 40.2351 -111.6384       NA
# 6: 0015655E-1AEA-E811-80C3-3863BB430E3F                      NA                       NA 40.2351 -111.6384       NA

Data

dat <- as.data.table(structure(list(RowID = c("000D655E-1AEA-E811-80C3-3863BB430E3F", "000F655E-1AEA-E811-80C3-3863BB430E3F", "0012CB31-D004-E911-80C7-3863BB43E813", "0013655E-1AEA-E811-80C3-3863BB430E3F", "0014655E-1AEA-E811-80C3-3863BB430E3F", "0015655E-1AEA-E811-80C3-3863BB430E3F"), PropertyAddressLatitude = c(38.65195, 38.50952, NA, 38.54184, NA, NA), PropertyAddressLongitude = c(-109.4085, -109.4763, NA, -109.5031, NA, NA), icelat = c(40.2351, 40.2351, 40.2351, 40.2351, 40.2351, 40.2351), icelog = c(-111.6384, -111.6384, -111.6384, -111.6384, -111.6384, -111.6384)), row.names = c(NA, -6L), class = c("data.table", "data.frame")))

(I inferred data.table from your sample data, please advise if this was incorrect.)