I have a large dataset (df) of 300,000 houses, with the longitude and latitude of each observation. Below (df1) are the first 10 observations of the data:
df1 <- read.table(sep=",", col.names=c("lat", "lon"), text="
53.543526,-8.047727
51.88029, -9.583830
52.06056, -9.488551
51.87087, -9.577604
51.89530, -8.454321
51.95688, -7.851760
53.37621, -6.392430
53.37719, -6.234660
51.88029, -9.583830
51.88145, -9.600894")
First, I tried to calculate the distance from every observation in my dataset (all 300,000) to a single data point, using the code below (see "Calculate distance between two long lat coordinates in a dataframe"):
library(geosphere)
centre = c(53.543526, -8.089727)
distHaversine(df, centre)
# and
distm(df, centre, fun = distHaversine)
But I kept getting the error:
Error in .pointsToMatrix(x) : latitude < -90
I have two questions:
How do I calculate the distance from each of my 300,000 observations in the data frame 'df' to the 'centre' data point?
Say I want to calculate the distance from each house to a list of schools (a smaller but still sizeable dataset, in the hundreds; for example, df2 below). How do I calculate the distance from each house to each school, and then keep only the minimum distance?
Example school dataset:
df2 <- read.table(sep=",", col.names=c("lat", "lon"), text="
53.38271, -6.437433
53.34874, -6.131537
53.34449, -6.266856
53.34424, -6.267444
53.34648, -6.261414
53.64333, -8.208663")
Thanks in advance!
distm, for example, uses long-lat format – Felipe Alvarenga
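Following the comment above, here is a minimal sketch of how both questions might be approached, assuming the geosphere package is installed and that the column order is the problem. geosphere reads the first column as longitude and the second as latitude, so the columns are reordered before calling `distHaversine()` and `distm()`. The names `houses`, `schools`, `dist_centre`, `dist_school` and `nearest_school` are made up for illustration. The error message also suggests that some value being read as a latitude is below -90, so it may be worth checking the full data for out-of-range values as well.

```r
library(geosphere)  # assumed installed; provides distHaversine() and distm()

# Houses from the question (columns are lat, lon)
df1 <- read.table(sep = ",", col.names = c("lat", "lon"), text = "
53.543526,-8.047727
51.88029,-9.583830
52.06056,-9.488551
51.87087,-9.577604
51.89530,-8.454321
51.95688,-7.851760
53.37621,-6.392430
53.37719,-6.234660
51.88029,-9.583830
51.88145,-9.600894")

# Schools from the question (columns are lat, lon)
df2 <- read.table(sep = ",", col.names = c("lat", "lon"), text = "
53.38271,-6.437433
53.34874,-6.131537
53.34449,-6.266856
53.34424,-6.267444
53.34648,-6.261414
53.64333,-8.208663")

# Reorder to longitude first, latitude second
houses  <- as.matrix(df1[, c("lon", "lat")])
schools <- as.matrix(df2[, c("lon", "lat")])
centre  <- c(-8.089727, 53.543526)  # lon, lat

# Question 1: distance in metres from every house to the centre
df1$dist_centre <- distHaversine(houses, centre)

# Question 2: house-by-school distance matrix, then the row-wise minimum
d <- distm(houses, schools, fun = distHaversine)
df1$dist_school    <- apply(d, 1, min)        # distance to the nearest school
df1$nearest_school <- apply(d, 1, which.min)  # row index of that school in df2
```

With 300,000 houses and a few hundred schools, the full matrix `d` holds tens of millions of values, so it may be necessary to process the houses in chunks if memory is tight.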