I was wondering what the most efficient method of calculating the distance in miles between two US zipcode columns would be using R.
I have heard of the geosphere package for computing the distance between zipcodes, but I do not fully understand it and was wondering if there are alternative methods as well.
For example, say I have a data frame that looks like this:
ZIP_START ZIP_END
95051 98053
94534 94128
60193 60666
94591 73344
94128 94128
94015 73344
94553 94128
10994 7105
95008 94128
I want to create a new data frame that looks like this:
ZIP_START ZIP_END MILES_DIFFERENCE
95051 98053 x
94534 94128 x
60193 60666 x
94591 73344 x
94128 94128 x
94015 73344 x
94553 94128 x
10994 7105 x
95008 94128 x
Here x is the distance in miles between the two zipcodes.
What is the best method of calculating this distance?
Here is the R code to create the example data frame.
df <- data.frame(
  "ZIP_START" = c(95051, 94534, 60193, 94591, 94128, 94015, 94553, 10994, 95008),
  "ZIP_END"   = c(98053, 94128, 60666, 73344, 94128, 73344, 94128, 7105, 94128)
)
Please let me know if you have any questions.
Any advice is appreciated.
Thank you for your help.
"I have heard of the geosphere package for computing the difference between zipcodes": what examples have you seen that do this, what have you tried, and what isn't working? Questions on SO that appear to simply ask someone to do your work don't get a lot of attention (and get down-voted). SO is for asking for programming help on a program you have written. – SymbolixAU
With the zipcode package (which has latitude & longitude for every zip code), you should try to understand the distHaversine method in geosphere. It isn't very complicated - here's a code example. – neilfws
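To make the haversine suggestion concrete, here is a minimal base-R sketch of the idea: look up a latitude/longitude for each ZIP code, then apply the haversine formula row by row. It is not the linked example. The helper names (haversine_miles, pad_zip), the zip_coords lookup table and its hand-entered, approximate coordinates are all assumptions for illustration; a real lookup table would come from a ZIP code dataset such as the one in the zipcode package.

```r
# Great-circle (haversine) distance in miles between points given in degrees.
# radius_miles is an assumed mean Earth radius; inputs may be vectors.
haversine_miles <- function(lat1, lon1, lat2, lon2, radius_miles = 3958.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin guards against rounding pushing sqrt(a) slightly above 1
  2 * radius_miles * asin(pmin(1, sqrt(a)))
}

# Zero-pad numeric ZIPs to five characters, so 7105 becomes "07105".
pad_zip <- function(z) sprintf("%05d", as.integer(z))

# Hypothetical lookup table with rough, hand-entered coordinates.
# Replace with a real ZIP-to-coordinate dataset before trusting results.
zip_coords <- data.frame(
  zip = c("95051", "98053", "94534", "94128"),
  lat = c(37.35, 47.66, 38.24, 37.62),
  lon = c(-121.98, -122.03, -122.13, -122.38),
  stringsAsFactors = FALSE
)

# A subset of the question's data frame (only ZIPs present in the lookup).
df <- data.frame(
  ZIP_START = c(95051, 94534, 94128),
  ZIP_END   = c(98053, 94128, 94128)
)

# match() returns the lookup row for each ZIP (NA if the ZIP is missing).
start <- zip_coords[match(pad_zip(df$ZIP_START), zip_coords$zip), ]
end   <- zip_coords[match(pad_zip(df$ZIP_END), zip_coords$zip), ]

# The function is vectorised, so the whole column is computed in one call.
df$MILES_DIFFERENCE <- haversine_miles(start$lat, start$lon, end$lat, end$lon)
print(df)
```

A ZIP missing from the lookup table yields NA in MILES_DIFFERENCE rather than an error. If you prefer geosphere, distHaversine expects points as (longitude, latitude) and returns metres by default, so the result would need converting to miles.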