Getting the centroids of Lat and Longitude in a data frame

Question

I have a data frame (df) with three columns, like so (all numbers are random):

ID  Lat    Lon
1   25.32 -63.32
1   25.29 -64.21
1   24.12 -62.43
2   12.42  54.64
2   12.11  53.43
.   ....   ....

Basically, I want the centroid per ID, like so:

ID  Lat    Lon    Cent_lat   Cent_lon
1   25.32 -63.32  25.31      -63.25
1   25.29 -64.21  25.31      -63.25
1   24.12 -62.43  25.31      -63.25
2   12.42  54.64  12.20       53.60
2   12.11  53.43  12.20       53.60

I tried the following:

library(geosphere)
library(rgeos)
library(dplyr)

df1 <- by(df,df$ID,centroid(df$Lat, df$Long))

But this gave me this error:

Error in (function (classes, fdef, mtable): unable to find an inherited method for function ‘centroid’ for signature ‘"numeric"’

I even tried:

df1 <- by(df,df$ID,centroid(as.numeric(df$Lat), as.numeric(df$Long)))

But this gave me this error:

Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘centroid’ for signature ‘"function"’

Isn't the centroid of three points the average of their components (mean(long), mean(lat))? — lmo
We have more than three points for most cases, and the average method would work if earth was flat :-) — Anubhav Dikshit
To use centroid you need a polygon as a matrix object, or a data frame with appropriate row names for each point — Robert

hrbrmstr · Accepted Answer · 2016-08-01T14:01:29

library(geosphere)
library(ggplot2)  # for map_data(), which needs the maps package installed
library(dplyr)

states <- map_data("state")

head(states)
##        long      lat group order  region subregion
## 1 -87.46201 30.38968     1     1 alabama      <NA>
## 2 -87.48493 30.37249     1     2 alabama      <NA>
## 3 -87.52503 30.37249     1     3 alabama      <NA>
## 4 -87.53076 30.33239     1     4 alabama      <NA>
## 5 -87.57087 30.32665     1     5 alabama      <NA>
## 6 -87.58806 30.32665     1     6 alabama      <NA>

# centroid() needs a two-column (longitude, latitude) matrix of polygon vertices
cntrd <- function(x) {
  data.frame(centroid(as.matrix(x[,c("long", "lat")])))
}

by(states, states$group, cntrd) %>% head()
## $`1`
##         lon      lat
## 1 -86.82976 32.82735
## 
## $`2`
##         lon      lat
## 1 -111.6698 34.34309
## 
## $`3`
##         lon      lat
## 1 -92.43826 34.92167
## 
## $`4`
##         lon      lat
## 1 -119.6713 37.40289
## 
## $`5`
##         lon      lat
## 1 -105.5526 39.02653
## 
## $`6`
##         lon      lat
## 1 -72.72553 41.62706

group_by(states, group) %>%
  do(cntrd(.))
## Source: local data frame [63 x 3]
## Groups: group [63]
## 
##    group        lon      lat
##    <dbl>      <dbl>    <dbl>
## 1      1  -86.82976 32.82735
## 2      2 -111.66978 34.34309
## 3      3  -92.43826 34.92167
## 4      4 -119.67130 37.40289
## 5      5 -105.55264 39.02653
## 6      6  -72.72553 41.62706
## 7      7  -75.51543 39.00879
## 8      8  -77.03411 38.91083
## 9      9  -82.51260 28.69498
## 10    10  -83.46361 32.67562
## # ... with 53 more rows
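
The same pattern can be adapted to the data frame in the question. The original attempts failed for two reasons: by() expects a function as its third argument, not the result of a call, and centroid() expects a two-column (longitude, latitude) matrix rather than separate numeric vectors. What follows is a minimal sketch, not a definitive implementation. The sample values, the helper name cent_by_id, and the output column names Cent_lat and Cent_lon are assumptions taken from the question's layout. It also assumes every ID has at least three non-collinear points, because centroid() treats the rows as polygon vertices.

```r
library(geosphere)
library(dplyr)

# Hypothetical sample data shaped like the question's df (values are made up).
# Each ID has three non-collinear points so that a polygon can be formed.
df <- data.frame(
  ID  = c(1, 1, 1, 2, 2, 2),
  Lat = c(25.32, 25.29, 24.12, 12.42, 12.11, 13.05),
  Lon = c(-63.32, -64.21, -62.43, 54.64, 53.43, 54.02)
)

# Hypothetical helper: compute one centroid for the rows of a single ID.
# centroid() returns a one-row matrix with longitude first, latitude second.
cent_by_id <- function(x) {
  cen <- centroid(as.matrix(x[, c("Lon", "Lat")]))
  data.frame(Cent_lat = cen[1, 2], Cent_lon = cen[1, 1])
}

# One centroid row per ID
centroids <- df %>%
  group_by(ID) %>%
  do(cent_by_id(.)) %>%
  ungroup()

# Attach each group's centroid to every row of that group
result <- left_join(df, centroids, by = "ID")
result
```

Joining the per-ID centroids back onto df gives the layout asked for in the question, with the centroid repeated on every row of its group. Note that centroid() uses the points in row order as polygon vertices, so the row order within an ID can affect the result when there are more than three points.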
