
I have x,y coordinates and the "group" (county) in which each point is located. For each county, I want to know the minimum, maximum, and mean pairwise distance between the points in that county. I then want to tag each point with its county's min, max, and mean distance. Getting the min, max, and mean distance over all observations is easy, but I can't figure out how to get them by county. Here is what I'm using as a test for min:

county <- as.integer(c(1, 1, 1, 2, 2, 2))
x <- c(1.0, 2.0, 5.0, 10., 20., 50.)
y <- c(1.0, 3.0, 4.0, 10., 30., 40.)
xy <- data.frame(county,x,y)
xy$mindist <- min(dist(cbind(xy$x, xy$y)))

The min, max, and mean for County 1 are 2.2, 5, and 3.5. The min, max, and mean for County 2 are 22.4, 50, and 34.7. The code above tags every point with the global minimum (2.2) rather than tagging all County 1 points with 2.2 and all County 2 points with 22.4. I've tried modifying it by grouping, and using by statements, and aggregate....

Any thoughts?


1 Answer


You can do grouped calculations easily with the dplyr package. One way is the following:

library(dplyr)

# One row per county: min, mean, and max of the pairwise distances
xy %>% group_by(county) %>%
       summarise(mind  = min(dist(cbind(x, y))),
                 meand = mean(dist(cbind(x, y))),
                 maxd  = max(dist(cbind(x, y))))

which yields

# A tibble: 2 x 4
  county      mind     meand  maxd
   <int>     <dbl>     <dbl> <dbl>
1      1  2.236068  3.466115     5
2      2 22.360680 34.661152    50
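
The question also asks for each point to carry its county's values. A minimal sketch of that, assuming the same xy data as in the question: using mutate() in place of summarise() keeps all six rows and repeats each county's value on every row of that county. The names tagged, mindist, meandist, and maxdist are my own choices, not anything from the question.

```r
library(dplyr)

# Test data from the question
xy <- data.frame(
  county = as.integer(c(1, 1, 1, 2, 2, 2)),
  x = c(1, 2, 5, 10, 20, 50),
  y = c(1, 3, 4, 10, 30, 40)
)

# mutate() keeps one row per point; each single per-group value
# is recycled across all rows of its county.
tagged <- xy %>%
  group_by(county) %>%
  mutate(mindist  = min(dist(cbind(x, y))),
         meandist = mean(dist(cbind(x, y))),
         maxdist  = max(dist(cbind(x, y)))) %>%
  ungroup()

print(tagged)
```

Note that a county with only one point has no pairwise distances, so min() and max() would return Inf and -Inf with a warning; you may want to handle that case separately.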

You could also gather the data together first to reduce the number of cbind calls.
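
One possible reading of that suggestion, as a sketch only: compute the distance object once per county inside a small helper, then join the per-county results back onto the points. The helper name dist_stats and the result names stats and tagged are hypothetical; group_modify() requires a reasonably recent dplyr (0.8.1 or later).

```r
library(dplyr)

# Test data from the question
xy <- data.frame(
  county = as.integer(c(1, 1, 1, 2, 2, 2)),
  x = c(1, 2, 5, 10, 20, 50),
  y = c(1, 3, 4, 10, 30, 40)
)

# Hypothetical helper: build the distance object once,
# then take all three statistics from it.
dist_stats <- function(x, y) {
  d <- dist(cbind(x, y))
  data.frame(mind = min(d), meand = mean(d), maxd = max(d))
}

# One row of statistics per county
stats <- xy %>%
  group_by(county) %>%
  group_modify(~ dist_stats(.x$x, .x$y)) %>%
  ungroup()

# Tag every point with its county's statistics
tagged <- left_join(xy, stats, by = "county")

print(tagged)
```

For data this small the saving is negligible, but with many points per county the distance computation dominates, so doing it once per group instead of three times is worthwhile.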