How to calculate mean, min, and max across columns when grouping using dplyr?

Question

So I have a data frame, simplified as this:

ID  A  B  C
1   1  5  0
2   3  0  3
3   0  2  1
2   5  9  1
3   3  5  3
1   2  6  4

Simply put, I want to calculate the following for each row:

Mean
Median
Max
Min

Easy enough, but the hard part for me comes next: once I have these values for each row, how do I combine them into a single value per ID?

In other words, how do I show the average mean, median, max, and min for each ID?

Expected output:

(1)

ID  Mean Median Min Max
1      2      1   0   5
2      2      3   0   3
3      1      1   0   2
2      5      5   1   9
3   3.66      3   3   5
1      4      4   2   6

(2)

ID  AvgMean AvgMedian AvgMin AvgMax
1         3       2.5      1    5.5  
2       3.5         4      1      6 
3      2.33         3      3    3.5

The way you describe this problem, it sounds like you are trying to take the mean after taking the mean for each row... which is sort of a weird statistic? — hachiko

s__ · Accepted Answer · 2020-11-10T19:48:14

You can try something like this, which summarises all values in columns A to C within each ID:

    library(dplyr)

    # Per ID: summarise every value found in columns A to C
    df %>%
      group_by(ID) %>%
      summarise(mean_ = mean(c_across(A:C), na.rm = TRUE),
                medi_ = median(c_across(A:C), na.rm = TRUE),
                max_  = max(c_across(A:C), na.rm = TRUE),
                min_  = min(c_across(A:C), na.rm = TRUE))
    
    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 3 x 5
         ID mean_ medi_  max_  min_
      <int> <dbl> <dbl> <int> <int>
    1     1  3      3       6     0
    2     2  3.5    3       9     0
    3     3  2.33   2.5     5     0
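
Note that the grouped call pools every value of A, B and C belonging to an ID before summarising, so the result is not an average of per-row statistics. A quick base R check for ID 1, as a sketch only (the name `vals_id1` is made up for illustration, and the data is the `df` given in this answer, rewritten with `data.frame()`):

```r
# Sample data from this answer
df <- data.frame(
  ID = c(1L, 2L, 3L, 2L, 3L, 1L),
  A  = c(1L, 3L, 0L, 5L, 3L, 2L),
  B  = c(5L, 0L, 2L, 9L, 5L, 6L),
  C  = c(0L, 3L, 1L, 1L, 3L, 4L)
)

# All A, B, C values from the rows with ID == 1, flattened into one vector
vals_id1 <- unlist(df[df$ID == 1, c("A", "B", "C")], use.names = FALSE)

mean(vals_id1)    # pooled mean over six values
median(vals_id1)  # pooled median over six values
range(vals_id1)   # pooled min and max
```

For ID 1 the pooled median is 3, while the two row medians are 1 and 4, whose average is 2.5. The two approaches can therefore disagree.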

For the row-wise part (one result per row, as in table 1 of the question):

    # Per row: summarise the values of A, B and C in that row
    df %>%
      rowwise() %>%
      summarise(mean_ = mean(c_across(A:C), na.rm = TRUE),
                medi_ = median(c_across(A:C), na.rm = TRUE),
                max_  = max(c_across(A:C), na.rm = TRUE),
                min_  = min(c_across(A:C), na.rm = TRUE))

    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 6 x 4
      mean_ medi_  max_  min_
      <dbl> <int> <int> <int>
    1  2        1     5     0
    2  2        3     3     0
    3  1        1     2     0
    4  5        5     9     1
    5  3.67     3     5     3
    6  4        4     6     2

With data:

    df <- structure(list(ID = c(1L, 2L, 3L, 2L, 3L, 1L),
                         A  = c(1L, 3L, 0L, 5L, 3L, 2L),
                         B  = c(5L, 0L, 2L, 9L, 5L, 6L),
                         C  = c(0L, 3L, 1L, 1L, 3L, 4L)),
                    class = "data.frame", row.names = c(NA, -6L))
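
To get table (2) from the question, the two steps can be chained: compute the statistics for each row, then average them within each ID. This is a minimal sketch, assuming dplyr 1.0.0 or later (for `c_across()`). The column names `Mean`, `Median`, `Min`, `Max`, the `Avg` prefix, and the intermediate objects `row_stats` and `avg_stats` are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same data as above, written out with data.frame()
df <- data.frame(
  ID = c(1L, 2L, 3L, 2L, 3L, 1L),
  A  = c(1L, 3L, 0L, 5L, 3L, 2L),
  B  = c(5L, 0L, 2L, 9L, 5L, 6L),
  C  = c(0L, 3L, 1L, 1L, 3L, 4L)
)

# Step 1: statistics for each row (table 1 in the question), keeping ID
row_stats <- df %>%
  rowwise() %>%
  mutate(Mean   = mean(c_across(A:C), na.rm = TRUE),
         Median = median(c_across(A:C), na.rm = TRUE),
         Min    = min(c_across(A:C), na.rm = TRUE),
         Max    = max(c_across(A:C), na.rm = TRUE)) %>%
  ungroup() %>%
  select(ID, Mean, Median, Min, Max)

# Step 2: average each row statistic within ID (table 2 in the question)
avg_stats <- row_stats %>%
  group_by(ID) %>%
  summarise(AvgMean   = mean(Mean),
            AvgMedian = mean(Median),
            AvgMin    = mean(Min),
            AvgMax    = mean(Max))

avg_stats
```

For the sample data this should give 3, 2.5, 1, 5.5 for ID 1; 3.5, 4, 0.5, 6 for ID 2; and 2.33, 2, 1.5, 3.5 for ID 3. That agrees with the question's hand-written table (2) for ID 1, but not with every entry for IDs 2 and 3.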
