I have a toy data frame like the following:

      Date Type Units
2016-10-11    A    11
2016-10-12    B    14
2016-10-12    C    10
2016-10-13    A     6
2016-10-13    B     4
2016-10-13    D     9
2016-10-14    E     7
2016-10-14    A    12
2016-10-14    C    12
2016-10-15    A    13
2016-10-15    F    12
2016-10-15    C     3
2016-10-15    D     4

df <- structure(list(Date = c("2016-10-11", "2016-10-12", "2016-10-12", 
"2016-10-13", "2016-10-13", "2016-10-13", "2016-10-14", "2016-10-14", 
"2016-10-14", "2016-10-15", "2016-10-15", "2016-10-15", "2016-10-15"
), Type = c("A", "B", "C", "A", "B", "D", "E", "A", "C", "A", 
"F", "C", "D"), Units = c(11L, 14L, 10L, 6L, 4L, 9L, 7L, 12L, 
12L, 13L, 12L, 3L, 4L)), class = "data.frame", row.names = c(NA, 
-13L))

I would like to add a column indicating the number of types within each Date and, at the same time, sum the Units column grouped by Date. The output dataset would look something like the following:

Date       Units n_types
<chr>      <int>   <dbl>
2016-10-11    11       1
2016-10-12    24       2
2016-10-13    19       3
2016-10-14    31       3
2016-10-15    32       4

However, I only managed to do it by using two mutate calls, as in the code below:

df %>%
  group_by(Date) %>%
  mutate(n_types = n()) %>%
  summarise_if(is.numeric, sum) %>%
  mutate(n_types = sqrt(n_types)) %>%
  ungroup()

Note: I used summarise_if because my original dataset has many more numeric variables than just Units, so I need this function. Is there another way to add the n_types column without using two mutate calls? Or is mine a good way to do it?

The sample data returns the expected output, or am I missing something? – NelsonGon
I think you have an additional requirement in different columns. If you already know the columns to sum and sqrt, then df %>% group_by(Date) %>% mutate(n_types = n()) %>% summarise(Units = sum(Units), n_types = sqrt(sum(n_types))) – akrun
I know that my code works; I was asking whether there is another way to do it without using two mutate functions. – Ric S

1 Answer

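We can also put n_types in the group_by, as a second grouping variable, and then call summarise_if. Because grouping variables are not summarised, n_types is carried through unchanged, so neither the second mutate nor the sqrt is needed: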

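library(dplyr)

df %>% 
   group_by(Date) %>% 
   # add the group size as a second grouping variable
   # (in dplyr >= 1.0.0 this argument is spelled .add = TRUE)
   group_by(n_types = n(), add = TRUE) %>% 
   summarise_if(is.numeric, sum)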
# A tibble: 5 x 3
# Groups:   Date [?]
#  Date       n_types Units
#  <chr>        <int> <int>
#1 2016-10-11       1    11
#2 2016-10-12       2    24
#3 2016-10-13       3    19
#4 2016-10-14       3    31
#5 2016-10-15       4    32
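
For newer dplyr versions, the same idea can probably be written in a single summarise call. This is only a sketch, assuming dplyr 1.0.0 or later (where across() and the .groups argument are available); it is not part of the original answer. n_types is computed after across() so that where(is.numeric) does not pick it up and sum it as well. The toy data from the question is rebuilt here so the block runs on its own.

```r
library(dplyr)

# Toy data from the question
df <- data.frame(
  Date = c("2016-10-11", "2016-10-12", "2016-10-12", "2016-10-13",
           "2016-10-13", "2016-10-13", "2016-10-14", "2016-10-14",
           "2016-10-14", "2016-10-15", "2016-10-15", "2016-10-15",
           "2016-10-15"),
  Type = c("A", "B", "C", "A", "B", "D", "E", "A", "C", "A", "F", "C", "D"),
  Units = c(11L, 14L, 10L, 6L, 4L, 9L, 7L, 12L, 12L, 13L, 12L, 3L, 4L),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(Date) %>%
  summarise(
    across(where(is.numeric), sum),  # sum every numeric column within each Date
    n_types = n(),                   # rows per Date, added after across()
    .groups = "drop"                 # return an ungrouped tibble
  )

result
```

Note that n() counts rows per Date, as in the question's code. If a Type could appear more than once on the same Date, n_distinct(Type) would count the distinct types instead.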