I have a toy dataframe like the following:
Date Type Units
2016-10-11 A 11
2016-10-12 B 14
2016-10-12 C 10
2016-10-13 A 6
2016-10-13 B 4
2016-10-13 D 9
2016-10-14 E 7
2016-10-14 A 12
2016-10-14 C 12
2016-10-15 A 13
2016-10-15 F 12
2016-10-15 C 3
2016-10-15 D 4
df <- structure(list(Date = c("2016-10-11", "2016-10-12", "2016-10-12",
"2016-10-13", "2016-10-13", "2016-10-13", "2016-10-14", "2016-10-14",
"2016-10-14", "2016-10-15", "2016-10-15", "2016-10-15", "2016-10-15"
), Type = c("A", "B", "C", "A", "B", "D", "E", "A", "C", "A",
"F", "C", "D"), Units = c(11L, 14L, 10L, 6L, 4L, 9L, 7L, 12L,
12L, 13L, 12L, 3L, 4L)), class = "data.frame", row.names = c(NA,
-13L))
I would like to add a column that indicates the number of types within each Date, AND sum the Units column, grouping by Date.
The output dataset would be something like the following:
Date Units n_types
<chr> <int> <dbl>
2016-10-11 11 1
2016-10-12 24 2
2016-10-13 19 3
2016-10-14 31 3
2016-10-15 32 4
However, I only managed to do it by using two mutate functions, as in the code below:
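library(dplyr)

df %>%
  group_by(Date) %>%
  mutate(n_types = n()) %>%            # rows per Date, repeated on every row
  summarise_if(is.numeric, sum) %>%    # sums Units, but also n_types (giving n^2)
  mutate(n_types = sqrt(n_types)) %>%  # undo the squaring from the previous step
  ungroup()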
Note: I used summarise_if because my original dataset has many more numeric variables than just Units, so I must use this function. Is there another way to add the n_types column without using two mutate functions? Or is mine a good way to do it?
df %>% group_by(Date) %>% mutate(n_types = n()) %>% summarise(Units = sum(Units), n_types = sqrt(sum(n_types)))
– akrun
[...] mutate functions. – Ric S
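One possible way to avoid both mutate calls, sketched under the assumption that dplyr 1.0.0 or later is available (for across() and where()): sum every numeric column and count the rows per Date inside a single summarise(). The name res is only illustrative. Placing across() first means it only sees the original numeric columns, so n_types is not summed along with them. Note that n() counts rows per Date; if a Type could repeat within a Date, n_distinct(Type) might be closer to "number of types".

```r
library(dplyr)

# Toy data from the question
df <- structure(list(Date = c("2016-10-11", "2016-10-12", "2016-10-12",
"2016-10-13", "2016-10-13", "2016-10-13", "2016-10-14", "2016-10-14",
"2016-10-14", "2016-10-15", "2016-10-15", "2016-10-15", "2016-10-15"
), Type = c("A", "B", "C", "A", "B", "D", "E", "A", "C", "A",
"F", "C", "D"), Units = c(11L, 14L, 10L, 6L, 4L, 9L, 7L, 12L,
12L, 13L, 12L, 3L, 4L)), class = "data.frame", row.names = c(NA,
-13L))

res <- df %>%
  group_by(Date) %>%
  summarise(
    across(where(is.numeric), sum),  # sum all numeric columns (here only Units)
    n_types = n()                    # number of rows per Date
  )

res
```

Here n_types comes out as an integer rather than a double, since no sqrt() is involved.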