I have the following tibble:
df <- structure(list(treatment = c("control", "control", "control",
"control", "control", "control", "treated", "treated", "treated",
"treated", "treated", "treated"), `0610005C13Rik` = c(5L, 2L,
2L, 5L, 1L, 0L, 6L, 1L, 0L, 5L, 1L, 2L), `0610007P14Rik` = c(300L,
249L, 166L, 104L, 248L, 136L, 164L, 121L, 191L, 187L, 289L, 169L
), `0610009B22Rik` = c(251L, 158L, 92L, 82L, 239L, 107L, 147L,
96L, 153L, 200L, 211L, 80L), `0610009L18Rik` = c(42L, 17L, 16L,
17L, 10L, 6L, 18L, 1L, 15L, 8L, 19L, 13L), `0610009O20Rik` = c(187L,
77L, 86L, 37L, 81L, 24L, 83L, 57L, 98L, 83L, 113L, 48L), `0610010B08Rik` = c(16L,
3L, 6L, 3L, 2L, 3L, 3L, 2L, 3L, 2L, 3L, 1L)), .Names = c("treatment",
"0610005C13Rik", "0610007P14Rik", "0610009B22Rik", "0610009L18Rik",
"0610009O20Rik", "0610010B08Rik"), row.names = c(NA, -12L), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), vars = "treatment", drop = TRUE, indices = list(
0:5, 6:11), group_sizes = c(6L, 6L), biggest_group_size = 6L, labels = structure(list(
treatment = c("control", "treated")), row.names = c(NA, -2L
), class = "data.frame", vars = "treatment", drop = TRUE, .Names = "treatment"))
That looks like this:
Source: local data frame [12 x 7]
Groups: treatment [2]
   treatment `0610005C13Rik` `0610007P14Rik` `0610009B22Rik` `0610009L18Rik` `0610009O20Rik` `0610010B08Rik`
       <chr>           <int>           <int>           <int>           <int>           <int>           <int>
 1   control               5             300             251              42             187              16
 2   control               2             249             158              17              77               3
 3   control               2             166              92              16              86               6
 4   control               5             104              82              17              37               3
 5   control               1             248             239              10              81               2
 6   control               0             136             107               6              24               3
 7   treated               6             164             147              18              83               3
 8   treated               1             121              96               1              57               2
 9   treated               0             191             153              15              98               3
10   treated               5             187             200               8              83               2
11   treated               1             289             211              19             113               3
12   treated               2             169              80              13              48               1
What I want to do is calculate the mean and the coefficient of variation (CV) of each gene, grouped by treatment. The CV is sd / mean. The final expected result looks like this:
gene_symbol control.mean treated.mean control.cv treated.cv
0610005C13Rik 2.5000 2.500000 0.829457 ...
0610007P14Rik 200.5000 186.833333 ... ...
... etc ...
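To make the CV definition concrete, here is a minimal base R sketch (not the dplyr solution I am after) that checks the control values of the first row by hand. The variable names are only for illustration, and it assumes the sample standard deviation as returned by `sd()`:

```r
# Control-group counts for 0610005C13Rik, copied from the tibble above
control <- c(5L, 2L, 2L, 5L, 1L, 0L)

# Arithmetic mean of the six control samples
control_mean <- mean(control)

# Coefficient of variation: sample standard deviation divided by the mean
control_cv <- sd(control) / control_mean

print(control_mean)
print(control_cv)
```

This gives a mean of 2.5 and a CV of roughly 0.829457, which matches the control.mean and control.cv entries in the first row of the expected result.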
How can I do that using dplyr?