I have data on the levels of biological compounds in test patients, who are grouped according to which drug they were administered. That is, we have:
- Columns: drugs (or groups) A, B and C, where each group has 3 patients (the patients in A are denoted A1, A2, A3; the patients in B are denoted B1, B2, B3; and so on).
- Rows: we are monitoring biological compounds `Coronin`, `Dystrophin`, `Tubulin` (randomly Googled protein names), and so on.
So we have a `tibble` like (all values in the `tibble` are floats):
| compound | A1 | A2 | A3 | B1 ... C3|
|-----------|----|----|----|---- ... --|
| Coronin |
| Dystrophin|
| Gloverin |
| keratin |
| Tubulin |
For each compound, I wish to compute the means of each group, as a new column, like so:
| compound | A1 | A2 | A3 | B1 ...C3| mean_A | mean_B | mean_C |
|-----------|-----|-----|-----|---- ... --|---------|---------|---------|
| Coronin | 1 | 2 | 3 | ... | 2 | ... |
| Dystrophin| 4 | 5 | 6 | ... | 5 | ... |
| Gloverin | ...
| keratin |
| Tubulin |
The code to do this is:
library(dplyr)  # provides %>%, mutate() and select()
my_tibble <- my_tibble %>%
  mutate(mean_A = rowMeans(select(., c("A1", "A2", "A3")))) %>%
  mutate(mean_B = rowMeans(select(., c("B1", "B2", "B3")))) %>%
  mutate(mean_C = rowMeans(select(., c("C1", "C2", "C3"))))
The question is: I'd like to be able to do this for a dynamically supplied number of groups, i.e. C, D, E, etc., where the column-to-group mapping is itself a separate, user-supplied tibble, say:
| group_name | name1 | name2 | name3 |
|------------|-------|-------|-------|
| A | A1 | A2 | A3 |
| B | B1 | B2 | B3 |
...
and so on
How might I iteratively add `mutate` verbs, according to a user-specified number of groups (and the associated sample-to-group names)?
Note: the group names "C", "B", etc. are arbitrary (a group is likely to be named after the drug it was given), so I wouldn't use an iterative operation that relies on the groups literally being named "A", "B", etc.
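For illustration, here is a minimal sketch of one way the mapping tibble could drive the computation. The toy values, the name `group_map`, and the `result` variable are assumptions made for the example, and a plain loop with column assignment stands in for chained `mutate` calls; it is not the only way to do this.

```r
library(dplyr)

# Toy data standing in for my_tibble (values are made up for illustration).
my_tibble <- tibble(
  compound = c("Coronin", "Dystrophin"),
  A1 = c(1, 4), A2 = c(2, 5),  A3 = c(3, 6),
  B1 = c(2, 8), B2 = c(4, 10), B3 = c(6, 12)
)

# Hypothetical user-supplied mapping: one row per group,
# with that group's sample column names alongside.
group_map <- tibble(
  group_name = c("A", "B"),
  name1 = c("A1", "B1"),
  name2 = c("A2", "B2"),
  name3 = c("A3", "B3")
)

# Add one mean_<group> column per row of the mapping.
result <- my_tibble
for (i in seq_len(nrow(group_map))) {
  # Sample column names for this group, as a plain character vector.
  cols <- unlist(group_map[i, -1], use.names = FALSE)
  new_col <- paste0("mean_", group_map$group_name[i])
  result[[new_col]] <- rowMeans(select(result, all_of(cols)))
}

result
```

Because the group label is only ever used to build the new column name, nothing here depends on the groups being called "A", "B", and so on; any drug name in `group_name` would work the same way.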
`dat %>% group_by(compound, grp) %>% summarise(meanval=mean(value))`. Otherwise you'll end up with gibberish code trying to subset groups of columns when you essentially have one value column split by multiple grouping columns. - thelatemail
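The long-format idea in this comment could look roughly like the sketch below. The names `dat`, `grp`, `value` and `meanval` come from the comment; the toy values, the `sample_groups` lookup and the use of `tidyr::pivot_longer()` to reshape are assumptions added for the example.

```r
library(dplyr)
library(tidyr)

# Toy wide data (values are made up for illustration).
my_tibble <- tibble(
  compound = c("Coronin", "Dystrophin"),
  A1 = c(1, 4), A2 = c(2, 5),  A3 = c(3, 6),
  B1 = c(2, 8), B2 = c(4, 10), B3 = c(6, 12)
)

# Hypothetical sample-to-group lookup, one row per sample column.
sample_groups <- tibble(
  sample = c("A1", "A2", "A3", "B1", "B2", "B3"),
  grp    = c("A",  "A",  "A",  "B",  "B",  "B")
)

# Reshape to one value column, then attach each sample's group.
dat <- my_tibble %>%
  pivot_longer(-compound, names_to = "sample", values_to = "value") %>%
  left_join(sample_groups, by = "sample")

# One mean per compound and group, regardless of how many groups there are.
group_means <- dat %>%
  group_by(compound, grp) %>%
  summarise(meanval = mean(value), .groups = "drop")

group_means
```

With the data in this shape, adding groups D, E, etc. only means adding rows to the lookup; the summarising code itself does not change.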