My data, dat, looks like this:
set.seed(123)
dat <- data.frame(
  comp = rep(1:4, 2),
  grp = rep(c('A', 'B'), each = 4),
  pval = runif(8, min = 0, max = 0.1))
dat$pval[sample(nrow(dat), 1)] <- NA
The pval column contains p-values from multiple t-tests within each large group.
Now I need to apply the base R function p.adjust to adjust the p-values within each group (A, B, ...).
What I did was:
library(dplyr)  # provides %>%, group_by() and mutate()
dat %>%
  group_by(grp) %>%
  mutate(pval.adj = p.adjust(pval, method = 'BH'))
Below is the output of the above code:
comp  grp  pval        pval.adj
1     A    0.02875775  0.08179538
2     A    0.07883051  0.08830174
3     A    0.04089769  0.08179538
4     A    0.08830174  0.08830174
1     B    NA          NA
2     B    0.00455565  0.01366695
3     B    0.05281055  0.07921582
4     B    0.08924190  0.08924190
The result does not make sense to me. For the last entry of each group, pval and pval.adj are equal, and some pval.adj values are much closer to pval than others. I think something goes wrong when p.adjust is applied after group_by. I spent hours on this but could not figure out why. I would appreciate any help.
Below is the usage of the p.adjust function:
p.adjust(p, method = p.adjust.methods, n = length(p))
p.adjust.methods
# c("holm", "hochberg", "hommel", "bonferroni", "BH", "BY",
# "fdr", "none")
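For reference, here is a rough sketch of what the "BH" method computes, written out by hand. The helper name bh_manual is hypothetical (it is not part of base R), and the sketch assumes that NA entries are left out of the count of tests, which is what the group B output above suggests (0.00455565 * 3 = 0.01366695). Each sorted p-value is multiplied by n / rank and a running minimum is then taken from the largest p-value downwards, so the largest p-value in a group is multiplied by n / n = 1 and comes back unchanged, and neighbouring values can end up tied.

```r
# Hand-written Benjamini-Hochberg adjustment (illustrative sketch only).
# bh_manual is a hypothetical helper name, not a base R function.
bh_manual <- function(p) {
  ok  <- !is.na(p)                      # NA entries are not counted as tests
  x   <- p[ok]
  n   <- length(x)
  o   <- order(x, decreasing = TRUE)    # largest p-value first
  ro  <- order(o)                       # permutation that restores input order
  # scale by n / rank, then enforce monotonicity with a running minimum
  adj <- pmin(1, cummin(n / rev(seq_len(n)) * x[o]))[ro]
  out <- rep(NA_real_, length(p))
  out[ok] <- adj
  out
}

# p-values of group A and group B as printed in the output above
grp_a <- c(0.02875775, 0.07883051, 0.04089769, 0.08830174)
grp_b <- c(NA, 0.00455565, 0.05281055, 0.08924190)

bh_manual(grp_a)
p.adjust(grp_a, method = "BH")
bh_manual(grp_b)
p.adjust(grp_b, method = "BH")
```

If the sketch is right, both calls for each group should print the same values as the pval.adj column, with the last entry of each group equal to its raw p-value.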
[...] pval values you have, it would be easier to determine what the results should be. – Phil
[...] set.seed. I get the same results whether I run p.adjust in dplyr with a group by or if I pull a subset for one group out and run it by itself. It would also be good if you can explain why you "think something is wrong". It surprises me that I get the same value for two of the adjusted values in Group A (as do you), but this doesn't seem to have anything to do with dplyr or group_by. p.adjust(c(0.0983715529320762, 0.0183719095773995, 0.0271179967094213, 0.048782999929972), method = "BH") gives the same result. – Gregor Thomas
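The check described in the comment above can be sketched roughly as follows. To keep it free of package dependencies, this version swaps in base R's ave() for group_by + mutate; that substitution is an assumption of the sketch, not what the question used. It also relies on the rows of dat being ordered by grp, so that the per-group results line up with the rows when concatenated.

```r
set.seed(123)
dat <- data.frame(
  comp = rep(1:4, 2),
  grp  = rep(c("A", "B"), each = 4),
  pval = runif(8, min = 0, max = 0.1)
)
dat$pval[sample(nrow(dat), 1)] <- NA

# Grouped adjustment with base R's ave() (stand-in for group_by + mutate)
dat$pval.adj <- ave(dat$pval, dat$grp,
                    FUN = function(p) p.adjust(p, method = "BH"))

# The same adjustment run on each group's subset separately.
# Concatenating works here only because dat is already sorted by grp.
by_hand <- unlist(lapply(split(dat$pval, dat$grp), p.adjust, method = "BH"),
                  use.names = FALSE)

all.equal(dat$pval.adj, by_hand)
```

If the grouped and per-subset results agree, the pattern in pval.adj comes from the BH method itself rather than from the grouping step.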