1
votes

I would like to add a diamond at the mean of each subgroup. Sample data can be created with this function:

sim <- function(ngroups, propsub, mu, sd){
  # Draw one normal sample per treatment group
  samples <- list()
  nnames <- names(ngroups)
  for (i in seq_along(ngroups)) { samples[[i]] <- rnorm(ngroups[i], mean = mu[i], sd = sd) }
  # Assign each observation to a subgroup at random
  sub <- sample(size=sum(ngroups), x=names(propsub), prob=propsub, replace=TRUE)
  dat <- data.frame(trt=factor(rep(nnames,times=ngroups), levels=nnames), y=unlist(samples), sub)
  return(dat)
}

set.seed(123)
dtest <- sim(ngroups=c("kon"=50,"trt"=50), propsub=c("a"=0.5,"b"=0.5), 
                mu=c(0,0), sd=1)

See the head of the sample data here:

> head(dtest)
  trt           y sub
1 kon -0.56047565   b
2 kon -0.23017749   a
3 kon  1.55870831   a
4 kon  0.07050839   a
5 kon  0.12928774   b
6 kon  1.71506499   a

The closest I have gotten so far is this plot:

library(ggplot2)
ggplot(data=dtest, aes(y=y, x=trt)) +
  geom_boxplot(aes(fill=sub)) +
  stat_summary(fun.y=mean, aes(group = sub), geom="point", shape=5, size=4)

So there are some diamonds, which seem to show the right means but at the wrong positions. How can I change the position, or tell R to group not just by dtest$trt but also by dtest$sub?

1
I played around with the position argument as well: ggplot(data=dtest, aes(y=y, x=trt, fill=sub)) + geom_boxplot() + stat_summary(fun.y=mean, geom="point", shape=5, size=4, position = position_dodge(width=0.75)) But the width has to be stated exactly, and it produces an error saying "ymax not defined: adjusting position using y instead". How can I solve this problem? - Charlotte

1 Answer

0
votes

It looks like stat_summary plots the diamond in the middle of each trt category, whereas geom_boxplot splits the data in two using the fill value and places the boxes side by side.

Here is an alternative solution:

ggplot(data=dtest, aes(y=y, x=paste(trt,sub))) +
  geom_boxplot(aes(fill=sub)) +
  xlab("trt") +
  stat_summary(fun.y=mean, geom="point", shape=5, size=4)
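
If you would rather keep trt alone on the x axis, the position_dodge idea from the comment can also work. Below is a minimal sketch, not a definitive fix. It assumes ggplot2 3.3.0 or later, where fun.y was renamed to fun, and it uses a stand-in data frame called dat (a hypothetical name) with the same columns as dtest. Putting fill=sub in the top-level aes() makes both layers group by trt and sub, and passing the same dodge object to both layers means the width is stated only once.

```r
library(ggplot2)

# Stand-in data with the same columns as dtest (trt, y, sub).
# The values are illustrative only, not the original sample.
set.seed(123)
dat <- data.frame(
  trt = factor(rep(c("kon", "trt"), each = 50), levels = c("kon", "trt")),
  y   = rnorm(100),
  sub = sample(c("a", "b"), size = 100, replace = TRUE)
)

# One dodge specification shared by both layers, so boxes and
# diamonds are shifted by the same amount.
dodge <- position_dodge(width = 0.75)

p <- ggplot(dat, aes(x = trt, y = y, fill = sub)) +
  geom_boxplot(position = dodge) +
  # fun = mean computes one mean per trt/sub group
  stat_summary(fun = mean, geom = "point", shape = 5, size = 4,
               position = dodge)
print(p)
```

With the real data, replace dat with dtest. The width of 0.75 is only a choice here; any value works as long as both layers share it.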