I want to compare scores across groups with boxplots in ggplot2, but instead of a single categorical variable, I have several vectors of 1s and 0s that indicate whether each data point belongs to a given category. The groups overlap, i.e., some data points belong to multiple groups simultaneously.
I can get a boxplot of the scores in one group, but I can't add another group's scores to the same plot. With as.factor() applied to a dummy variable, I get side-by-side boxplots of the scores for those in that group vs. those not in it. I've seen posts about faceting that look like they might help, but none of the examples I've found (Multiple boxplots placed side by side for different column values in ggplot, How do I make a boxplot with two categorical variables in R?) are quite like what I'm trying to do.
score <- c(1, 8, 3, 5, 10, 7, 4, 3, 8, 1)
group1 <- c(0, 0, 1, 0, 1, 1, 0, 1, 0, 1)
group2 <- c(1, 1, 0, 1, 0, 1, 1, 1, 0, 0)
group3 <- c(0, 1, 0, 0, 0, 0, 0, 0, 1, 1)
df <- data.frame(score, group1, group2, group3)
library(ggplot2)
ggplot(df, aes(x = as.factor(group1), y = score, fill = as.factor(group1))) +
  geom_boxplot() # one box for scores outside group1 (0), one for scores inside it (1)
ggplot(subset(df, group1 == 1), aes(x = "group1", y = score)) +
  geom_boxplot() # a single box for only the scores where group1 == 1
I want to end up with either a) one set of boxplots per group like what I get from the first plot call above, OR b) one boxplot per group like what I get from the second. The former includes a box for the scores outside the group; the latter does not. It would also be nice to have a box for all scores overall, but I'm really not sure what's feasible.
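For what it's worth, here is a rough sketch of the direction I think this might go, not a confirmed solution: stack the indicator columns into long format so that a score appears once per group, then either keep only the member rows (option b, with every score added once more as an "overall" box) or facet by group (option a). The names `long`, `member`, and `members` are hypothetical and only exist for this sketch.

```r
library(ggplot2)

score  <- c(1, 8, 3, 5, 10, 7, 4, 3, 8, 1)
group1 <- c(0, 0, 1, 0, 1, 1, 0, 1, 0, 1)
group2 <- c(1, 1, 0, 1, 0, 1, 1, 1, 0, 0)
group3 <- c(0, 1, 0, 0, 0, 0, 0, 0, 1, 1)
df <- data.frame(score, group1, group2, group3)

# Stack the indicator columns: one row per (score, group) pair.
# `long`, `member`, and `members` are hypothetical names for this sketch.
long <- data.frame(
  score  = rep(df$score, times = 3),
  group  = rep(c("group1", "group2", "group3"), each = nrow(df)),
  member = c(df$group1, df$group2, df$group3)
)

# Option (b): keep only member rows, then append every score once as "overall".
# A score that belongs to several groups appears in each of their boxes.
members <- rbind(
  long[long$member == 1, c("score", "group")],
  data.frame(score = df$score, group = "overall")
)
ggplot(members, aes(x = group, y = score)) +
  geom_boxplot()

# Option (a): in-group vs. out-of-group boxes, one panel per group.
ggplot(long, aes(x = factor(member), y = score, fill = factor(member))) +
  geom_boxplot() +
  facet_wrap(~ group)
```

Because the groups overlap, the boxes in option (b) are not independent samples, which may matter if the plot is meant to support a formal comparison.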