I frequently make boxplots where some of the categories are quite small and others have plentiful data, superimposed with jittered raw datapoints. I'm looking for a reliable way to hide the box and whiskers for categories that are very small (N<5). The goal is that those little categories would show just the raw data using a geom_point() layer, but the categories where it makes sense would get the box-and-whisker treatment. The thing that seemed obvious to me, mapping alpha in the geom_boxplot() layer to a factor variable based on N, does not work because alpha only controls the fill and maybe the outliers in geom_boxplot, not the box and whiskers.
I have found a kludgey solution in the past that worked as long as I was willing to waste the color parameter on this problem. However, often I want to actually use color for something else, and mapping it twice leads to gnarly output. Another kludgey solution that occurs to me is using a data subset from which small categories have been deleted - the problem with this plan is that it won't correctly handle situations when these categories are subject to position_dodge() (as the dodge will "see" too few categories).
Minimal example below.
df <- data.frame(group=factor(sample(c("A","B"), size=110, replace=TRUE)),
sex=factor(c(rep("M",50), rep("F", 50), rep("NB", 10))),
height=c(rnorm(50, 70, 6), rnorm(50, 63, 6), rnorm(10, 65, 6)))
dfsub <- filter(df, !(sex=="NB" & group=="A"))
ggplot(df, aes(x=group, y=height, colour=sex)) +
geom_boxplot(data=dfsub) +
geom_point(position=position_jitterdodge(jitter.width=0.2))