I want to display "n = (n)" over the whiskers of each of my boxplots. I have figured out how to put these labels over the top of each box (q75) using fivenum, but I can't get them working above the whisker. Above the whiskers is better because my plots are very cluttered.
Here I've reproduced the plots using mtcars. Edit: mtcars has no significant outliers, but my dataset does. That's why the label needs to sit on top of the whisker, not just on the highest data point.
Side note: I am working with a lot of outliers and want to take them out of the display. ggplot2 can do this, but it still includes the outliers when setting the axis range, which gives me a very "zoomed out" plot. My workaround for this is included: I've used the base boxplot function to calculate the highest whisker, and used coord_cartesian to set the upper limit just above that.
> data("mtcars")
> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
>
> library(data.table)
> library(ggplot2)
> d = data.table(mtcars)
>
> give.n <- function(x){
+ return(data.frame(y = fivenum(x)[4],
+ label = paste("n =",length(x))))
+ }
>
> p1 <- boxplot(mpg~cyl, data=mtcars, outline=FALSE,
+ plot=FALSE)
> p1stats <- p1$stats[5,]
> head(p1stats)
[1] 33.9 21.4 19.2
> upperlim <- max(p1$stats, na.rm = TRUE) * 1.05
>
> p <- ggplot(d, aes(x=factor(cyl), y=mpg)) +
+ geom_boxplot() +
+ stat_summary(fun.data = give.n, geom = "text", vjust=-.5)
>
> p <- p + coord_cartesian(ylim = c(0, upperlim))
I tried changing this function (which works):
> give.n <- function(x){
+ return(data.frame(y = fivenum(x)[4],
+ label = paste("n =",length(x))))
+ }
To this, using the 5th row of p1 stats (the upper whiskers):
give.n <- function(x){
return(data.frame(y = p1stats,
label = paste("n =",length(x))))
}
But that returns this: [image: bad plot]
How do I get this to display the label on only the correct whisker point for each box?
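One possible direction, offered as a sketch rather than a confirmed fix: p1stats is a vector with one value per group, so the modified give.n returns one row per whisker value for every box, which may be what produces the extra labels. Computing the whisker inside the function from the x it receives avoids that. The name give.n.whisker is hypothetical, and the rule used here (quartiles from quantile(), whisker at the largest value within 1.5 * IQR of the upper quartile) is an assumption about the default whisker of geom_boxplot. It can differ slightly from the base boxplot values, which use fivenum hinges.

```r
# Hypothetical helper: label position at the upper whisker of one group.
# Assumes the default 1.5 * IQR whisker rule with quartiles from quantile().
give.n.whisker <- function(x) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  upper.fence <- q[[2]] + 1.5 * (q[[2]] - q[[1]])
  data.frame(
    # Largest observation that is not beyond the upper fence
    y = max(x[x <= upper.fence], na.rm = TRUE),
    label = paste("n =", length(x))
  )
}

# Check the per-group result without plotting: one row per cyl level
labels <- do.call(rbind, lapply(split(mtcars$mpg, mtcars$cyl), give.n.whisker))
print(labels)
```

If this behaves as expected, it could be passed to the plot in place of give.n, for example stat_summary(fun.data = give.n.whisker, geom = "text", vjust = -.5), since stat_summary calls the function once per box with only that box's values.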
PS - My apologies, I'm unfamiliar with posting here but I tried

