1
votes

In this scenario I have added a grouping variable to the iris data frame. I want to make a boxplot of Sepal.Length by Species, filled by the grouping variable, with the outliers labeled. The boxplot itself works, but when I label the outliers with geom_text, the labels do not print at the dodged position of their box; they print in the center of each species instead. It seems geom_text is not inheriting the global aes(), but I don't know why.

code:

library(tidyverse)
# function to id outlier
is_outlier <- function(x) {
  return(x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x))
}

# make a grouping variable
iris$group <- sample(1:3, nrow(iris), replace = TRUE)

# make an outlier variable (NA for non-outliers)
iris <- 
  iris %>%
  group_by(Species, group) %>%
  mutate(outlier = ifelse(is_outlier(Sepal.Length), Sepal.Length, NA_real_))
iris$outlier

# graph
iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = factor(group))) +
  geom_boxplot() +
  geom_text(aes(label = outlier))

The labels are in the center of each species rather than over their respective boxes. What's going on here?

1 Answer

3
votes

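This is due to dodging: once fill splits each species into groups, geom_boxplot dodges the boxes side by side, but geom_text does not dodge by default. The label layer does inherit the global aes(); it is just drawn at the undodged x position. Use position_dodge with the same width in both layers to control it explicitly. You may want to experiment with the hjust and vjust arguments in geom_text to avoid plotting over the point.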

iris %>%
  ggplot(aes(x = Species, y = Sepal.Length, fill = factor(group))) +
  geom_boxplot(position = position_dodge(width = 1)) +
  geom_text(aes(label = outlier), position = position_dodge(width = 1))
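
As a rough, self-contained sketch of the same idea with the hjust suggestion applied: the version below works on a copy of iris named df, fixes a seed, stores the dodge in a variable named dodge so both layers share one width, and nudges the labels with hjust = -0.3. Those names and values are illustrative assumptions, not part of the original code. It keeps the NA labels in the data rather than filtering to outlier rows, on the assumption that dropping rows could change how many groups the text layer sees at each x and so shift the labels; na.rm = TRUE only silences the missing-value warning.

```r
library(dplyr)
library(ggplot2)

# Same 1.5 * IQR rule as in the question
is_outlier <- function(x) {
  x < quantile(x, 0.25) - 1.5 * IQR(x) | x > quantile(x, 0.75) + 1.5 * IQR(x)
}

set.seed(42)  # assumption: fixed seed so the random groups are reproducible

df <- iris    # work on a copy instead of overwriting iris
df$group <- factor(sample(1:3, nrow(df), replace = TRUE))

# Keep the value for outliers within each Species x group cell, NA otherwise
df <- df %>%
  group_by(Species, group) %>%
  mutate(outlier = ifelse(is_outlier(Sepal.Length), Sepal.Length, NA_real_)) %>%
  ungroup()

# One dodge object shared by both layers so boxes and labels line up
dodge <- position_dodge(width = 1)

p <- ggplot(df, aes(x = Species, y = Sepal.Length, fill = group)) +
  geom_boxplot(position = dodge) +
  geom_text(
    aes(label = outlier),
    position = dodge,
    hjust = -0.3,   # assumption: shift labels to the right of the point
    size = 3,
    na.rm = TRUE    # drop NA labels without a warning
  )

p
```

If the labels still overlap the points, adjusting hjust, vjust, or size is probably the first thing to try.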
