0
votes

I am attempting to add a legend to my boxplot, using this example data:

    BM  math  loginc
    1    2     1.4523
    0    3     2.3415
    1    1     0.6524
    1    3     2.4562
    0    1     3.5231
    0    2     2.4532

Essentially, I have two groups (BM = 0 and BM = 1), three categories in each group (math = 1, 2, or 3), and a value of loginc.

    boxcolors=c('gray70','orange','red','gray70','orange','red')

    bothboxplot=ggplot(both, aes(x=math,y=loginc))+
      geom_boxplot(fill=boxcolors)+
      stat_summary(fun.y=mean,color=line,geom = "point",shape=3,size=2)+
      scale_x_discrete(name='Site Category')+
      scale_y_continuous(name='Log(Incidence/100,000)')+
      facet_grid(.~BM)

    bothboxplot

This yields the following plot (image: Boxplot).

This plot is entirely correct except for the lack of a legend. I have played around with the placement of aes(), but it won't work. When aes() is placed within ggplot() rather than geom_boxplot(), my fill statement gives the error "Error: Aesthetics must be either length 1 or the same as the data (187): fill".

Ideally, the legend would show the names of the math categories (1, 2, 3) with their corresponding colors, and the (+) symbol in each box labelled "Mean".

2 Answers

0
votes

You need to pass a column for fill into the aesthetic:

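    library(dplyr)    # provides tibble(), mutate() and %>%
    library(ggplot2)

    # Simulated data with the same columns as in the question
    df <-
      tibble(
        loginc = rnorm(n = 12, mean = 0, sd = 1),
        BM = rep(c(0, 1), each = 6),
        math = rep(1:3, 4)
      ) %>%
      mutate(math = factor(math))  # a factor gives a discrete fill scale

    df %>%
      ggplot(aes(x = math, y = loginc, group = math, fill = math)) +
      geom_boxplot() +
      # fun replaces the deprecated fun.y (ggplot2 3.3.0 and later)
      stat_summary(fun = mean, geom = "point", shape = 3, size = 2) +
      facet_grid(~ BM)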


0
votes

The issue is that you do not map a variable to the fill aesthetic. Map math to fill and set the fill colors manually with scale_fill_manual():

    library(ggplot2)

    both <- data.frame(
      BM = sample(0:1, 100, replace = TRUE),
      math = sample(1:3, 100, replace = TRUE),
      loginc = runif(100)
    )

    bothboxplot <- ggplot(both, aes(factor(math), loginc, fill = factor(math))) +
      geom_boxplot() +
      stat_summary(fun = mean, geom = "point", shape = 3, size = 2) +
      scale_fill_manual(values = c("gray70", "orange", "red")) +
      scale_x_discrete(name = "Site Category") +
      scale_y_continuous(name = "Log(Incidence/100,000)") +
      facet_grid(. ~ BM)

    bothboxplot
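
The question also asks for a "Mean" entry in the legend, which the code above does not produce. One possible way to get it, offered as a sketch rather than a definitive solution, is to map shape to a constant string inside stat_summary() and then fix the symbol with scale_shape_manual(). The simulated data, the seed, the object name p, and the legend title "Site Category" are assumptions for illustration only, and the fun argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the simulated example is reproducible

# Simulated stand-in for the asker's data (not the real data set)
both <- data.frame(
  BM = sample(0:1, 100, replace = TRUE),
  math = sample(1:3, 100, replace = TRUE),
  loginc = runif(100)
)

p <- ggplot(both, aes(factor(math), loginc, fill = factor(math))) +
  geom_boxplot() +
  # Mapping shape to the constant "Mean" inside aes() creates a shape legend
  stat_summary(aes(shape = "Mean"), fun = mean, geom = "point", size = 2) +
  # Fill legend: one key per math category, with the manual colors
  scale_fill_manual(name = "Site Category",
                    values = c("gray70", "orange", "red")) +
  # Shape legend: draw the "Mean" key as a plus sign (shape 3), no title
  scale_shape_manual(name = NULL, values = c(Mean = 3)) +
  scale_x_discrete(name = "Site Category") +
  scale_y_continuous(name = "Log(Incidence/100,000)") +
  facet_grid(. ~ BM)

p
```

Because shape is now mapped inside aes() instead of being set as a fixed argument, ggplot2 builds a second legend for it, and scale_shape_manual() controls which symbol that legend shows.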