
I would like to be able to extend my boxplots with additional information. Here is a working example for ggplot2:

library(ggplot2)

ToothGrowth$dose <- as.factor(ToothGrowth$dose)

# Basic box plot
p <- ggplot(ToothGrowth, aes(x=dose, y=len)) + 
  geom_boxplot()
# Rotate the box plot
p + coord_flip()

I would like to add additional information from a separate data frame. For example:

extra <- data.frame(dose=factor(c(0.5,1,2)), label=c("Label1", "Label2", "Label3"), n=c("n=42","n=52","n=35"))

> extra
  dose  label    n
1  0.5 Label1 n=42
2    1 Label2 n=52
3    2 Label3 n=35

I would like to create the following figure, where the information for each dose (factor level) sits outside the plot area and lines up with the corresponding dose level (I mocked this up in PowerPoint as an example):

[Figure: mock-up of the flipped boxplot with each dose's label and n shown outside the plot area]

Any advice is greatly appreciated!

Regards, Luc

EDIT: Thanks a lot for your answer, it works perfectly! I would like to ask for advice on an extension of the initial question.

What if I use fill to split each dose into two groups?

ToothGrowth$dose <- as.factor(ToothGrowth$dose)
ToothGrowth$group <- head(rep(1:2, 100), dim(ToothGrowth)[1])
ToothGrowth$group <- factor(ToothGrowth$group)

p <- ggplot(ToothGrowth, aes(x=dose, y=len, fill=group)) +
  geom_boxplot()
# Rotate the box plot
p + coord_flip()

extra <- data.frame(
  dose=factor(rep(c(0.5,1,2), each=2)), 
  group=factor(rep(c(1:2), 3)), 
  label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"), 
  n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)

Is it possible to align data from the new data frame (extra, 6 rows) with each of the dose/group combinations?

Cheers, Luc


1 Answer


We can use geom_text with clip = "off" inside coord_flip:

ggplot(ToothGrowth, aes(x=dose, y=len)) +
    geom_boxplot() +
    geom_text(
        y = max(ToothGrowth$len) * 1.1,
        data = extra,
        aes(x = dose, label = sprintf("%s\n%s", label, n)),
        hjust = 0) +
    coord_flip(clip = "off") +
    theme(plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"))


Explanation: We place the text outside the plot area with geom_text and disable clipping with clip = "off" inside coord_flip. Finally, we increase the right plot margin to make room for the labels. Because of the coordinate flip, the y value controls the horizontal position of the labels, so you can move them further into the margin by changing the factor in y = max(ToothGrowth$len) * 1.1.
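
To make the two knobs mentioned above easier to experiment with, here is a minimal sketch that wraps the same plot in a small function. The names plot_with_labels, offset_factor and right_margin are hypothetical helpers introduced only for this illustration; they are not part of ggplot2.

```r
library(ggplot2)

# Same data as in the question, copied so the built-in data set stays untouched
tg <- ToothGrowth
tg$dose <- as.factor(tg$dose)

extra <- data.frame(
  dose = factor(c(0.5, 1, 2)),
  label = c("Label1", "Label2", "Label3"),
  n = c("n=42", "n=52", "n=35")
)

# Hypothetical helper (an assumption for illustration, not a ggplot2 function):
# offset_factor pushes the labels away from the panel,
# right_margin (in lines) reserves space for them.
plot_with_labels <- function(offset_factor = 1.1, right_margin = 5) {
  label_y <- max(tg$len) * offset_factor
  ggplot(tg, aes(x = dose, y = len)) +
    geom_boxplot() +
    geom_text(
      data = extra,
      aes(x = dose, label = sprintf("%s\n%s", label, n)),
      y = label_y,   # constant position, set outside aes as in the answer
      hjust = 0
    ) +
    coord_flip(clip = "off") +
    theme(plot.margin = unit(c(1, right_margin, 0.5, 0.5), "lines"))
}

p_default <- plot_with_labels()
p_far <- plot_with_labels(offset_factor = 1.3, right_margin = 8)
```

Printing p_default should reproduce the plot above, while p_far moves the labels further out and widens the margin to match. If the labels are cut off at the edge of the image, increasing right_margin is the first thing to try.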


In response to your edit, here is one possibility:

extra <- data.frame(
  dose=factor(rep(c(0.5,1,2), each=2)),
  group=factor(rep(c(1:2), 3)),
  label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
  n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)

library(tidyverse)
ToothGrowth %>%
    mutate(
        dose = as.factor(dose),
        group = as.factor(rep(1:2, nrow(ToothGrowth) / 2))) %>%
    ggplot(aes(x = dose, y = len, fill = group)) +
    geom_boxplot(position = position_dodge(width = 1)) +
    geom_text(
        data = extra %>%
            mutate(
                dose = as.factor(dose),
                group = as.factor(group),
                ymax = max(ToothGrowth$len) * 1.1),
        aes(x = dose, y = ymax, label = sprintf("%s\n%s", label, n)),
        position = position_dodge(width = 1),
        size = 3,
        hjust = 0) +
    coord_flip(clip = "off", ylim = c(0, max(ToothGrowth$len))) +
    theme(
        plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"),
        legend.position = "bottom")


A few comments:

  1. We ensure that the labels line up with the dodged boxes by using position_dodge(width = 1) inside both geom_text and geom_boxplot.
  2. It seems that position_dodge does not work with a constant y set outside of aes. So we add the y position for the labels as a column of extra and map it inside aes. A mapped y takes part in scale training, so the y axis would otherwise expand to include the labels; this is why we explicitly limit its range inside coord_flip with ylim = c(0, max(ToothGrowth$len)).
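
The dodged alignment described in the comments above relies on extra having exactly one row for every dose/group combination that appears in the plotted data. A minimal base R sketch of a sanity check you could run before plotting follows; the names tg, combos, check, unlabelled and duplicated_keys are made up for this illustration.

```r
# Rebuild the grouped data as in the question (on a copy)
tg <- ToothGrowth
tg$dose <- as.factor(tg$dose)
tg$group <- factor(rep(1:2, length.out = nrow(tg)))

extra <- data.frame(
  dose = factor(rep(c(0.5, 1, 2), each = 2)),
  group = factor(rep(1:2, 3)),
  label = c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
  n = c("n=12", "n=30", "n=20", "n=32", "n=15", "n=20")
)

# Every dose/group combination that will get its own box
combos <- unique(tg[, c("dose", "group")])

# Left join: combinations without a matching row in extra get NA labels
check <- merge(combos, extra, by = c("dose", "group"), all.x = TRUE)
unlabelled <- check[is.na(check$label), ]

# Combinations listed more than once in extra would produce overlapping labels
duplicated_keys <- extra[duplicated(extra[, c("dose", "group")]), ]

nrow(unlabelled)       # 0 when every box has a label
nrow(duplicated_keys)  # 0 when no combination is labelled twice
```

If either count is non-zero, the labels for that dose may no longer line up with their boxes, so it is worth fixing extra before adjusting the plot itself.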