
I am generating a faceted plot with different groups and I want to compare the mean of different groups to a control sample. I have a wide range of values, and want to position the "significance label" programmatically on top of each boxplot.

Here is some toy data.

library(ggpubr)
library(dplyr)  # for group_by(), summarize() and pull() used further down
ggboxplot(mtcars, x = "am", y = "mpg") +
facet_wrap(~vs) +
stat_compare_means(aes(label = ..p.signif..),
                      method = "t.test", ref.group = "0")

This gives the following plot, where the label height is taken from the highest value in facet 1 only.

I would like to place the asterisk in facet 0 just above the box, as it is in facet 1. In this case I can extract the maximum value of each boxplot with:

lab_coords = mtcars %>% group_by(am,vs) %>% 
             summarize(max_mpg = max(mpg)) %>% pull(max_mpg)
lab_coords
[1] 19.2 24.4 26.0 33.9

When I pass the label coordinates to stat_compare_means I cannot place them in the correct order:

ggboxplot(mtcars, x = "am", y = "mpg") +
    facet_wrap(~vs) +
    stat_compare_means(aes(label = ..p.signif..),
                       method = "t.test", ref.group = "0",
                       label.y = lab_coords)


Is there any way of passing different label y positions to each facet?


1 Answer


One solution is to calculate the statistics separately and then annotate the plot with the resulting p-values. This also lets you account for multiple testing, since you can plot adjusted p-values rather than unadjusted ones, e.g.

library(ggpubr)
library(tidyverse)
library(ggsignif)

# Highest mpg in each facet, padded so the bracket clears the boxes
y_df <- mtcars %>%
  group_by(vs) %>%
  summarize(y_pos = max(mpg) * 1.2)

# One t-test per facet, joined to that facet's own label height
anno_df <- compare_means(mpg ~ am, group.by = "vs",
                         data = mtcars, method = "t.test") %>%
  left_join(y_df, by = "vs") %>%
  mutate(p.adj.sig = case_when(p.adj < 0.001 ~ "***",
                               p.adj < 0.01  ~ "**",
                               p.adj < 0.05  ~ "*",
                               TRUE          ~ "ns"))

# Use annotations = p.adj.sig to show asterisks instead of the p-value
ggboxplot(mtcars, x = "am", y = "mpg", facet.by = "vs") +
  ggsignif::geom_signif(data = anno_df,
                        aes(xmin = group1, xmax = group2,
                            annotations = p.adj,
                            y_position = y_pos),
                        manual = TRUE)


There is a fair amount of discussion of this problem, with other potential solutions, on GitHub: https://github.com/kassambara/ggpubr/issues/65
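
For comparison, here is a minimal sketch of the same "compute first, annotate second" idea using only ggplot2 and dplyr. It is not taken from the linked issue. The names label_df, y_pos and label, the Holm adjustment and the "+ 1" vertical offset are all assumptions made for illustration. It places one asterisk label per facet just above the am = 1 box, which is where the question wants it.

```r
library(ggplot2)
library(dplyr)

# One row per facet: Welch t-test of mpg between the two am groups,
# plus a label height just above the am = 1 box of that facet.
label_df <- mtcars %>%
  group_by(vs) %>%
  summarize(
    p     = t.test(mpg[am == 0], mpg[am == 1])$p.value,
    y_pos = max(mpg[am == 1]) + 1,  # assumed offset; tune to taste
    .groups = "drop"
  ) %>%
  mutate(
    p.adj = p.adjust(p, method = "holm"),  # assumed adjustment method
    label = case_when(p.adj < 0.001 ~ "***",
                      p.adj < 0.01  ~ "**",
                      p.adj < 0.05  ~ "*",
                      TRUE          ~ "ns"),
    am    = factor(1, levels = c(0, 1))    # x position of the label
  )

# label_df carries a vs column, so each label lands in its own facet
p <- ggplot(mtcars, aes(x = factor(am), y = mpg)) +
  geom_boxplot() +
  geom_text(data = label_df, aes(x = am, y = y_pos, label = label)) +
  facet_wrap(~vs) +
  labs(x = "am")
p
```

Because the label data frame has one row per facet and its own y column, the positions never depend on the order in which a vector such as lab_coords is recycled across panels.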