Adding mean comparisons to plot + Is it possible to display p-values in ggplot (or R in general) from a KS test, specifically on a violin plot?

Question

I'm trying to create something like this:

using my own data, specifically using the p-values I found here:

I was able to produce something similar, though not with the correct method, by using a t-test: T test p-value

I produced this by writing this code:

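library(ggplot2)
library(ggpubr)  # provides stat_compare_means()

# Violin plot of ARE score by regulation, with a narrow boxplot overlaid
l <- ggplot(VioPos, aes(x = Regulation, y = Score, fill = Regulation)) +
  geom_violin(trim = FALSE) +
  labs(title = "Plot of ARE Scores by Regulation", x = "Gene Regulation", y = "ARE Score") +
  geom_boxplot(width = 0.1, fill = "white") +
  theme_classic()
l

# Same plot on a log2 y-axis
dp <- l + scale_y_continuous(trans = "log2")
dp

# Add pairwise t-test p-values; my_comparisons (a list of group pairs) is defined elsewhere
dp7 <- dp +
  stat_compare_means(comparisons = my_comparisons, method = "t.test")
dp7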

In other words, I used stat_compare_means() with the ggplot2, tidyverse, ggpubr, and rstatix packages.

However, if I change the method in the code, the p-values display correctly for Wilcoxon and t-tests, but not for ANOVA and Kruskal-Wallis tests. Moreover, it seems that stat_compare_means() only supports those four methods and not KS, and I'm specifically interested in plotting the comparisons from my KS test output onto my violin plots. Is there some other package I can use?

Also, please note: for the KS test, "UpScorePos", "DownScorePos", etc. were used to compare ARE scores by regulation (as I did with the graphs for the t-test).

tester tester · Accepted Answer · 2021-03-02T23:10:44

You can get the p-value from a KS-test like this:

x <- rnorm(100)
y <- rnorm(100)
res <- ks.test(x, y)
res$p.value
# [1] 0.9670685  (your value will differ, since no seed is set)

Just use this p-value and add it to your plots.
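For instance, here is a minimal sketch of putting a KS p-value on a violin plot as plain text with annotate(). The data frame `toy`, its columns `group` and `score`, and the label position are made up for illustration; swap in your own data and adjust the coordinates.

```r
library(ggplot2)

# Hypothetical two-group data; `group` and `score` are illustrative names.
set.seed(1)
toy <- data.frame(
  group = rep(c("Up", "Down"), each = 100),
  score = c(rnorm(100, mean = 0), rnorm(100, mean = 0.5))
)

# Two-sample KS test between the two groups.
res <- ks.test(toy$score[toy$group == "Up"],
               toy$score[toy$group == "Down"])

# Format the p-value and write it above the violins as a text annotation.
p_label <- paste0("KS test, p = ", signif(res$p.value, 3))
ks_plot <- ggplot(toy, aes(x = group, y = score, fill = group)) +
  geom_violin(trim = FALSE) +
  annotate("text", x = 1.5, y = max(toy$score) + 0.5, label = p_label) +
  theme_classic()
ks_plot
```

This only writes a text label, without brackets between groups, so it is probably most useful when there are just two groups to compare.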

EDIT: A somewhat hacky solution is to run a t-test to get the right data structure for stat_pvalue_manual(), and then replace its p-values with the ones from ks.test(). See the example below (I used the ToothGrowth data as an example).

library(ggplot2)
library(ggpubr)   # stat_pvalue_manual()
library(rstatix)  # t_test(), add_y_position()

# Transform `dose` into a factor variable
df <- ToothGrowth
df$dose <- as.factor(df$dose)

# Run pairwise t-tests only to get a tibble with the structure that
# stat_pvalue_manual() can use, then add y positions for the brackets
stat.test <- df %>%
  t_test(len ~ dose) %>%
  add_y_position()
stat.test

kst <- stat.test # copy the tibble, then overwrite its p-values with ks.test results

# Pairwise KS tests, in the same order as the rows of stat.test
p1 <- ks.test(x = ToothGrowth$len[ToothGrowth$dose == 0.5],
              y = ToothGrowth$len[ToothGrowth$dose == 1]
)$p.value
p2 <- ks.test(x = ToothGrowth$len[ToothGrowth$dose == 0.5],
              y = ToothGrowth$len[ToothGrowth$dose == 2]
)$p.value
p3 <- ks.test(x = ToothGrowth$len[ToothGrowth$dose == 1],
              y = ToothGrowth$len[ToothGrowth$dose == 2]
)$p.value

kst[, 'p'] <- as.numeric(c(p1, p2, p3))

# Only `p` was replaced; any other statistics columns still come from the t-test
ggplot(df, aes(x = dose, y = len)) +
  geom_violin(trim = FALSE) +
  stat_pvalue_manual(kst, label = "p = {p}")
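With more than three groups, typing out every pair by hand gets tedious. A rough sketch of a base-R helper that loops over all pairs is shown here; `pairwise_ks` is a hypothetical name of my own (it is not part of rstatix or ggpubr), and the column names group1, group2 and p are chosen to mirror the t_test() output used above.

```r
# Hypothetical helper: run ks.test() on every pair of levels of `group`
# and return one row per comparison.
pairwise_ks <- function(values, group) {
  group <- as.factor(group)
  # All unordered pairs of group levels, e.g. ("0.5", "1"), ("0.5", "2"), ("1", "2")
  pairs <- combn(levels(group), 2, simplify = FALSE)
  p <- vapply(pairs, function(pr) {
    ks.test(values[group == pr[1]], values[group == pr[2]])$p.value
  }, numeric(1))
  data.frame(
    group1 = vapply(pairs, `[`, character(1), 1),
    group2 = vapply(pairs, `[`, character(1), 2),
    p = p,
    stringsAsFactors = FALSE
  )
}

ks_tab <- pairwise_ks(ToothGrowth$len, ToothGrowth$dose)
ks_tab
```

The resulting `ks_tab` has no y.position column, so to draw it with stat_pvalue_manual() you would still need to supply bracket heights yourself, for example by copying them from the add_y_position() result or by passing a numeric vector to its y.position argument.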
