I have a data frame with three variables (n, Parametric, Mean), one per column. n takes the values 25, 50, or 100. Parametric is either "Yes" or "No". Mean is numeric. A very simplified version looks like this:

n     Parametric     Mean
25       Yes          1.2
25       No           1.5
50       Yes          0.9
50       No           1.1
100      Yes          1.0
100      No           1.2

I would like to make boxplots comparing the parametric mean values (rows with Parametric = "Yes") against the nonparametric mean values (rows with Parametric = "No") for each of the three n values.

Similar to the image below: https://www.sthda.com/sthda/RDoc/figure/ggplot2/ggplot2-box-plot-box-plot-multiple-groups-data-visualization-1.png

The difference is that I want the legend to show Parametric (Yes or No), the x-axis to show n, and the y-axis to show the mean values.

The code:

    # Create boxplot comparisons
    ggplot(dataMean, aes(x=n, y=Mean, color=Parametric))+
        geom_boxplot()

only gives me two boxplots, one for Parametric = "Yes" and one for Parametric = "No". I am looking for a plot that compares Parametric = "Yes" with Parametric = "No" at each n value. So ultimately I want 6 boxplots (2 for each n value), color-coded by the Parametric column.

How should I organize my data in order to be able to achieve this result? And what code would then create a plot with these 6 boxplots?

1 Answer

Maybe facet_wrap() can help you. I can demonstrate this with the built-in ToothGrowth dataset:

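    library(ggplot2)

    # ToothGrowth ships with base R: len (tooth length), supp (VC or OJ), dose (0.5, 1, 2)
    data("ToothGrowth")
    # One panel per dose, with one box per supp level inside each panel
    ggplot(ToothGrowth, aes(x = supp, y = len, fill = supp)) +
      geom_boxplot(position = position_dodge()) +
      facet_wrap(~dose)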

The plot has one panel per dose (0.5, 1, 2), each with a separate box for each supp level (OJ and VC).
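
Applied to the data in the question, a minimal sketch could look like the code below. The dataMean built here is simulated purely for illustration (the rep column, the seed, and the generated Mean values are assumptions, not the asker's data), and it assumes there are several Mean values for each combination of n and Parametric, since a boxplot needs more than one value per box. The likely reason the original code gives only two boxes is that n is numeric, so ggplot2 treats the x-axis as continuous and groups only by color. Wrapping n in factor() makes it discrete. The faceted variant follows the same pattern as the ToothGrowth example.

```r
library(ggplot2)

# Simulated stand-in for dataMean (values are made up for illustration):
# 20 replicate means for each n x Parametric combination.
set.seed(1)
dataMean <- expand.grid(
  rep = 1:20,
  n = c(25, 50, 100),
  Parametric = c("Yes", "No"),
  stringsAsFactors = FALSE
)
dataMean$Mean <- rnorm(nrow(dataMean), mean = 1, sd = 0.2)

# Treat n as a discrete variable so each n value gets its own x position;
# the color aesthetic then splits every position into a Yes box and a No box.
p_dodged <- ggplot(dataMean, aes(x = factor(n), y = Mean, color = Parametric)) +
  geom_boxplot() +
  labs(x = "n")

# Faceted variant, mirroring the ToothGrowth example: one panel per n.
p_faceted <- ggplot(dataMean, aes(x = Parametric, y = Mean, fill = Parametric)) +
  geom_boxplot() +
  facet_wrap(~n)

print(p_dodged)
```

Either plot should show 6 boxes in total. Printing p_faceted instead of p_dodged gives the three-panel layout.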