I have a simple data set with two groups and a value for each group at 4 different time points. I want to display this data set as grouped boxplots over time but ggplot2 doesn't separate the time points.

This is my data:

 matrix
    Replicate Line Day Treatment  X A WT     Marker Proportion
            1    C  10       low NA      HuCHuD_Pos       8.62
            2    C  10       low NA      HuCHuD_Pos         NA
            1    C  18       low NA      HuCHuD_Pos      30.50
            3    C  18       low NA      HuCHuD_Pos         NA
            2    C  18       low NA      HuCHuD_Pos         NA
            1    C  50       low NA      HuCHuD_Pos      26.10
            2    C  50       low NA      HuCHuD_Pos      31.90
            1    C  80       low NA      HuCHuD_Pos      12.70
            2    C  80       low NA      HuCHuD_Pos      26.20
            1    C  10    normal NA      HuCHuD_Pos         NA
            2    C  10    normal NA      HuCHuD_Pos      17.20
            1    C  18    normal NA      HuCHuD_Pos       3.96
            2    C  18    normal NA      HuCHuD_Pos         NA
            1    C  50    normal NA      HuCHuD_Pos      25.60
            2    C  50    normal NA      HuCHuD_Pos      17.50
            1    C  80    normal NA      HuCHuD_Pos      19.00
           NA    C  80    normal NA      HuCHuD_Pos         NA

And this is my code:

matrix = as.data.frame(subset(data.long, Line == line_single & Marker == marker_single & Day != "30"))

pdf(paste(line_name_single, marker_name_single, ".pdf"), width=10, height=10)
plot <- ggplot(data = matrix, aes(x = Day, y = Proportion, group = Treatment, fill = Treatment)) +
  geom_boxplot(position = position_dodge(1))
print(plot)
dev.off()

What am I doing wrong?

What I want

What I get

Thanks very much for your help!

Cheers, Paula

Comment: Does this answer your question? "Plot multiple boxplot in one graph" (A. Suliman)

1 Answer

Edit:

This is what a minimal reproducible example for your question could look like:

matrix <- structure(list(Day = c(10L, 10L, 18L, 18L, 18L, 50L, 50L, 80L, 80L, 10L, 10L, 18L, 18L, 50L, 50L, 80L, 80L),
                         Treatment = c("low", "low", "low", "low", "low", "low", "low", "low", "low", "normal", "normal", "normal", "normal", "normal", "normal", "normal", "normal"), 
                         Proportion = c(8.62, NA, 30.5, NA, NA, 26.1, 31.9, 12.7, 26.2, NA, 17.2, 3.96, NA, 25.6, 17.5, 19, NA)),
                    class = "data.frame", row.names = c(NA, -17L))

Suggested answer, using factor() to discretize the variable Day:

ggplot(data=matrix,aes(x=factor(Day), y=Proportion,  fill=Treatment)) +
  geom_boxplot(position=position_dodge(1)) +
  labs(x ="Day")

Explanation: If we map a continuous variable to the x axis of a boxplot, ggplot2 does not treat each distinct x value as its own category; it draws one box per group. Your call sets group=Treatment, so each treatment gets a single box that pools all days. If we convert Day to something discrete, such as a factor or a character string, and drop the explicit group=Treatment, ggplot2 groups by the combination of Day and Treatment and we get the desired behavior.
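
If you would rather keep Day numeric, so that the spacing on the x axis reflects the real gaps between time points, an alternative is to set the grouping explicitly. This is only a sketch: it rebuilds the values of the reproducible example under the illustrative name df (not a name from the question) and uses interaction() to define one group per Day and Treatment combination.

```r
library(ggplot2)

# Same values as the reproducible example, stored under an illustrative name
df <- data.frame(
  Day = c(10L, 10L, 18L, 18L, 18L, 50L, 50L, 80L, 80L,
          10L, 10L, 18L, 18L, 50L, 50L, 80L, 80L),
  Treatment = rep(c("low", "normal"), times = c(9, 8)),
  Proportion = c(8.62, NA, 30.5, NA, NA, 26.1, 31.9, 12.7, 26.2,
                 NA, 17.2, 3.96, NA, 25.6, 17.5, 19, NA)
)

# Day stays numeric; the explicit group yields one box per
# Day x Treatment combination instead of one box per Treatment.
p <- ggplot(df, aes(x = Day, y = Proportion, fill = Treatment,
                    group = interaction(Day, Treatment))) +
  geom_boxplot() +
  scale_x_continuous(breaks = unique(df$Day))  # label only the measured days
print(p)
```

With the factor() version the four days are evenly spaced; with this version days 10 and 18 sit close together and day 80 far to the right, so pick whichever reads better for your data.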

Also, when you use dput or one of the techniques described here, it's much easier to find and test an answer than when working from a printed data description as in the question (or at least I couldn't figure out how to load that example data).
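
For illustration, this is roughly how dput can be used. The small data frame below is made up for the example and is not your real data; df and df_copy are illustrative names.

```r
# Small illustrative data frame (not the asker's real data)
df <- data.frame(Day = c(10L, 18L),
                 Treatment = c("low", "normal"),
                 Proportion = c(8.62, 3.96))

# dput() prints R code that recreates the object;
# paste that printed code into the question.
dput(df)

# Round trip: evaluating the printed code rebuilds the data frame,
# which is what someone answering would do with the pasted text.
printed <- paste(capture.output(dput(df)), collapse = "\n")
df_copy <- eval(parse(text = printed))
all.equal(df, df_copy)
```

For a large data set, dput(head(df, 20)) or a hand-picked subset keeps the pasted code short.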

P.S. I think it's a bit confusing to name a variable of class data.frame 'matrix', since matrix is its own data type (and a base function) in R... ;)