
I am trying to create a grouped bar graph with a dot plot of the raw values on each bar. The reproducible example below has three treatments across two time periods, so when the bars are separated by time period I'd expect three bars per period, with each bar's raw points plotted on top of it.

In the real dataset this has to come from two data frames, because the value of each bar is calculated independently and isn't a simple mean or other value that stat_summary could produce. I have looked at other examples and answers to similar questions, but they do not seem to work when the data comes from two separate data frames. In the example below I got pretty close, and it's just a hair off.

Simple Reproducible Example

df.stat <- data.frame("Y" = c(40, 30, 20, 30, 30, 30),"Time" = c(1,1,1,2,2,2), "Treatment" = c(1,2,3,1,2,3))

df.raw <- data.frame("Y.raw" = c(35,40,45,25,30,35,15,20,25,25,30,35,10,20,40,30,30,30),"Time.raw" = c(1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2), "Treatment.raw" = c(1,1,1,2,2,2,3,3,3,1,1,1,2,2,2,3,3,3))


If you run the code below, group does not produce the intended result. If you use fill instead of group, you get exactly the graph I intended, except that the dot color varies per treatment instead of being one color across all groups.

ggplot() + geom_col(data = df.stat, aes(x = factor(df.stat$Time), y = df.stat$Y, fill = factor(df.stat$Treatment)), 
                    color = "black", size = 1, width = .8, position = "dodge") +
          geom_dotplot(data = df.raw, aes(x = factor(df.raw$Time.raw), y = df.raw$Y.raw, group = factor(df.raw$Treatment.raw)),
                      alpha = .8, position_dodge(width = .8), binaxis = "y", 
                      stackdir = "center", stackratio = 1)

Thanks for any help in advance!

1 Answer

This seemed to work, using interaction() to make clear that the dot plot should be dodged by both Treatment and Time. (One tip: you can run into trouble if you use $ notation inside a ggplot aes() call; it is safer to refer to the column names alone.)

ggplot() + 
  geom_col(data = df.stat,
           aes(x = factor(Time), y = Y, fill = factor(Treatment)), 
           color = "black", size = 1, width = .8, position = "dodge") +
  geom_dotplot(data = df.raw,
               aes(x = factor(Time.raw), y = Y.raw, 
                   group = interaction(factor(Time.raw), factor(Treatment.raw))),
               alpha = .8, position = position_dodge(width = .8),
               binaxis = "y", stackdir = "center", stackratio = 1)
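
If it helps to see what the interaction() call is doing, here is a minimal sketch in base R that builds the same grouping variable outside of ggplot and inspects it. The name grp is just an illustrative variable, not something ggplot2 requires. The raw data from the question should give six groups, one per Time and Treatment combination, which matches the six bars.

```r
# Same raw data as in the question.
df.raw <- data.frame(
  "Y.raw" = c(35, 40, 45, 25, 30, 35, 15, 20, 25,
              25, 30, 35, 10, 20, 40, 30, 30, 30),
  "Time.raw" = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2),
  "Treatment.raw" = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3)
)

# grp (hypothetical name) has one level per Time x Treatment combination,
# the same factor the dot plot layer uses as its group aesthetic.
grp <- interaction(factor(df.raw$Time.raw), factor(df.raw$Treatment.raw))

nlevels(grp)  # number of distinct groups
table(grp)    # number of raw points falling in each group
```

Each of those groups then gets its own dodged stack of dots, so the points should line up with the corresponding bar.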
