I am making grouped violin plots of my own dataset using ggplot2. The dataset contains 350 observations of 3 variables (7 scenarios in 5 locations, with 10 replicates for each combination), and part of it looks like this:
The code I used is:
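library(ggplot2)
# read.csv() already returns a data frame; "---" marks missing values in the file
DF = read.csv("C:\\Users\\lqy\\Desktop\\Pilot_data.csv", na.strings = "---", header = TRUE)
# Scenarios must be discrete (a factor) for fill to split each location into
# one violin per scenario; an integer fill is treated as continuous
DF$Scenarios = as.factor(DF$Scenarios)
figure = ggplot(DF, aes(x = Location, y = Recovery, fill = Scenarios)) +
  geom_violin() +
  stat_summary(fun = "median", geom = "point") +
  labs(x = "Locations", y = "Days to 90% recovery") +
  theme(axis.text = element_text(size = 10)) +
  theme(axis.title = element_text(size = 10)) +
  theme(legend.position = "right")
figure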
This code gives me a figure that looks like this:
I am quite happy with this figure, but the median points I added are stacked in the middle of each location's group of violins instead of sitting inside their own violins. I'm guessing this is because the points are drawn directly above the location's tick on the x-axis, while the violins are shifted sideways so they sit next to each other. Is there a way to put each median point inside its corresponding violin in this kind of plot?
Many thanks for answering this question for me!
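One direction that may help, given here only as a sketch: violins are dodged by default, and `geom_violin()` uses a width of 0.9, so passing a matching `position_dodge()` to `stat_summary()` should shift the median points by the same offsets. The data frame `toy` below is a made-up stand-in, not the real Pilot_data.csv, and the `fun` argument assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

# Hypothetical stand-in data (NOT the real dataset):
# 3 scenarios x 2 locations x 10 replicates, with random recovery times
set.seed(1)
toy <- expand.grid(
  Replicate = 1:10,
  Scenarios = factor(c(1, 2, 3)),
  Location  = c("Sec_51", "Sec_53")
)
toy$Recovery <- rnorm(nrow(toy), mean = 500, sd = 100)

# One shared dodge specification so violins and points get the same offsets
dodge <- position_dodge(width = 0.9)

figure <- ggplot(toy, aes(x = Location, y = Recovery, fill = Scenarios)) +
  geom_violin(position = dodge) +
  stat_summary(fun = median, geom = "point", position = dodge) +
  labs(x = "Locations", y = "Days to 90% recovery")

figure
```

Sharing one `dodge` object between the two layers is a design choice rather than a requirement: it keeps the offsets in sync if the width is changed later.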
ADDITION: here is a sample of the dataset, obtained with dput(DF[sample(nrow(DF),45),]):
structure(list(Scenarios = c(8L, 2L, 2L, 2L, 10L, 5L, 5L, 10L,
10L, 3L, 10L, 1L, 2L, 5L, 8L, 2L, 1L, 3L, 1L, 8L, 10L, 4L, 8L,
2L, 4L, 3L, 8L, 10L, 1L, 1L, 10L, 5L, 3L, 8L, 8L, 5L, 8L, 5L,
10L, 1L, 8L, 8L, 8L, 3L, 10L), Location = c("Total_Catchment",
"Sec_51", "Sec_53", "Total_Catchment", "Sec_55", "Sec_55", "Sec_51",
"Sec_51", "Sec_54", "Total_Catchment", "Sec_55", "Sec_55", "Sec_54",
"Sec_53", "Sec_51", "Sec_55", "Sec_53", "Sec_55", "Sec_54", "Total_Catchment",
"Sec_51", "Sec_53", "Sec_55", "Total_Catchment", "Sec_54", "Total_Catchment",
"Sec_53", "Sec_53", "Sec_51", "Sec_54", "Sec_53", "Sec_51", "Sec_53",
"Sec_54", "Sec_54", "Sec_55", "Sec_55", "Sec_54", "Sec_51", "Sec_51",
"Sec_51", "Total_Catchment", "Sec_51", "Sec_55", "Sec_53"), Recovery = c(316.5,
839.5, 179.5, 277.5, 923.5, 664.5, 494.5, 639.5, 273.5, 327.5,
830.5, 714.5, 357.5, 300.5, 504.5, 752.5, 265.5, 535.5, 208.5,
303.5, 564.5, 339.5, 766.5, 396.5, 273.5, 271.5, 185.5, 370.5,
825.5, 191.5, 186.5, 582.5, 364.5, 326.5, 332.5, 901.5, 706.5,
187.5, 577.5, 680.5, 506.5, 301.5, 559.5, 713.5, 324.5)), row.names = c(20L,
121L, 163L, 37L, 329L, 348L, 103L, 112L, 273L, 52L, 322L, 309L,
240L, 187L, 76L, 338L, 155L, 339L, 253L, 69L, 133L, 158L, 342L,
2L, 235L, 45L, 146L, 161L, 106L, 239L, 189L, 117L, 157L, 265L,
258L, 299L, 321L, 215L, 98L, 127L, 132L, 27L, 111L, 283L, 203L
), class = "data.frame")
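Because the sample above was drawn with `sample()`, a different set of 45 rows comes out on each run. A minimal sketch of making the draw repeatable with `set.seed()` follows. `DF_toy` is a hypothetical stand-in with made-up recovery values, built to the described size (7 scenarios x 5 locations x 10 replicates); the scenario and location labels are the ones visible in the sample above.

```r
# Hypothetical stand-in for DF: same columns and size, made-up Recovery values
set.seed(42)
DF_toy <- expand.grid(
  Replicate = 1:10,
  Scenarios = c(1L, 2L, 3L, 4L, 5L, 8L, 10L),
  Location  = c("Sec_51", "Sec_53", "Sec_54", "Sec_55", "Total_Catchment"),
  stringsAsFactors = FALSE
)
DF_toy$Recovery <- round(runif(nrow(DF_toy), 150, 950)) + 0.5
DF_toy$Replicate <- NULL  # helper column only needed to create the replicates

# Setting the same seed right before sampling selects the same rows each time
set.seed(123)
rows_a <- sample(nrow(DF_toy), 45)
set.seed(123)
rows_b <- sample(nrow(DF_toy), 45)
identical(rows_a, rows_b)

# dput() prints R code that rebuilds exactly these sampled rows
dput(DF_toy[rows_a, ])
```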
Please run dput(DF) and paste the output in your question in order to reproduce the problem. – Duck
Please run dput(DF[sample(nrow(DF),45),]) and paste the output in your question! – Duck