I am puzzled by the output of the code below. The data frame I called aux (the data) contains a factor and a quantitative variable. I want to plot the mean of the quantitative variable for each level of the factor.
The code also creates a second data frame, aux.grouped, containing those grouped mean values.
Then there are two plots. The first one is fine by me: it plots the right values in two different ways, that is, using stat_summary() on the original aux data frame and geom_point() on the aux.grouped data frame, and the two sets of points coincide.
However, when I try to plot the log10 values of the quantitative variable, stat_summary() does not plot what I expected. I suspect that using log10() inside aes() on the ggplot() mapping line may be at the origin of this issue. What I do not understand is what stat_summary() is plotting instead, and why, if this is an unmatched-mapping issue, it does not simply plot the non-log10 values.
Thanks a lot for your help.
Best,
David
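library(dplyr)    # %>%, group_by(), summarise()
library(ggplot2)  # ggplot(), stat_summary(), geom_point()

# Read the data and treat nb.NAs as a factor
aux <- read.table("aux.txt", header = TRUE, sep = "\t")
aux$nb.NAs <- factor(aux$nb.NAs)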
aux.grouped <- aux %>%
  group_by(nb.NAs) %>%
  dplyr::summarise(mean_values = mean(values))
ggplot(aux, aes(x = nb.NAs, y = values, group = nb.NAs)) +
  stat_summary(geom = "point", fun = "mean", colour = "red", size = 10) +
  geom_point(data = aux.grouped, aes(x = nb.NAs, y = mean_values), colour = "blue", size = 5)
ggplot(aux, aes(x = nb.NAs, y = log10(values), group = nb.NAs)) +
  stat_summary(geom = "point", fun = "mean", colour = "red", size = 5) +
  geom_point(data = aux.grouped, aes(x = nb.NAs, y = log10(mean_values)), colour = "blue", size = 5)
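One check that might help narrow this down is sketched here. It rests on an assumption I have not confirmed: that stat_summary() applies fun to the y values after the aes() transformation, so the red points would be the per-group mean of log10(values), whereas the blue points are log10 of the per-group mean. The toy data frame is made up for illustration and is not the content of aux.txt.

```r
# Hypothetical toy data standing in for aux.txt (not the real values)
toy <- data.frame(
  nb.NAs = factor(c(0, 0, 1, 1)),
  values = c(1, 100, 10, 1000)
)

# Mean of the log10 values per group
# (what the red stat_summary() layer would show, under the assumption above)
mean_of_log <- tapply(log10(toy$values), toy$nb.NAs, mean)

# log10 of the per-group mean
# (what the blue geom_point() layer computes from aux.grouped)
log_of_mean <- log10(tapply(toy$values, toy$nb.NAs, mean))

# Side-by-side comparison, one row per level of nb.NAs
check <- cbind(mean_of_log, log_of_mean)
print(check)
```

If the two columns differ on the real data in the same way the red and blue points differ on the second plot, that would suggest the mismatch comes from the order of the two operations (log then mean, versus mean then log) rather than from a mapping problem.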