I'm trying to produce scatterplots annotated with the regression equation and r² for grouped data.
I can do this for a single dataset, but with grouped data I'm having trouble calculating the equation and r² for every group in a way that can be extracted automatically and added as an annotation.
I believe I'm pretty close and am just making some silly mistake, but I can't seem to identify it.
1 - First I create a function that fits a model and returns a character string with the results.
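library(dplyr)
library(ggplot2)  # needed for the plot further down

# Fit a linear model and return a plotmath string with the equation and r^2
eqlabels <- function(df) {
  m <- lm(Sepal.Length ~ Sepal.Width, data = df)
  eq <- substitute(italic(y) == a + b * italic(x) * "," ~~ italic(r)^2 ~ "=" ~ r2,
                   # unname() keeps the coefficient names out of the label text
                   list(a = format(unname(coef(m)[1]), digits = 3),
                        b = format(unname(coef(m)[2]), digits = 3),
                        r2 = format(summary(m)$r.squared, digits = 2)))
  as.character(as.expression(eq))
}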
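Before moving to the grouped case, the function can be sanity-checked on a single subset. The following is only a sketch: it repeats the function with a hypothetical argument name `df` and with `unname()` around the coefficients (my assumption, to keep coefficient names out of the label), and it picks `setosa` purely as an example group.

```r
# Stand-in copy of the function above; `df` and unname() are assumptions
eqlabels <- function(df) {
  m <- lm(Sepal.Length ~ Sepal.Width, data = df)
  eq <- substitute(italic(y) == a + b * italic(x) * "," ~~ italic(r)^2 ~ "=" ~ r2,
                   list(a = format(unname(coef(m)[1]), digits = 3),
                        b = format(unname(coef(m)[2]), digits = 3),
                        r2 = format(summary(m)$r.squared, digits = 2)))
  as.character(as.expression(eq))
}

# One group only: the result should be a single character string
one_label <- eqlabels(subset(iris, Species == "setosa"))

# The string should parse as a plotmath expression, which is what
# parse = TRUE in ggplot2 relies on
parsed <- parse(text = one_label)
```

If this returns one parsable string, the function itself is probably fine and the remaining problem is how it is applied per group.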
I got this far, but it all breaks down at step 2:
2 - Now I need to apply the function to each group.
This post suggests using ddply (from the plyr package). I tried to replace that with something equivalent from the dplyr package, as suggested here.
labelsP3 <- iris %>% group_by(Species) %>% do(eqlabels(.))
However, this fails with the following message (and then nothing is plotted):
Warning message:
Error: Results are not data frames at positions: 1, 2, 3
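The message indicates that `do()` expects each group's result to be a data frame, while `eqlabels()` returns a bare character string. A minimal sketch of one possible way around this is to wrap the string in a one-row data frame. The column name `label`, the argument name `df`, and the `unname()` calls are my own assumptions, not part of the original code.

```r
library(dplyr)

# Stand-in copy of the labelling function; `df` and unname() are assumptions
eqlabels <- function(df) {
  m <- lm(Sepal.Length ~ Sepal.Width, data = df)
  eq <- substitute(italic(y) == a + b * italic(x) * "," ~~ italic(r)^2 ~ "=" ~ r2,
                   list(a = format(unname(coef(m)[1]), digits = 3),
                        b = format(unname(coef(m)[2]), digits = 3),
                        r2 = format(summary(m)$r.squared, digits = 2)))
  as.character(as.expression(eq))
}

# Return a one-row data frame per group so that do() can bind the results;
# `label` is a hypothetical column name
labelsP3 <- iris %>%
  group_by(Species) %>%
  do(data.frame(label = eqlabels(.), stringsAsFactors = FALSE))
```

With this, `labelsP3` should hold one row per species, with the grouping column `Species` next to the label string.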
As suggested here, I tried:
labelsP3 <- iris %>% group_by(Species) %>% do(with(eqlabels(iris)))
But this results in an error:
Error in eval(substitute(expr), data, enclos = parent.frame()) : invalid 'envir' argument of type 'character'
I think the plotting code below is fine as it stands, but I can't get past the labelling step.
plot3 <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(colour = "grey60") +
  facet_grid(Species ~ .) +
  stat_smooth(method = lm) +
  annotate("text", label = labelsP3, parse = TRUE)
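For reference, `annotate()` draws the same annotation in every panel and takes no per-facet data, so one alternative is `geom_text()` with a data frame that carries the faceting variable. The sketch below rests on several assumptions of mine: a one-row-per-species label table with a hypothetical `label` column, labels pinned to the top-left corner via `-Inf`/`Inf`, and plot axes swapped to x = `Sepal.Width`, y = `Sepal.Length` so that the drawn line matches the model `Sepal.Length ~ Sepal.Width` used for the label.

```r
library(dplyr)
library(ggplot2)

# Stand-in copy of the labelling function; `df` and unname() are assumptions
eqlabels <- function(df) {
  m <- lm(Sepal.Length ~ Sepal.Width, data = df)
  eq <- substitute(italic(y) == a + b * italic(x) * "," ~~ italic(r)^2 ~ "=" ~ r2,
                   list(a = format(unname(coef(m)[1]), digits = 3),
                        b = format(unname(coef(m)[2]), digits = 3),
                        r2 = format(summary(m)$r.squared, digits = 2)))
  as.character(as.expression(eq))
}

# One label per species, kept next to the faceting variable
labelsP3 <- iris %>%
  group_by(Species) %>%
  do(data.frame(label = eqlabels(.), stringsAsFactors = FALSE))

plot3 <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length)) +
  geom_point(colour = "grey60") +
  facet_grid(Species ~ .) +
  stat_smooth(method = "lm") +
  # Species in labelsP3 routes each label to its own panel;
  # -Inf/Inf pin the text to the top-left corner of each panel
  geom_text(data = labelsP3,
            aes(x = -Inf, y = Inf, label = label),
            hjust = -0.05, vjust = 1.5,
            parse = TRUE, inherit.aes = FALSE)
```

Printing `plot3` should then show one equation per facet; the `hjust` and `vjust` values are just starting points and may need adjusting.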
Thank you.