Display regression equation and R^2 for each scatter plot when using facet_wrap

Question

I have a data.frame (which I melted using the melt function), from which I produce multiple scatter plots and fit a regression line using the following:

ggplot(dat, aes(id, value)) + geom_point() + geom_smooth(method="lm", se=FALSE) + facet_wrap(variable~var1, scales="free")

I would like to add the regression equation and the R^2 in each of these scatter plots for the relevant regression (i.e. the one produced by geom_smooth in each scatter plot).

var1 above is just the name of one of the id columns of the melted data, and I face the same question with facet_grid instead of facet_wrap.

Yes, but I am not able to generalize it so the multiple scatter plots... — StephQ
Use ddply and the function from Ramnath's answer in that other question to create a data frame with both your faceting variables, x and y variables (locations for eqn in each panel) and a character variable for the eqn itself. Then just pass that data frame to geom_text. — joran

StephQ · Accepted Answer · 2012-03-23T11:50:31

I actually solved this. Below is a worked example in which the dependent variable is var1. The data was a time series, so ignore the date handling if it is not relevant to your problem.

library(plyr)      # ddply()
library(reshape2)  # melt(); the older reshape package also provides it
library(ggplot2)

rm(dat)
dat <- read.table("data.txt", header = TRUE, sep = ",")
dat <- transform(dat, date = as.POSIXct(strptime(date, "%Y-%m-%dT%H:%M:%OS")))

rm(dat.m)
dat.m <- melt(dat, id = c('ccy','date','var1'))

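# Fit var1 ~ value on one subset and return a plotmath string
# with the fitted equation and R^2.
lm_eqn <- function(df) {
  m <- lm(var1 ~ value, df)
  # unname() drops the coefficient names; with them, the label can deparse as c(...)
  eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2,
                   list(a = format(unname(coef(m)[1]), digits = 2),
                        b = format(unname(coef(m)[2]), digits = 2),
                        r2 = format(summary(m)$r.squared, digits = 3)))
  as.character(as.expression(eq))
}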

mymax = function(df){
  max(df$value)
}

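rm(regs)
# One label per panel (ccy x variable); ddply stores the returned string in column V1.
regs <- ddply(dat.m, .(ccy, variable), lm_eqn)
# x position: midpoint of value for each variable (one per facet column).
regs.xpos <- ddply(dat.m, .(variable), function(df) (min(df$value) + max(df$value)) / 2)
# y position: 5% above the minimum of var1 within each panel.
regs.ypos <- ddply(dat.m, .(ccy, variable), function(df) min(df$var1) + 0.05 * (max(df$var1) - min(df$var1)))

regs$y <- regs.ypos$V1
# regs.xpos has one row per variable and is recycled across ccy. This assumes regs
# is sorted by ccy, then variable, and that every combination is present.
regs$x <- regs.xpos$V1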

rm(gp)
gp <- ggplot(data = dat.m, aes(value, var1)) +
  geom_point(size = 1, alpha = 0.75) +
  geom_smooth() +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  geom_text(data = regs, size = 3, color = "red", aes(x = x, y = y, label = V1), parse = TRUE) +
  facet_grid(ccy ~ variable, scales = "free")
ggsave("data.png", gp, scale=1.5, width=11, height=8)
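
Since data.txt is not available to other readers, here is a minimal self-contained sketch of the same idea using the built-in mtcars data and facet_wrap (the case the question asked about). The names lm_label, labels and the columns x, y and panel are made up for this illustration and are not from the code above. It builds the per-panel label data frame with base R rather than ddply, so treat it as one possible variant, not the only way to do it.

```r
library(ggplot2)

# Hypothetical helper: plotmath string with equation and R^2 for one panel.
# Assumes the data frame has columns named x and y.
lm_label <- function(df) {
  m <- lm(y ~ x, data = df)
  eq <- substitute(
    italic(y) == a + b %.% italic(x) * "," ~ ~italic(r)^2 ~ "=" ~ r2,
    list(a  = format(unname(coef(m)[1]), digits = 2),
         b  = format(unname(coef(m)[2]), digits = 2),
         r2 = format(summary(m)$r.squared, digits = 3))
  )
  as.character(as.expression(eq))
}

# Illustration data: mpg against wt, one panel per number of cylinders.
dat <- data.frame(x = mtcars$wt, y = mtcars$mpg, panel = factor(mtcars$cyl))

# One row per panel: the label plus a position inside that panel's own range.
labels <- do.call(rbind, lapply(split(dat, dat$panel), function(df) {
  data.frame(
    panel = df$panel[1],
    x     = mean(range(df$x)),                     # horizontal midpoint
    y     = min(df$y) + 0.05 * diff(range(df$y)),  # 5% above the bottom
    label = lm_label(df),
    stringsAsFactors = FALSE
  )
}))

p <- ggplot(dat, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
  geom_text(data = labels, aes(x = x, y = y, label = label),
            parse = TRUE, size = 3) +
  facet_wrap(~ panel, scales = "free")
```

Call print(p) or ggsave() to draw it. Because the label data frame carries the faceting column (panel here), geom_text places each label in the matching panel; the same data frame works with facet_grid as long as it contains every faceting variable.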
