
I was wondering whether there is a "direct" way to link the slope of a regression line in a ggplot facet panel to the background colour of that panel (i.e. to visually separate positive slopes from negative slopes in a large grid).

I understand how to add a regression line in ggplot2, as was well explained in "Adding a regression line to a facet_grid with qplot in R".

I also understand how to change the background if you have previously added this information to the original data frame, as explained in "Conditionally change panel background with facet_grid?"

However, is there a way to do this "in the geom_rect" call itself, without having to e.g. run the regressions separately, bind the results to the original data frame, and then use them as a variable for geom_rect()? Is there a way for geom_rect() to use the information from stat_smooth()?

Wouter

A good example of a simple regression line plot, from the earlier question:

library(ggplot2)
x <- rnorm(100)
y <-  + .7*x + rnorm(100)
f1 <- as.factor(c(rep("A",50),rep("B",50)))
f2 <- as.factor(rep(c(rep("C",25),rep("D",25)),2))
df <- data.frame(cbind(x,y))
df$f1 <- f1
df$f2 <- f2

ggplot(df,aes(x=x,y=y))+geom_point()+facet_grid(f1~f2)+stat_smooth(method="lm",se=FALSE)
One always invites trouble when saying this, but I'm going to say no, you can't do what you describe. The way to achieve this effect is to add the slopes as a variable and then map that variable to the background colour in geom_rect. - joran
One issue with attempting this without modelling outside ggplot is that you don't know if the slopes are actually different from zero in any real sense (excepting the trivial nominal one). You run a risk of colouring based mainly on statistical noise, as in the example below. - MattBagg
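The approach described in these two comments might look like the following sketch: fit each facet's regression outside ggplot, keep the slope and its p-value, and map the sign of the slope to the fill of geom_rect. The names tp, slope, p_value and direction are chosen here for illustration, and the data simply reuse the simulation from the answer below.

```r
library(ggplot2)

set.seed(12)
df <- data.frame(x = rnorm(100))
df$y <- 0.1 * df$x + rnorm(100)
df$f1 <- factor(rep(c("A", "B"), each = 50))
df$f2 <- factor(rep(rep(c("C", "D"), each = 25), times = 2))

# Fit y ~ x separately within each f1/f2 combination (one row per facet)
fits <- lapply(split(df, list(df$f1, df$f2), drop = TRUE), function(d) {
  s <- summary(lm(y ~ x, data = d))$coefficients
  data.frame(f1 = d$f1[1], f2 = d$f2[1],
             slope = s["x", "Estimate"],
             p_value = s["x", "Pr(>|t|)"])
})
tp <- do.call(rbind, fits)

# Illustrative variable: sign of the slope, used for the panel fill
tp$direction <- factor(ifelse(tp$slope > 0, "positive", "negative"))

g <- ggplot(df, aes(x = x, y = y)) +
  geom_rect(data = tp, aes(fill = direction), inherit.aes = FALSE,
            xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf, alpha = 0.5) +
  geom_point() +
  facet_grid(f1 ~ f2) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)
g
```

Because p_value is kept in tp, the fill could instead be restricted to panels whose slope passes whatever significance threshold you consider appropriate.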

1 Answer


This is not exactly a solution, but a work-around. It seems to have come out well, though. Each of the posts you linked to has one part of the solution. James' solution here tells you how to extract the fitted values from stat_smooth. Joran's solution here tells you how to use geom_rect to fill the background.

# generating data: set.seed is used for reproducibility
# I also changed the multiplication constant to 0.1 to have
# at least one negative slope.

library(ggplot2)
set.seed(12)
x <- rnorm(100)
y <-  + .1*x + rnorm(100)
f1 <- as.factor(c(rep("A",50),rep("B",50)))
f2 <- as.factor(rep(c(rep("C",25),rep("D",25)),2))
df <- data.frame(cbind(x,y))
df$f1 <- f1
df$f2 <- f2

# first generate your plot in this manner and run it
# from James' post, the part outfit=fit<<-..y.. will store 
# the output of fitted values in "fit"

g <- ggplot(df,aes(x=x,y=y)) + geom_point()+facet_grid(f1~f2) 
g <- g + stat_smooth(aes(outfit=fit<<-..y..), method="lm",se=FALSE)
# now run g to generate "fit"
g

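# now extract the sign of the slope for each facet and
# construct the data.frame for geom_rect (as per Joran's post)
# Edit: Just to add more info about "fit". By default it contains
# 80 fitted values per facet, hence 80*4 = 320 values in total.
# The difference between the first two fitted values in a facet is
# not the slope itself, but it has the same sign as the slope.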

slopes <- fit[seq(2, 320, by = 80)] - fit[seq(1, 320, by = 80)]
tp <- unique(df[, c('f1', 'f2')])
tp <- transform(tp, slopes=slopes, x=1, y=1)
tp$pos_neg <- ifelse(slopes > 0, 1, 0)
tp$pos_neg <- factor(tp$pos_neg)

# now plot again (but with geom_rect)
g <- ggplot(df,aes(x=x,y=y)) 
g <- g + geom_rect(data = tp, aes(fill = pos_neg), xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf, alpha = 0.5) 
g <- g + geom_point() + facet_grid(f1~f2) + stat_smooth(method = "lm",se = FALSE)
g

The output looks like this. I'm not sure if this is what you expect, though. Strictly speaking, you do calculate the fitted values twice, but both times you calculate them implicitly with stat_smooth. Like I said, it's just a work-around.
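As a variation on the same idea, the fitted values can be read from the built plot instead of being captured through a global assignment inside aes(). The sketch below assumes that stat_smooth is the second layer of the plot and that the built layout exposes a PANEL column alongside the facet variables; these are ggplot2 internals and may differ between versions. The names built, fit, slope_by_panel and tp are illustrative.

```r
library(ggplot2)

set.seed(12)
df <- data.frame(x = rnorm(100))
df$y <- 0.1 * df$x + rnorm(100)
df$f1 <- factor(rep(c("A", "B"), each = 50))
df$f2 <- factor(rep(rep(c("C", "D"), each = 25), times = 2))

g <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  facet_grid(f1 ~ f2) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Fitted line of layer 2 (stat_smooth): columns include x, y and PANEL
fit <- layer_data(g, 2)

# Slope per panel from the first and last fitted points of each line
slope_by_panel <- sapply(split(fit, fit$PANEL), function(d) {
  n <- nrow(d)
  (d$y[n] - d$y[1]) / (d$x[n] - d$x[1])
})

# Assumption: the built layout maps each PANEL to its f1/f2 values
built <- ggplot_build(g)
panel_map <- built$layout$layout
tp <- data.frame(f1 = panel_map$f1, f2 = panel_map$f2)
tp$slope <- unname(slope_by_panel[as.character(panel_map$PANEL)])
tp$pos_neg <- factor(ifelse(tp$slope > 0, 1, 0))

# Draw the background first, then the original layers on top
g2 <- ggplot(df, aes(x = x, y = y)) +
  geom_rect(data = tp, aes(fill = pos_neg), inherit.aes = FALSE,
            xmin = -Inf, xmax = Inf, ymin = -Inf, ymax = Inf, alpha = 0.5) +
  geom_point() +
  facet_grid(f1 ~ f2) +
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE)
g2
```

Looking up slopes by PANEL avoids relying on the order in which the fitted values happen to be stored.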