10
votes

I am a big fan of facet_wrap. Though it is fast to split a big data frame and plot several plots and explore within R, it's not always the best tool for presenting in paper or in slides. I find myself wasting a lot of time with scales, binwidths and font sizes and ultimately modifying on inkscape the plot.

Sometimes I just subset my data frame into many data frames and plot individually for each one. Later join them with multiplot or by hand.

I was wondering if there might be a way of making the ggplot call almost in the same way (one big df with a factor column used for faceting) or a way to make ggplot read from something with list-like data frame separated by my faceting factor. The ideal output should be multiple single plots which I'll edit later on inkscape (and use free_y scales to make it less painful)

To be clear,

df<-mtcars
ggplot(df,aes(df$mpg,df$disp,color=factor(cyl)))+
    geom_point(aes(df$mpg,df$disp))+
    facet_wrap( ~cyl)

Produces one plot. My desired output in this case would be three plots, one for each facet.

1

1 Answers

8
votes

You can use lapply to create a list with one plot for each value of cyl:

# Create a separate plot for each value of cyl, and store each plot in a list
p.list = lapply(sort(unique(mtcars$cyl)), function(i) {
  ggplot(mtcars[mtcars$cyl==i,], aes(mpg, disp, colour=factor(cyl))) +
    geom_point(show.legend=FALSE) +
    facet_wrap(~cyl) +
    scale_colour_manual(values=hcl(seq(15,365,length.out=4)[match(i, sort(unique(mtcars$cyl)))], 100, 65))
})

The complicated scale_colour_manual argument is a way to color the point markers the same way they would be colored if all the values of cyl were included in a single call to ggplot.

UPDATE: To address your comments, how about this:

# Fake data
set.seed(15)
dat = data.frame(group=rep(c("A","B","C"), each=100),
                 value=c(mapply(rnorm, 100, c(5,10,20), c(1,3,5))))

p.list = lapply(sort(unique(dat$group)), function(i) {
  ggplot(dat[dat$group==i,], aes(value, fill=group)) +
    geom_histogram(show.legend=FALSE, colour="grey20", binwidth=1) +
    facet_wrap(~group) +
    scale_fill_manual(values=hcl(seq(15,365,length.out=4)[match(i, sort(unique(dat$group)))], 100, 65)) +
    scale_x_continuous(limits=range(dat$value)) +
    theme_gray(base_size=15)
})

The result is below. Note that the code above gives you the same x-scale on all three graphs, but not the same y-scale. To get the same y-scale, you can either hard-code it as, say, scale_y_continuous(limits = c(0,35)), or you can find the maximum count programmatically for whatever binwidth you set, and then feed that to scale_y_continuous.

# Arrange all three plots together
library(gridExtra)
do.call(grid.arrange, c(p.list, nrow=3))

enter image description here