I have a data frame in R in which several of the columns are factors. I'd like to create a series of bar charts showing the relative sizes of each of the factor levels. I want to associate my own customized color palettes to each of the factors, and then customize the final layout of all of the bars and legends using the gridExtra package.
I wrote an example script that I think should achieve that; however, I obtained a rather surprising result:
library(ggplot2)
library(grDevices)
library(gridExtra)

# Define some dummy data and put it in a data frame
fruit <- factor(c("apple", "orange", "pear", "pear", "pear",
                  "orange", "apple", "apple", "apple", "pear"))
cheese <- factor(c("cheddar", "mozarella", "gruyere", "gruyere", "gouda",
                   "parmesan", "gruyere", "gouda", "mozarella", "cheddar"))
mydata <- data.frame(fruit, cheese)
mydata$dummy <- 0

# Define some custom color schemes
foodclrs <- list()

# Plot the fruit factor in shades of red
h <- c(0.0, 0.0, 0.0)
s <- c(0.95, 0.85, 0.45)
v <- c(0.45, 0.85, 0.95)
foodclrs[[1]] <- hsv(h, s, v)

# Plot the cheese factor in shades of green
h <- c(0.33, 0.33, 0.33, 0.33, 0.33)
s <- c(0.95, 0.93, 0.85, 0.69, 0.45)
v <- c(0.45, 0.69, 0.85, 0.93, 0.95)
foodclrs[[2]] <- hsv(h, s, v)

# Create vectors with individualized text for each plot
bsiz <- 20
fillvars <- c("fruit", "cheese")
xlabels <- c("Fruits", "Cheeses")
lgdlabels <- c("Types of Fruit", "Types of Cheese")

# Generate a list of plots
plots <- list()
for (ii in 1:2) {
    plots[[ii]] <- ggplot(data=mydata) +
        geom_bar(aes_string(x="dummy", fill=fillvars[ii]),
                 position=position_stack(reverse=TRUE)) +
        scale_fill_manual(values=foodclrs[[ii]], drop=FALSE) +
        theme_bw(base_size=bsiz) +
        labs(x=xlabels[ii], y="") +
        theme(axis.ticks.y=element_blank(),
              axis.text.y=element_blank()) +
        guides(fill=guide_legend(title=lgdlabels[ii])) +
        coord_flip()
    # print(plots[[ii]])
}

# Print the plots on my own custom-shaped grid
print(grid.arrange(plots[[1]], plots[[2]], ncol=1, nrow=2))
The output of the script looks like this:
This is not what I was expecting: the color palette for the upper bar chart should have been a range of shades of red. It seems that, although I originally defined the plot object plots[[1]] to have a red color palette associated with it, when I actually went to print it, either R or ggplot2 (I'm not sure which) decided to use the most recent color palette instead; i.e., the one associated with plots[[2]].
Now here's the weird part. If I uncomment the print statement in the for loop, I get two individual plots which are rendered in the correct color scheme (for brevity, I do not bother to include either of them here), and, even more interestingly, the combined bar chart object inside the grid.arrange() function now also displays the correct color scheme:
While I am happy to have stumbled across this little workaround, now I'm curious: why does it even work in the first place?
That is to say, how is it that calling the "print" statement just at the correct moment inside of the for loop causes a color palette to become permanently attached to each ggplot object, when otherwise it would not?
What's really going on here, "underneath the hood", so to speak? And also, is there a less kludgy way that I could correct the problem? For example, is there some other function that I could call instead of print(), to get the color palette to attach correctly to each plot object, without creating a bunch of individual "dummy" plots that I don't actually need?
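One plausible explanation for the behaviour described above is R's lazy evaluation of function arguments. The sketch below uses base R only and assumes (this is not confirmed by the post) that the installed ggplot2 version keeps the values argument of scale_fill_manual() inside a closure without evaluating it until the plot is built. The helpers make_lazy() and make_forced() are hypothetical stand-ins, not ggplot2 functions.

```r
# Hypothetical stand-in for a scale constructor: returns a closure over `values`.
make_lazy <- function(values) {
  function() values          # `values` is still an unevaluated promise here
}

# Same idea, but the argument is evaluated before the closure is returned.
make_forced <- function(values) {
  force(values)
  function() values
}

colours <- c("red", "green")
lazy <- list()
forced <- list()
for (ii in 1:2) {
  lazy[[ii]]   <- make_lazy(colours[ii])
  forced[[ii]] <- make_forced(colours[ii])
}

# The promise `colours[ii]` is only evaluated now, when ii is already 2.
lazy[[1]]()    # "green"
forced[[1]]()  # "red"
```

If that assumption holds, printing each plot inside the loop would help because building the plot evaluates the palette while ii still has the intended value.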
[…] ggplotGrob(). – baptiste
[…] ggplotGrob() function, even more than I like your official answer; it seems simpler and cleaner than encapsulating the ggplot call within a function and using purrr::pmap() to implement the loop. However, I did upvote both of them, since both ways seemed like valid solutions. – stachyra
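The ggplotGrob() idea mentioned in the comments might look roughly like the sketch below. It is a reduced version of the script in the question, with smaller made-up data and hex colours chosen only for illustration; the names palettes and grobs are assumptions, not part of the original post. The idea is that ggplotGrob() builds each plot immediately inside the loop, so nothing is left to be evaluated after ii has changed, and no intermediate plot is drawn.

```r
library(ggplot2)
library(gridExtra)

# Small illustrative data set (not the data from the question)
mydata <- data.frame(
  fruit  = factor(c("apple", "orange", "pear", "pear", "apple")),
  cheese = factor(c("cheddar", "gouda", "gouda", "cheddar", "cheddar")),
  dummy  = 0
)

# One palette per factor: 3 fruit levels, 2 cheese levels
palettes <- list(c("#730606", "#D92121", "#F28585"),
                 c("#067306", "#85F285"))
fillvars <- c("fruit", "cheese")

grobs <- list()
for (ii in 1:2) {
  p <- ggplot(data = mydata) +
    geom_bar(aes_string(x = "dummy", fill = fillvars[ii])) +
    scale_fill_manual(values = palettes[[ii]], drop = FALSE) +
    coord_flip()
  # Build the plot now, while ii still refers to this iteration
  grobs[[ii]] <- ggplotGrob(p)
}

# Arrange the pre-built grobs; grid.arrange() draws them itself
grid.arrange(grobs = grobs, ncol = 1, nrow = 2)
```

Since grid.arrange() already draws to the current device, wrapping it in print() should not be needed here.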