1
votes

Let's say I have the following data set:

set.seed(1)
df <- data.frame(
  Index = 1:10,
  Heat = rnorm(10),
  Cool = rnorm(10),
  Other = rnorm(10),
  a = rnorm(10),
  b = rnorm(10)
)

Now I want to make a line graph of each of the columns against Index. I do this the following way:

library(ggplot2)

df.plot <- ggplot(
  data = tidyr::gather(df, key = 'Variable', value = 'Component', -Index),
  aes(x = Index, y = Component, color = Variable)
) +
  geom_line()

Now I want to change it so that the variables Heat, Cool, and Other are red, blue, and green, respectively. So I tried something like:

set.colors <- c(Heat = 'red', Cool = 'blue', Other = 'green')
df.plot + scale_color_manual(values = set.colors)

The problem is that set.colors doesn't have enough colors (a and b aren't represented). I want ggplot to assign colors to those variables automatically, because in my actual code there's no way of telling how many such columns there will be. So basically I want ggplot to do its normal color assignment, then look for any variables named Heat, Cool, or Other (there's no guarantee that any or all of these three will be present) and change their colors to red, blue, and green, respectively, without changing the colors of any other variable.

It would take a bit of work to hack something like this together. You could write your own function that looks at the number of levels in whatever data column you want, takes out your special levels, mimics ggplot's color assignment for the rest, and tacks your special values back on. - Gregor Thomas
Seems like a recipe for producing some really ugly palettes... - Gregor Thomas
I actually started working on a function that starts with a built-in palette, assigns each level to a color, and then replaces the colors for Heat, Cool, and Other. I just thought there might be a simpler way. - stat_student

2 Answers

1
votes

Mixing your own colors in with a default color palette is a breathtakingly bad idea. Nevertheless, here is one way to do it. It is similar to the other answer but perhaps a bit more general, and it uses ggplot's default color palette for everything else, as you asked.

library(ggplot2)
library(reshape2)
gg.df <- melt(df, id="Index", value.name="Component")
ggp <- ggplot(gg.df, aes(x = Index, y = Component, color = variable)) +
  geom_line()

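lvls <- levels(gg.df$variable)
# default hues are evenly spaced; 375 wraps round to 15, so keep only the first length(lvls)
hues <- seq(15, 375, length.out = length(lvls) + 1)[seq_along(lvls)]
cols <- setNames(hcl(h = hues, l = 65, c = 100), lvls)
# override only the special variables that are actually present
special <- c(Heat = "#FF0000", Cool = "#0000FF", Other = "#00FF00")
present <- intersect(names(special), lvls)
cols[present] <- special[present]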

ggp + scale_color_manual(values=cols)

Edit: Just realized that I never said why this is a bad idea. This post gets into it a bit, and has a few really good references. The main point is that the default colors are chosen for a very good reason, not just to make the plot "look pretty". So you really shouldn't mess with them unless there's an overwhelming need.
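
Since the number of columns isn't known in advance, the same idea can be wrapped in a small function. This is only a rough sketch: mixed_palette is a hypothetical helper name (not part of ggplot2), it uses base R only, and it assumes that ggplot's default discrete colors are evenly spaced hues at chroma 100 and luminance 65, which is what the hcl() call above relies on.

```r
# Hypothetical helper: default-style hues for every level, with overrides
# for whichever "special" names actually occur among the levels.
mixed_palette <- function(lvls, fixed) {
  n <- length(lvls)
  # Evenly spaced hues; 375 wraps round to 15, so drop the last point.
  hues <- seq(15, 375, length.out = n + 1)[seq_len(n)]
  cols <- setNames(grDevices::hcl(h = hues, c = 100, l = 65), lvls)
  # Replace only the special names that are present.
  present <- intersect(names(fixed), lvls)
  cols[present] <- fixed[present]
  cols
}

fixed <- c(Heat = "red", Cool = "blue", Other = "green")
# "Other" is absent here, so it is simply ignored.
pal <- mixed_palette(c("Heat", "Cool", "a", "b"), fixed)
```

With the plot above, this would be used as ggp + scale_color_manual(values = mixed_palette(levels(gg.df$variable), fixed)).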

0
votes

Something like the following might work. First I set up the colour scale:

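plot_data <- tidyr::gather(df, key = 'Variable', value = 'Component', -Index)
# gather() returns a character column by default, so use unique() rather than levels()
vars <- unique(plot_data$Variable)
# name the vector so each colour is matched to its variable regardless of order
colours <- setNames(character(length(vars)), vars)
colours[vars=="Heat"] <- "red"
colours[vars=="Cool"] <- "blue"
colours[vars=="Other"] <- "green"
other_colours <- c("orange", "purple", "brown", "gold")
others <- !(vars %in% c("Heat", "Cool", "Other"))
# seq_len() is safe when there are no other variables
colours[others] <- other_colours[seq_len(sum(others))]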

The idea is to manually assign your desired colours first, and then assign colours from some list to the other elements. If you need more colours for other_colours, you can get a complete list of named colours using colours().

Then the plot is produced by:

ggplot(plot_data, aes(Index, Component, colour = Variable)) +
  geom_line() +
  scale_colour_manual(values = colours)

I don't think that it is possible to use scale_colour_manual and still let ggplot pick some colours automatically.
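
If the number of remaining variables isn't known in advance, a fixed list of four fallback colours will eventually run out. One possible workaround, as a sketch only, is to generate as many fallback colours as needed with grDevices::hcl.colors(). This assumes R 3.6 or later; the "Dark 3" palette is just one choice, and the variable names below are made up for illustration. Some generated colours may land close to the reserved red, blue, or green.

```r
# Illustrative variable names; in practice use unique(plot_data$Variable).
all_vars <- c("Heat", "Cool", "a", "b", "c", "d", "e")
special <- c(Heat = "red", Cool = "blue", Other = "green")

# Everything that is not one of the special names gets a generated colour.
rest <- setdiff(all_vars, names(special))
rest_colours <- setNames(
  grDevices::hcl.colors(length(rest), palette = "Dark 3"),
  rest
)

# Keep only the special names that are present, then append the rest.
all_colours <- c(special[intersect(names(special), all_vars)], rest_colours)
```

The named vector can then be passed to scale_colour_manual(values = all_colours) in the same way as colours above.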