I'm currently writing a function to make a certain plot. The plot compares group-specific relative mean deviations (or effect sizes) to a grand sample mean and shows the overall mean deviation per group as horizontal lines. I have already worked out how to colour the points by group, but I can't get the lines to take their group-specific colour as well.

Here's some example data:

a <- c(runif(10))
names(a) <- c(paste0("v", 1:10))
b <- c(runif(10))
names(b) <- c(paste0("v", 1:10))
c <- c(runif(10))
names(c) <- c(paste0("v", 1:10))

dev_list <- list(a,b,c)
names(dev_list) <- c("group1", "group2", "group3")

First I melt the data:

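library(reshape2)
# One row per group, one column per variable, plus the group label
df_aux <- data.frame(matrix(unlist(dev_list), nrow = length(dev_list), byrow = TRUE), class = names(dev_list))
colnames(df_aux) <- c(paste0("v", 1:10), "class")
# Name the id column explicitly instead of letting melt() guess it
df_melt <- melt(df_aux, id.vars = "class")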

I then create a data frame for the cutoff lines:

ml_list <- mapply(function(A, B) {
  list(data.frame( x = c(-Inf, Inf), y = mean(A), cutoff = factor(paste(B, "Mean Deviation", sep = " "))))
}, dev_list, c(names(dev_list)))
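
For comparison, the same per-group means can be collected in one data frame rather than a list of data frames. This is only a minimal sketch in base R: the name `mean_df` and the fixed seed are assumptions added for illustration, and the example data is rebuilt here so the block runs on its own.

```r
set.seed(1)  # assumption: seed added only so the example is reproducible

# Same shape as the example data: three named vectors of ten values each
dev_list <- list(
  group1 = setNames(runif(10), paste0("v", 1:10)),
  group2 = setNames(runif(10), paste0("v", 1:10)),
  group3 = setNames(runif(10), paste0("v", 1:10))
)

# One row per group: the group label, its mean, and the legend text
# (`mean_df` is a hypothetical name, not part of the original code)
mean_df <- data.frame(
  class  = names(dev_list),
  y      = vapply(dev_list, mean, numeric(1)),
  cutoff = paste(names(dev_list), "Mean Deviation"),
  row.names = NULL,
  stringsAsFactors = FALSE
)

print(mean_df)
```

Keeping a `class` column here means the lines can later be mapped to the same colour variable as the points.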

Then I plot the data:

library(ggplot2)
p <- ggplot(df_melt, aes(variable, value)) + geom_point(aes(colour = class))

So far, so good, but when I now add the lines, I don't know how to refer to the class variable again to define the colour of the lines. My code so far:

p <- p + lapply(ml_list, function(z) {geom_line(aes(x,y, linetype = cutoff), z)})

The plot p afterwards looks like this:

Link to the plot as apparently I cannot include it in the post yet

I've already tried putting the same colour = class inside the aes of the geom_line, but it didn't work. I've also tried some of the techniques explained in "How to assign colors to categorical variables in ggplot2 that have stable mapping?", but I couldn't get any of them to work.

Can anyone help me with the correct colour specification, so that my last line of code uses the colours of the class-variable in the same way as the plot?

Thanks in advance!

Edit: If anyone else ever needs this: for Brandon's solution to work, you need to include the colour-defining variable in ml_list above.

New code line:

ml_list <- mapply(function(A, B) {
    list(data.frame( x = c(-Inf, Inf), y = mean(A), cutoff = factor(paste(B, "Mean Deviation", sep = " ")), class = factor(paste0(B))))
    }, dev_list, c(names(dev_list)))

1 Answer

I think the trick is to keep everything in a single data.frame. Adding the layers individually is a bit painful, as you would also have to set the colours in each iteration of the lapply call.

ml_list <- unique(do.call(rbind, ml_list)[-1])

p <- ggplot(df_melt, aes(variable, value)) + geom_point(aes(colour = class))
p + geom_hline(aes(yintercept = y, linetype = cutoff, color = cutoff), data = ml_list) +
    scale_color_discrete(breaks = unique(df_melt$class))

Setting the breaks in scale_color_discrete() restricts the colour legend to the group names, which is, I think, the presentation needed for this plot.
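
If the lines should take exactly the same colours as the points, one option, following the asker's edit, is to map the line colour to class rather than cutoff. The sketch below is self-contained and hedged: `df_long` and `mean_df` are hypothetical names, the seed is an assumption added for reproducibility, and the long-format data is built by hand so that reshape2 is not needed.

```r
library(ggplot2)

set.seed(42)  # assumption: seed added only for reproducibility

dev_list <- list(
  group1 = runif(10),
  group2 = runif(10),
  group3 = runif(10)
)

# Long-format points: one row per group/variable pair
df_long <- data.frame(
  class    = rep(names(dev_list), each = 10),
  variable = factor(rep(paste0("v", 1:10), times = 3),
                    levels = paste0("v", 1:10)),
  value    = unlist(dev_list, use.names = FALSE),
  stringsAsFactors = FALSE
)

# One row per group; `class` carries the same values as in df_long,
# so both layers can share one colour scale
mean_df <- data.frame(
  class  = names(dev_list),
  y      = vapply(dev_list, mean, numeric(1)),
  cutoff = paste(names(dev_list), "Mean Deviation"),
  row.names = NULL,
  stringsAsFactors = FALSE
)

p <- ggplot(df_long, aes(variable, value)) +
  geom_point(aes(colour = class)) +
  geom_hline(aes(yintercept = y, colour = class, linetype = cutoff),
             data = mean_df)

print(p)
```

Because both layers map colour to the same variable, ggplot2 builds a single colour scale with one entry per group, so no extra breaks setting should be needed for the colour legend.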