5
votes

I want to plot two series, 'Pos' and 'Neg' (y values), from a data frame. The x values are in the 'Mean' column. I want the two series to have different colours.

Searching Stack Overflow turned up a similar question, "change color for two geom_point() in ggplot2", but I want to use aes_string in order to avoid notes when checking the package.

I can get it to work using aes and 'automatic' colours, as in the first example below. However, I can't figure out how to produce the same plot using aes_string while still letting ggplot decide the colours. I feel that this should be a simple thing...

A reproducible example:

library(ggplot2)

exData <- data.frame(Marker = rep("TH01", 10),
                 Mean = 1:10,
                 Neg = -1*runif(10,0.1,1),
                 Pos = runif(10,0.1,1))

# Produce the correct plot, with 'automatic' colours.
gp <- ggplot(exData, aes_string(x="Mean"))
gp <- gp + geom_point(aes(y=Pos, colour="Max"))
gp <- gp + geom_point(aes(y=Neg, colour="Min"))
gp <- gp + scale_colour_discrete(name = "Legend")
print(gp)

# Produce the correct plot, but not with 'automatic' colours.
gp <- ggplot(exData, aes_string(x="Mean"))
gp <- gp + geom_point(aes_string(y="Pos"), colour=1)
gp <- gp + geom_point(aes_string(y="Neg"), colour=2)
gp <- gp + scale_colour_discrete(name = "Legend")
print(gp)
Sorry, made a last change, but forgot one... now fixed. (Oskar Hansson)

2 Answers

5
votes

The way your data is formatted is not ideal for ggplot2. Convert it to the "long" format first:

library(reshape2)
exData.m <- melt(exData, id.vars=c("Marker", "Mean"))

ggplot(exData.m, aes(x=Mean, y=value, color=variable)) + geom_point()

[Image: result of plot]

As a rule of thumb, each aesthetic (x, y, color, shape, alpha, ...) requires a column in the data frame to be plotted. The reshape2 library is helpful here.
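
Since the question asks for aes_string specifically, the same long-format plot can be sketched with column names passed as strings. This is only an illustration, not part of the original answer: the fixed seed is an assumption added for reproducibility, and aes_string is soft-deprecated in recent ggplot2 releases, so it may emit a deprecation warning there.

```r
library(ggplot2)
library(reshape2)

set.seed(1)  # assumption: fixed seed so the example is repeatable

# Example data from the question
exData <- data.frame(Marker = rep("TH01", 10),
                     Mean = 1:10,
                     Neg = -1 * runif(10, 0.1, 1),
                     Pos = runif(10, 0.1, 1))

# Long format: one row per (Mean, series) pair, with the series name in 'variable'
exData.m <- melt(exData, id.vars = c("Marker", "Mean"))

# Every aesthetic is given as a string, so no bare column names appear in the code
gp <- ggplot(exData.m, aes_string(x = "Mean", y = "value", colour = "variable")) +
  geom_point() +
  scale_colour_discrete(name = "Legend")
print(gp)
```

Because colour is mapped to the 'variable' column, ggplot picks the colours and builds the legend itself.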

0
votes

To answer your question directly, in your first plot ggplot assembles all the color designations ("Max" and "Min" here) and treats them as a factor. Then it uses the default color palette, which is described beautifully in the response to this question, and also in the Cookbook for R. So "Max" is treated as the first color in a default palette that has two colors.
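
As a rough illustration of that default palette, the sketch below generates evenly spaced hues with base R's hcl(). The helper name default_hues is hypothetical, and the hue range and chroma/luminance values are assumptions taken from the commonly cited description of ggplot2's default discrete scale, so treat it as an approximation rather than ggplot2's own code.

```r
# Hypothetical helper: n evenly spaced hues around the colour wheel,
# assumed to approximate ggplot2's default discrete colour scale.
default_hues <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  hcl(h = hues, c = 100, l = 65)[seq_len(n)]
}

# Two colours, as for the two labels "Max" and "Min"
cols <- default_hues(2)
print(cols)

# Show them in the order they would be assigned
barplot(c(1, 1), col = cols, names.arg = c("Max", "Min"))
```

Character labels are ordered alphabetically on a discrete scale, which is why "Max" gets the first colour and "Min" the second.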

In the second plot, you are setting the colour to integers outside aes. In that case no colour scale is involved, and the integers index R's default colour palette, which can be seen as follows:

y <- 1:6
barplot(y,col=y)

To get the default colors with aes_string(...), use the method described by @krlmlr.
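
If the wide data frame has to be kept as it is, one possible alternative is to pass the legend labels to aes_string as quoted string constants. aes_string parses its arguments as R expressions, so '"Max"' is read as the constant "Max" rather than as a column name. This is a sketch under that assumption, not the method from either answer; the seed is added for reproducibility, and aes_string may warn about deprecation in recent ggplot2 releases.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the example is repeatable

# Example data from the question (wide format, left unchanged)
exData <- data.frame(Marker = rep("TH01", 10),
                     Mean = 1:10,
                     Neg = -1 * runif(10, 0.1, 1),
                     Pos = runif(10, 0.1, 1))

gp <- ggplot(exData, aes_string(x = "Mean")) +
  # The inner double quotes make "Max" and "Min" constants, not column names
  geom_point(aes_string(y = "Pos", colour = '"Max"')) +
  geom_point(aes_string(y = "Neg", colour = '"Min"')) +
  scale_colour_discrete(name = "Legend")
print(gp)
```

The long-format approach is usually the cleaner choice; this variant only mirrors the first example from the question with every aesthetic written as a string.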