15
votes

I have a dataset called "merged", which contains 3 numeric columns: "pauseMedian", "numTotalPauses", and "diff". I also have a splineHull dataset, which contains the numeric columns "pauseMedian" and "numTotalPauses", plus a 6-level factor "microstyle".

I have the following code, which works perfectly. It draws a scatter plot and then overlays it with splineHull polygons colored according to "microstyle".

script 1:

ggplot(data = merged, aes(x = pauseMedian, y = numTotalPauses)) +
  geom_point() +
  geom_polygon(data = splineHull,
               mapping = aes(x = pauseMedian,
                             y = numTotalPauses,
                             group = microstyle,
                             color = microstyle),
               alpha = 0)

Then, I also want to color the points in the scatter plot by adding just one aesthetic, color = diff.

script 2:

ggplot(data = merged, aes(x = pauseMedian, y = numTotalPauses, color = diff)) +
  geom_point() +
  geom_polygon(data = splineHull,
               mapping = aes(x = pauseMedian,
                             y = numTotalPauses,
                             group = microstyle,
                             color = microstyle),
               alpha = 0)

I see the following error:

Error: Discrete value supplied to continuous scale

I don't know why I see this error. If I keep the colored scatter plot but drop the polygons, the following code works again.

script 3:

ggplot(data = merged, aes(x = pauseMedian, y = numTotalPauses, color = diff)) +
  geom_point()

So, what happened with script 2, where is the error from, and how can I make it work?

3
That does seem weird - it's hard to pinpoint the error without seeing example data. Also, have you tried moving color=diff to geom_point(aes(color=diff))? – Señor O
@SeñorO Hi, yes I tried that. It gave the same error. In script 2, there are two color attributes, one in the ggplot aes, the other in the geom_polygon aes. The former is assigned a numeric value "diff", the latter is assigned a factor value "microstyle". I guess, maybe ggplot cannot handle a numeric color and a factor color at the same time? – nan
Actually that may be correct now that I think about it - because it needs to make a legend for the color. Try using fill = microstyle for the polygon. – Señor O
You don't need to post your whole dataset. Just post a small sample that will allow us to reproduce the problem using your data. For example, post the output of dput(merged[sample(1:nrow(merged),20),]). That will give 20 randomly selected rows of your data (do the same for splineHull). – eipi10
Using some fake data that (I hope) is similar to your data, I was able to get the same error message. However, when I reverse the order of geom_polygon and geom_point, I get Error: Continuous value supplied to discrete scale. There seems to be a conflict between the color scales of the two geoms, one being discrete, the other being continuous, but I'm not sure why that's happening. I would have thought having two separate geoms would result in two separate colour scales. – eipi10

3 Answers

18
votes

Evidently, you can't have different color aesthetics for two different geoms: a plot has a single colour scale shared by all layers, so it can't be continuous for one geom and discrete for another. As a workaround, use a fill aesthetic for the points instead. This means you have to use a point marker style that has a filled interior, i.e. shapes 21 to 25 (see ?pch and scroll down for the available point styles). Here's a way to do that:

ggplot() + 
  geom_point(data=merged,aes(x = pauseMedian, y = numTotalPauses, fill = diff),
             pch=21, size=5, colour=NA) +
  geom_polygon(data = splineHull, 
               mapping=aes(x=pauseMedian, 
                           y=numTotalPauses, 
                           colour = microstyle),
               alpha=0) 

Adding colour=NA (outside of aes()) gets rid of the default black border around the point markers. If you want a colored border around the points, just change colour=NA to whatever colour you prefer.
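The original `merged` and `splineHull` data were never posted, so here is a minimal, self-contained sketch of the same workaround on made-up data. The data frames `pts` and `hull` and all their values are hypothetical stand-ins, and a fixed grey border is used on the points so the example does not depend on how `colour = NA` is handled.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for `merged`: numeric x, y and a numeric `diff`
pts <- data.frame(pauseMedian    = runif(50, 0, 10),
                  numTotalPauses = runif(50, 0, 10),
                  diff           = rnorm(50))

# Hypothetical stand-in for `splineHull`: one triangle per microstyle level
hull <- data.frame(pauseMedian    = c(1, 5, 3, 6, 9, 8),
                   numTotalPauses = c(1, 2, 6, 5, 6, 9),
                   microstyle     = factor(rep(c("A", "B"), each = 3)))

p <- ggplot() +
  # continuous variable goes on the fill scale (shape 21 has a fillable interior)
  geom_point(data = pts,
             aes(x = pauseMedian, y = numTotalPauses, fill = diff),
             shape = 21, size = 5, colour = "grey30") +
  # discrete variable goes on the colour scale (polygon outlines only)
  geom_polygon(data = hull,
               aes(x = pauseMedian, y = numTotalPauses,
                   group = microstyle, colour = microstyle),
               alpha = 0)

print(p)  # scales are trained when the plot is built, so a clash would surface here
```

Because `diff` and `microstyle` now live on different scales (fill and colour), each gets its own legend and neither scale has to be both continuous and discrete.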

Also see this thread from the ggplot2 Google group, discussing a similar problem and some workarounds.

4
votes

Now that we know the two color vars are of different types, there's the issue. You can try using a different scale for one (e.g. fill instead of color)

set.seed(123)
my_df1 <- data.frame(a=rnorm(100), b=runif(100), c=rep(1:10, 10))
my_df2 <- data.frame(a=rnorm(100), b=runif(100), c=factor(rep(LETTERS[1:5], 20)))
 
# this won't work. can't assign discrete and continuous to same scale
ggplot() +
  geom_point(data=my_df1, aes(x=a, y=b, color=c)) +
  geom_polygon(data=my_df2, aes(x=a, y=b, color=c), alpha=0.5)

Error: Discrete value supplied to continuous scale

# but use fill for polygons, and that works:
ggplot() +
  geom_point(data=my_df1, aes(x=a, y=b, color=c)) +
  geom_polygon(data=my_df2, aes(x=a, y=b, fill=c), alpha=0.5)


If you have to use the same scale (color), and can't convert the variables to the same type, see this for more info: Plotting continuous and discrete series in ggplot with facet
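If converting the variables to the same type is an option after all, one possible approach is to bin the numeric variable into a factor so that both layers map a discrete variable to color. This is only a sketch: the column name `c_binned`, the break points, and the labels are arbitrary choices for illustration, not something from the question.

```r
library(ggplot2)

set.seed(123)
my_df1 <- data.frame(a = rnorm(100), b = runif(100), c = rep(1:10, 10))
my_df2 <- data.frame(a = rnorm(100), b = runif(100),
                     c = factor(rep(LETTERS[1:5], 20)))

# Bin the numeric column into two levels: (0,5] and (5,10].
# `c_binned` and the break points are hypothetical, chosen only for this example.
my_df1$c_binned <- cut(my_df1$c, breaks = c(0, 5, 10),
                       labels = c("1-5", "6-10"))

# Both layers now supply discrete values, so they can share one color scale
p <- ggplot() +
  geom_point(data = my_df1, aes(x = a, y = b, color = c_binned)) +
  geom_polygon(data = my_df2, aes(x = a, y = b, color = c), fill = NA)

print(p)
```

The trade-off is that the legend mixes the bins and the factor levels in a single key, and the continuous gradient is lost, so using fill for one of the layers is usually the cleaner route.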

2
votes

Just to add something to the preferred answer by eipi10 above (thank you for that!!): to get rid of the border around the circle shape (pch=21), the value has to be quoted, i.e. colour="NA". If you use colour=NA (without the quotation marks), the whole shape disappears and is not plotted. I would have just commented on the answer, but I don't have enough reputation for that yet :)