1
votes

When I have to plot a lot of lines, I can differentiate them by color or by linetype:

library(ggplot2)
pd = cbind.data.frame(x = rep(c(1,2), each = 4), 
              y = rep(1:4, times=2), 
              type = rep(c("A", "B", "C", "D"), times=2))
ggplot(pd, aes(x=x, y=y, col=type)) + geom_line() + theme_bw()
ggplot(pd, aes(x=x, y=y, lty=type)) + geom_line() + theme_bw()

This gives one plot with the lines distinguished by color and one with the lines distinguished by linetype,

but I want both, and I want the colors and linetypes to be chosen automatically (i.e. I don't want to manually specify a color and linetype for each type, as in this question: ggplot2 manually specifying color & linetype - duplicate legend).

Here is an example of what my desired output could look like (plus an automatically generated legend):

[Image: example plot combining two colors and two linetypes]

Ideally there would be a command like

ggplot(pd, aes(x=x, y=y, style=type)) + geom_line() + theme_bw()

but I guess there will need to be a workaround.

P.S.: The motivation is that 20+ lines can be hard to differentiate by color or linetype alone, which is why I'm looking for a combination of both. So the dashed red line is different from the solid red line, and both of those are different from the solid blue line. And I don't want to choose and specify colors and linetypes myself each time I feed my data to ggplot.

2
When you say "color or line type", do you mean not both? For example, ggplot(pd, aes(x=x, y=y, lty=type, col=type)), which is four line types and four colors. - nrussell
So, to be clear, in your third plot there is no special connection between the 2 blue lines, you just want to use 2 colors and 2 linetypes to get 2*2 = 4 distinct categories? - Gregor Thomas
@Gregor exactly, and I added a motivation to that in the P.S. to my question - mts
Creating the plot would be rather easy, just takes some dummy variables in the data. Getting an accurate, concise legend, however, will be very difficult. - Gregor Thomas
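
A minimal sketch of the dummy-variable idea from the last comment, assuming the four-line example data from the question. The columns color_grp and lty_grp and the variable n_colors are hypothetical names introduced here for illustration. Note that this produces two separate legends (one for color, one for linetype) rather than one entry per type, which is the legend problem the comment points out.

```r
library(ggplot2)

pd <- data.frame(x = rep(c(1, 2), each = 4),
                 y = rep(1:4, times = 2),
                 type = rep(c("A", "B", "C", "D"), times = 2))

# Hypothetical dummy variables: turn each type into a 0-based index, then
# split that index into a color group and a linetype group
# (2 colors x 2 linetypes = 4 distinct styles).
idx <- as.integer(factor(pd$type)) - 1
n_colors <- 2
pd$color_grp <- factor(idx %% n_colors)
pd$lty_grp   <- factor(idx %/% n_colors)

# group = type keeps one line per type; color and linetype come from the dummies
p <- ggplot(pd, aes(x = x, y = y, group = type,
                    col = color_grp, lty = lty_grp)) +
  geom_line() +
  theme_bw()
print(p)
```

With more lines, raising n_colors gives more color groups; the number of linetype groups then follows from the number of types.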

2 Answers

2
votes

I know you wanted to avoid manual specification, but it doesn't have to be anywhere near as daunting as the example you linked to. Example below for 20 lines (color_lty_cross can accommodate up to 60):

library(ggplot2)

pd = cbind.data.frame(x = rep(c(1,2), each = 20), 
                      y = rep(1:20, times=2), 
                      type = rep(LETTERS[1:20], times=2))

# this function regenerates ggplot2 default colors
gg_color_hue <- function(n) {
  hues = seq(15, 375, length = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

# cross 6 linetypes with 10 colors: 60 distinct combinations
color_lty_cross = expand.grid(
  ltypes = 1:6,
  colors = gg_color_hue(10),
  stringsAsFactors = FALSE)


ggplot(pd, aes(x=x, y=y, col=type, lty = type)) + 
  geom_line() + 
  scale_color_manual(values = color_lty_cross$colors[1:20]) +
  scale_linetype_manual(values = color_lty_cross$ltypes[1:20]) +
  theme_bw()

You could quickly map the combinations to type with a merge. You could also use simpler colors than gg_color_hue, which reproduces the ggplot2 defaults.
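
One way the merge could look, as a rough sketch rather than the only option: build a lookup table with one row per type, then merge it onto the data. The names style_map and pd_styled are hypothetical. Passing named vectors to the manual scales ties each style to a specific type, so the assignment no longer depends on the order of the levels.

```r
library(ggplot2)

pd <- data.frame(x = rep(c(1, 2), each = 20),
                 y = rep(1:20, times = 2),
                 type = rep(LETTERS[1:20], times = 2))

# Same helper as above: hues resembling the ggplot2 defaults
gg_color_hue <- function(n) {
  hues <- seq(15, 375, length.out = n + 1)
  hcl(h = hues, l = 65, c = 100)[1:n]
}

color_lty_cross <- expand.grid(ltypes = 1:6,
                               colors = gg_color_hue(10),
                               stringsAsFactors = FALSE)

# Hypothetical lookup table: one linetype/color combination per type
types <- sort(unique(as.character(pd$type)))
stopifnot(length(types) <= nrow(color_lty_cross))
style_map <- data.frame(type = types,
                        color_lty_cross[seq_along(types), ],
                        stringsAsFactors = FALSE)

# The merge: every row of the data now carries its linetype and color
pd_styled <- merge(pd, style_map, by = "type")

# Named vectors match styles to types by name
p <- ggplot(pd, aes(x = x, y = y, col = type, lty = type)) +
  geom_line() +
  scale_color_manual(values = setNames(style_map$colors, style_map$type)) +
  scale_linetype_manual(values = setNames(style_map$ltypes, style_map$type)) +
  theme_bw()
print(p)
```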


0
votes

You can also differentiate lines with point symbols such as triangles or circles. If you have little data (around 100 samples), you can simply set type = "b" in the plot() call. With a lot of data (around 10 thousand samples), the symbols would overlap each other, so instead draw them over the finished line with points(), which takes the x and y coordinates of each point and lets you space the symbols as you like. Example (with code):


#!/usr/bin/Rscript

args = commandArgs(trailingOnly=TRUE)
dataLocation = args[1]
fileName = args[2]
abs_path = args[3]

myData <- read.csv(dataLocation, sep = ";")

#paste0 concatenates two strings
fullFileName = paste0(abs_path,fileName)

#plot first line

png(fullFileName, width = 1300, height = 600)

plot(myData[,1], type = "l", cex = 2, cex.axis = 3.0, lty = 1 ,cex.lab = 3.0, ylim = c(0, max(myData)), xlab = "Iteration", ylab = "Value")

limit = ncol(myData)

if(limit > 1){

    for (column in 2:limit){
        lines(myData[,column], lty = 1, type = "l")
    }
}

# mark every 100th sample, using a different symbol for each column
for (i in seq(100, 500, by = 100)) {
    points(i, myData[i,1], pch = 16, cex = 1.5)
    points(i, myData[i,2], pch = 24, cex = 1.5)
    points(i, myData[i,3], pch = 0, cex = 1.5)
    points(i, myData[i,4], pch = 15, cex = 1.5)
}

dev.off()
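
The script above needs a CSV file and hard-codes four columns. Below is a self-contained sketch of the same idea on made-up data, assuming four series of 500 values each. It uses matplot() instead of plot() plus lines(), and loops over the columns so the number of series is not fixed. The names pch_values, marker_idx, and out_file are hypothetical.

```r
set.seed(1)

# Made-up data for illustration: 4 increasing series of 500 values each
myData <- as.data.frame(replicate(4, cumsum(abs(rnorm(500)))))

pch_values <- c(16, 24, 0, 15)                     # one symbol per column
marker_idx <- seq(100, nrow(myData), by = 100)     # mark every 100th sample

out_file <- tempfile(fileext = ".png")
png(out_file, width = 1300, height = 600)

# draw all columns as solid black lines in one call
matplot(myData, type = "l", lty = 1, col = "black",
        xlab = "Iteration", ylab = "Value")

# overlay sparse symbols so that they do not overlap each other
for (j in seq_len(ncol(myData))) {
  points(marker_idx, myData[marker_idx, j], pch = pch_values[j], cex = 1.5)
}

legend("topleft", legend = names(myData), pch = pch_values, lty = 1)

dev.off()
```

Changing the by argument in marker_idx controls how far apart the symbols are.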