I use ggplot to plot hundreds of simulated paths. The data has been organized by pivot_longer to look like this (200 simulated paths, each having 2520 periods; simulation 1 first, then simulation 2 etc., with ind showing the simulated values for each period):

sim period ind
1 0 100.0
1 1 99.66
. . .
1 2520 103.11
2 0 100.0
. . .
. . .
200 0 100.0
. . .
200 2520 195.11

I'm not sure whether using pivot_longer is optimal, but at least the following ggplot call looks fine:

p <- ggplot(simdata, aes(x = period, y = ind, color = sim, group = sim)) + geom_line()

producing a nice graph with paths in different shades of blue.

What I would like to do is to color the mean, median and quartile paths with different colours (e.g. red and green). Median, mean and quartile paths are defined by the last period's value. I already know the sim number for those. E.g. let's assume that median path is the one where sim = 160.

I have tried the following approaches.

  1. Add a new geom_line specifying the number (sim) of the median path:

    p + geom_line(aes(y = simdata[sim == 160,]), color = "red")

This fails since the additional geom_line is not of the same length (200*2520) as the simdata - even if the graph's x-axis only has 2520 periods.

  2. stat_summary

     p + stat_summary(aes(group=sim),fun=median, geom="line",colour="red")
    

The outcome was that all lines became red, including the simulated ones. I also rejected this approach because it takes a lot more time to have ggplot find the mean, median etc. values than to compute them before the graphics part.

  3. gghighlight

I experimented with this package but could not figure out if you can specify the path numbers to color.

Maybe instead of plotting 200 lines, you might want to consider plotting only your summary statistics (e.g. a line for your mean), and then adding error bands (e.g. with geom_ribbon) - tjebo
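
A rough sketch of what that suggestion could look like, using toy data rather than the asker's simdata. The names sims, bands, q25 and q75 are made up for illustration. Note that these are per-period summaries across all simulations, which is not the same as picking whole paths by their last-period value:

```r
library(ggplot2)

set.seed(1)  # toy data: 20 simulations of 100 periods each
sims <- data.frame(sim = rep(1:20, each = 100),
                   period = rep(1:100, times = 20),
                   ind = rnorm(2000))

# One row per period: mean and quartiles across all simulations
bands <- do.call(rbind, lapply(split(sims, sims$period), function(d) {
  data.frame(period = d$period[1],
             mean = mean(d$ind),
             q25 = quantile(d$ind, 0.25, names = FALSE),
             q75 = quantile(d$ind, 0.75, names = FALSE))
}))

# Shaded interquartile band with the mean drawn on top
ggplot(bands, aes(period)) +
  geom_ribbon(aes(ymin = q25, ymax = q75), fill = "grey80") +
  geom_line(aes(y = mean), colour = "red")
```

Because the summaries are computed once before plotting, ggplot only has to draw one ribbon and one line instead of 200 paths.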

1 Answer

Maybe try your first solution, but pass it to the data argument of geom_line instead:

p + geom_line(data = simdata[simdata$sim == 160, ], color = "red")

As a quick example with some simulated data:

library(ggplot2)

df <- data.frame(a = rep(1:20, each = 100),
                 b = rep(1:100, times = 20),
                 c = rnorm(2000))

ggplot(df, aes(b, c, group = a)) +
  geom_line(colour = "grey") +
  geom_line(data = df[df$a == 20, ], colour = "red")

You can also pass a condition inside aes, which draws that one line in a colour set by scale_colour_manual. This is tidier and adds a legend whose labels can be edited:

ggplot(df, aes(b, c, group = a, colour = a == 20)) +
  geom_line() +
  scale_colour_manual(values = c("TRUE" = "red", "FALSE" = "grey"))

Created on 2021-12-07 by the reprex package (v2.0.1)
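
If you need several highlighted paths (say median and both quartiles), the two ideas above can probably be combined. This is only a sketch: the ids in highlight are hypothetical placeholders for the sim numbers you already know, and the label names and colours are arbitrary:

```r
library(ggplot2)

set.seed(1)  # toy data with the same shape as df above
df <- data.frame(a = rep(1:20, each = 100),
                 b = rep(1:100, times = 20),
                 c = rnorm(2000))

# Hypothetical ids of the paths to highlight (assumed to be known already)
highlight <- c(median = 20, lower_quartile = 5, upper_quartile = 12)

# Keep only the highlighted paths and attach a label to each of them
hl <- df[df$a %in% highlight, ]
hl$label <- names(highlight)[match(hl$a, highlight)]

# Grey background paths first, then the labelled paths on top
p <- ggplot(df, aes(b, c, group = a)) +
  geom_line(colour = "grey") +
  geom_line(data = hl, aes(colour = label)) +
  scale_colour_manual(values = c(median = "red",
                                 lower_quartile = "darkgreen",
                                 upper_quartile = "blue"))
p
```

Drawing the highlighted subset as its own layer keeps those lines on top of the grey ones, and mapping colour to label gives one legend entry per highlighted path.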