I'm trying to plot monthly data from a timeseries-like data frame, with one trace per year. My desired output is an interactive version of the graph below, built from a long data.frame (data and code are given below).

[image: desired output]

visual inspection of data -

          y year       time
1  15.70525 2005 0.41666667
2  15.82955 2005 0.50000000
3  17.55470 2005 0.58333333
4  18.10086 2005 0.66666667
5  17.49667 2005 0.75000000
6  19.34727 2005 0.83333333
7  20.03129 2005 0.91666667
8  23.48669 2006 0.00000000
9  12.53699 2006 0.08333333
10 15.46702 2006 0.16666667

A ggplotly solution exists, but I want to avoid it. Thus, in native plotly, how to -

  1. Plot multiple traces in a long data.frame with the data for each trace defined by a grouping variable.
  2. Define a custom colorscale on the colorbar.

I have a working solution for 1, but am not sure if it's the idiomatic way in plotly; however, I'm completely stuck on 2.

I achieved point 1 by converting the data from long to wide and adding a trace for each column. For some predefined colours, my approach is below -

library(plotly)
yr.names <- as.character(unique(data[['year']]))
colours <- c("#F8766D", "#7CAE00", "#00BFC4", "#C77CFF")
data <- stats::reshape(
  data = data, direction = 'wide', idvar = c('time'), 
  timevar = 'year', v.names = 'y'
)
colnames(data)[2:ncol(data)] <- yr.names
data <- data[order(data[['time']]), ]
p <- plot_ly(data = data)
for (i in seq_along(yr.names)) {
  p <- add_trace(
    p = p, y = data[[yr.names[i]]], x = ~time, name = yr.names[i], 
    type = 'scatter', mode = 'lines', line = list(color = colours[i])
  )
}
p

This yields -

[image: resulting plot]
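For point 1, the reshape step may not be needed. As far as I can tell, plot_ly() creates one trace per level when a discrete variable is mapped to color, so the long data.frame can be used directly. A minimal sketch, assuming the column names from the question; `df` is a small stand-in built from the first rows of 2005 and 2006, and the two colours are the first two of the predefined ones:

```r
library(plotly)

# Stand-in for the long data.frame (a subset of the question's data)
df <- data.frame(
  y = c(15.705248, 15.82955, 17.554701, 23.486694, 12.536987, 15.467018),
  year = c(2005, 2005, 2005, 2006, 2006, 2006),
  time = c(0.416666666666667, 0.5, 0.583333333333333,
           0, 0.0833333333333333, 0.166666666666667)
)
colours <- c("#F8766D", "#7CAE00")

# Mapping a factor to `color` gives one line trace per year, no reshaping
p_long <- plot_ly(
  data = df, x = ~time, y = ~y, color = ~factor(year), colors = colours,
  type = 'scatter', mode = 'lines'
)
p_long
```

This treats year as discrete, so it gives a legend rather than a colorbar; it only addresses point 1.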

Next, how do I tell plotly that year in my case is a continuous variable? In other words, how do I manually define a colorbar/colorscale that will do what I want it to do?

Here's my attempt that does not work -

nCol <- length(colours)
colourscale <- data.frame(
  y = seq(0, 1, length.out = nCol), 
  col = as.character(colours)
)
p <- plot_ly(
  x = seq(0, 1, length.out = nCol), type = 'scatter', mode = 'markers', 
  y = mean(data[[yr.names[1]]], na.rm = TRUE), hoverinfo = 'none', marker = list(
    size = rep(0, nCol), color = as.numeric(yr.names), colorscale = colourscale,
    colors = colours, 
    colorbar = list(
      title = 'Year', nticks = nCol
    )
  )
)
for (i in seq_along(yr.names)) {
  p <- add_trace(
    p = p, y = data[[yr.names[i]]], x = data[['time']], name = yr.names[i], 
    type = 'scatter', mode = 'lines+markers', line = list(color = colours[i]), 
    showlegend = FALSE
  )
}
p

This yields -

[image: resulting plot]

which is obviously not correct. Any pointers?

Data for the question

data <- data.frame(
  y = c(
    15.705248, 15.82955, 17.554701, 18.100864, 17.496668, 19.347265, 
    20.031291, 23.486694, 12.536987, 15.467018, 14.233539, 17.783058, 
    16.291602, 16.980282, 18.612189, 16.623343, 21.430241, 23.575517, 
    23.334206, 28.038383, 16.763869, 19.792754, 16.427305, 21.000742, 
    20.681002, 21.83489, 23.93020353, 22.93035694, 23.26333992, 
    25.25003022, 25.80609, 29.665356, 21.654285, 18.264945, 23.107677, 
    22.91251, 19.43174
  ), year = c(
    2005, 2005, 2005, 2005, 2005, 2005, 2005, 2006, 2006, 2006, 2006, 
    2006, 2006, 2006, 2006, 2006, 2006, 2006, 2006, 2007, 2007, 2007, 
    2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2008, 2008, 
    2008, 2008, 2008, 2008
  ), time = c(
    0.416666666666667, 0.5, 0.583333333333333, 
    0.666666666666667, 0.75, 0.833333333333333, 0.916666666666667, 
    0, 0.0833333333333333, 0.166666666666667, 0.25, 0.333333333333333, 
    0.416666666666667, 0.5, 0.583333333333333, 0.666666666666667, 
    0.75, 0.833333333333333, 0.916666666666667, 0, 0.0833333333333333, 
    0.166666666666667, 0.25, 0.333333333333333, 0.416666666666667, 
    0.5, 0.583333333333333, 0.666666666666667, 0.75, 0.833333333333333, 
    0.916666666666667, 0, 0.0833333333333333, 0.166666666666667, 
    0.25, 0.333333333333333, 0.416666666666667
  )
)
@FloMei, the data is a timeseries-like dataframe. I'm trying to plot the data for various months, with different traces representing a year. - Ameya
Is there a specific reason why you want to avoid ggplotly? - AndS.
Apart from speed, I've never found ggplotly outputs to be of the same quality as the native plotly outputs. (ggplot2 is good in itself though!) - Ameya

1 Answer

Here is a solution that is kind of a workaround, but it still works.

library(plotly)

p1 <- data %>% plot_ly(x = ~time, 
                 y = ~y, 
                 type = 'scatter', 
                 mode = 'lines', 
                 color = ~year,
                 marker = list(size = 1),
                 transforms = list(
                   list(
                     type = 'groupby',
                     groups = ~year,
                     styles = list(
                      list(target = 2005, value = list(marker =list(color = "#440154FF"))),
                      list(target = 2006, value = list(marker =list(color = "#3B528BFF"))),
                      list(target = 2007, value = list(marker =list(color = "#1FA187FF"))),
                      list(target = 2008, value = list(marker =list(color = "#9FDA3AFF")))
                    )
                   )
                 ))
p1

[image: resulting plot]
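For point 2, one possible approach is to colour each line manually and add an invisible marker trace whose only job is to carry the colorbar. This is a sketch rather than a definitive solution: it assumes that marker.colorscale takes a list of (position, colour) pairs with positions in [0, 1], and that a marker trace with opacity = 0 still shows its colorbar. `df` is a small stand-in for the question's data, and the palette reuses the hex colours above without the alpha suffix.

```r
library(plotly)

# Stand-in for the long data.frame (first two rows of each year)
df <- data.frame(
  y = c(15.705248, 15.82955, 23.486694, 12.536987,
        28.038383, 16.763869, 29.665356, 21.654285),
  year = rep(c(2005, 2006, 2007, 2008), each = 2),
  time = c(0.416666666666667, 0.5, rep(c(0, 0.0833333333333333), 3))
)
years <- sort(unique(df$year))
pal <- c("#440154", "#3B528B", "#1FA187", "#9FDA3A")

# Interpolate one colour per year along the palette
rel <- (years - min(years)) / (max(years) - min(years))
year_cols <- grDevices::rgb(grDevices::colorRamp(pal)(rel), maxColorValue = 255)

# Colorscale as a list of (position, colour) pairs (assumed format)
stops <- seq(0, 1, length.out = length(pal))
colourscale <- lapply(seq_along(pal), function(i) list(stops[i], pal[i]))

# One line trace per year, coloured manually
p <- plot_ly()
for (i in seq_along(years)) {
  d <- df[df$year == years[i], ]
  p <- add_trace(
    p = p, x = d$time, y = d$y, type = 'scatter', mode = 'lines',
    name = as.character(years[i]), line = list(color = year_cols[i]),
    showlegend = FALSE
  )
}

# Invisible markers that exist only to display the colorbar
p <- add_trace(
  p = p, x = df$time, y = df$y, type = 'scatter', mode = 'markers',
  hoverinfo = 'none', showlegend = FALSE,
  marker = list(
    color = df$year, colorscale = colourscale, opacity = 0,
    cmin = min(years), cmax = max(years), showscale = TRUE,
    colorbar = list(title = 'Year', tickvals = years)
  )
)
p
```

Setting cmin and cmax to the first and last year keeps the colorbar aligned with the colours computed for the lines.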

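Although you said you want to avoid ggplotly, I find it much more straightforward:

library(ggplot2)
library(plotly)  # provides ggplotly() and the %>% pipe

p2 <- data %>% 
    ggplot(aes(time, y, group = year)) + 
    geom_line(aes(color = year)) +
    # continuous viridis scale shipped with ggplot2 (>= 3.0.0);
    # scale_color_viridis() would additionally need library(viridis)
    scale_color_viridis_c() + 
    theme_classic()
ggplotly(p2)

[image: resulting plot]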