How do I add multiple regression lines to the same plot in plotly?

I want to draw the scatter plot together with a separate regression line for each CATEGORY.

The scatter plot renders fine; however, the regression lines are not drawn correctly (compared with the Excel output, see below).

library(plotly)
library(dplyr)  # filter() and %>%

df <- as.data.frame(1:19)

df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)

df[,1] <- NULL

fv <- df %>%
  filter(!is.na(x)) %>%
  lm(x ~ y + y*CATEGORY,.) %>%
  fitted.values()

p <- plot_ly(data = df,
         x = ~x,
         y = ~y,
         color = ~CATEGORY,
         type = "scatter",
         mode = "markers"
) %>%
  add_trace(x = ~y, y = ~fv, mode = "lines")

p
  • Apologies for not adding in all the information beforehand, and thanks for adding the suggestion of "y*CATEGORY" to fix the parallel line issue.

Excel Output https://i.imgur.com/2QMacSC.png

R Output https://i.imgur.com/LNypvDn.png

Please create a reproducible example, including data or at the very least the output of fv. See this post for guidance: stackoverflow.com/questions/5963269/… - emilliman5
Also, is plotly syntax compulsory? - s__
Please use the r-plotly tag instead of plotly. You'll also need to provide dput(df), or dput(head(df, 20)) if there is too much data, so we can help. - ismirsehregal
The lines should be parallel based on your model. If you expect the slopes to differ between categories, you need to add an interaction to your model, e.g. lm(x ~ y + y*CATEGORY, .). - emilliman5
@emilliman5 Thanks for that! I have added the new information to the original question. I'm not sure whether the R regression line should match the one in Excel, but I have linked both images in the question. - Aaron Walton

1 Answer

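Try this. Your model regressed x on y and then plotted the fitted values against y. The version below regresses y on x, with an interaction so that each CATEGORY gets its own slope and intercept, and plots the fitted values against x:

library(plotly)
library(dplyr)  # filter() and %>%

df <- as.data.frame(1:19)

df$CATEGORY <- c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B")
df$x <- c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196)
df$y <- c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)

df[,1] <- NULL

# Fit y on x with a CATEGORY interaction and store the fitted values in df.
# The assignment assumes no rows are dropped by the NA filter (true for this data).
df$fv <- df %>%
  filter(!is.na(x)) %>%
  lm(y ~ x*CATEGORY, .) %>%
  fitted.values()

p <- plot_ly(data = df,
         x = ~x,
         y = ~y,
         color = ~CATEGORY,
         type = "scatter",
         mode = "markers"
) %>%
  # Fitted values go on the y axis against x; color = ~CATEGORY is inherited,
  # so one line is drawn per category
  add_trace(x = ~x, y = ~fv, mode = "lines")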

p
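
As a sanity check on the model above, here is a minimal sketch (an addition of mine, base R only) that fits a separate least-squares line within each CATEGORY and compares the result with the fitted values from the interaction model. I would expect a per-series linear trendline in Excel to correspond to the separate fits, but that is an assumption about how the Excel chart was built. The names fv_interaction and fv_separate are made up for illustration.

```r
# Same data as in the question
df <- data.frame(
  CATEGORY = c("C","C","A","A","A","B","B","A","B","B","A","C","B","B","A","B","C","B","B"),
  x = c(126,40,12,42,17,150,54,35,21,71,52,115,52,40,22,73,98,35,196),
  y = c(92,62,4,23,60,60,49,41,50,76,52,24,9,78,71,25,21,22,25)
)

# Interaction model: one intercept and one slope per CATEGORY
fv_interaction <- unname(fitted(lm(y ~ x * CATEGORY, data = df)))

# Separate regression within each CATEGORY
fv_separate <- numeric(nrow(df))
for (g in unique(df$CATEGORY)) {
  rows <- df$CATEGORY == g
  fv_separate[rows] <- fitted(lm(y ~ x, data = df[rows, ]))
}

# The two approaches should agree up to numerical tolerance
all.equal(fv_interaction, fv_separate)

# Intercept and slope per category, for comparison with Excel's trendline equations
sapply(split(df, df$CATEGORY), function(d) coef(lm(y ~ x, data = d)))
```

If the coefficients printed by the last line differ from the trendline equations Excel shows, the difference is more likely in how the Excel series were set up than in the plotly code.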
