4
votes

plotly_with_extra_diagonal_line

Plotly draws an extra diagonal line from the start to the endpoint of the original line graph.

Other data, other graphs work fine.

Only this data adds the line.

Why does this happen?

How can I fix this?

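Below is the code:

import pandas as pd
import plotly.offline
import plotly.express as px

temp = pd.DataFrame(df[{KEY_WORD}])
temp['date'] = temp.index  # expose the index as a column so melt can use it
fig = px.line(temp.melt(id_vars="date"), x='date', y='value', color='variable')
fig.show()
plotly.offline.plot(fig, filename='Fig_en1')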

2 Answers

2
votes

Just had the same issue -- try checking for duplicate values on the X axis. I was using the following code:

fig = px.line(df, x="weekofyear", y="interest", color="year")
fig.show()

That created the following plot:

plot with extra lines

I realised that this was because, in certain years, some early-January dates still carried the previous year's week number (52 or 53), which created duplicate (year, weekofyear) pairs, e.g. index 93 and 145 below:


     date        interest  query           year  weekofyear
39   2015-12-20  44        home insurance  2015  51
40   2015-12-27  55        home insurance  2015  52
41   2016-01-03  69        home insurance  2016  53
92   2016-12-25  46        home insurance  2016  51
93   2017-01-01  64        home insurance  2017  52
144  2017-12-24  51        home insurance  2017  51
145  2017-12-31  79        home insurance  2017  52
196  2018-12-23  46        home insurance  2018  51
197  2018-12-30  64        home insurance  2018  52
248  2019-12-22  57        home insurance  2019  51
249  2019-12-29  73        home insurance  2019  52
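A quick way to look for such duplicates is to ask pandas which rows share the same (year, weekofyear) pair. This is only a minimal sketch: it rebuilds four rows from the table above (index 92, 93, 144 and 145) by hand, and the variable name `dupes` is made up for illustration.

```python
import pandas as pd

# Four rows copied from the table above (index 92, 93, 144, 145)
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2016-12-25", "2017-01-01", "2017-12-24", "2017-12-31"]
        ),
        "interest": [46, 64, 51, 79],
        "year": [2016, 2017, 2017, 2017],
        "weekofyear": [51, 52, 51, 52],
    },
    index=[92, 93, 144, 145],
)

# keep=False marks every row of a repeated (year, weekofyear) pair,
# not just the second occurrence; here that is index 93 and 145
dupes = df[df.duplicated(subset=["year", "weekofyear"], keep=False)]
print(dupes)
```

Any row that shows up in `dupes` means one trace (one `color` value) has two points at the same x position, which is where the stray connecting line can come from.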

By amending these (where a date in January had a high week number, I subtracted 1 from the year column), I seem to have got rid of the phenomenon:

plot with no extra lines

NB: there may be some other differences between the charts due to the dataset being somewhat fluid.
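The year adjustment described above could look roughly like this. It is a sketch under assumptions, not the exact code I ran: the four rows are copied from the table (index 92, 93, 144, 145), the threshold of week 52 or later is my reading of "high" week numbers, and the name `late_week_in_jan` is invented for the example.

```python
import pandas as pd

# Four rows copied from the table (index 92, 93, 144, 145)
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2016-12-25", "2017-01-01", "2017-12-24", "2017-12-31"]
        ),
        "interest": [46, 64, 51, 79],
        "year": [2016, 2017, 2017, 2017],
        "weekofyear": [51, 52, 51, 52],
    },
    index=[92, 93, 144, 145],
)

# Assumption: a January date with week 52 or 53 belongs to the
# previous year's final week, so its year is moved back by one
late_week_in_jan = (df["date"].dt.month == 1) & (df["weekofyear"] >= 52)
df.loc[late_week_in_jan, "year"] -= 1

# After the adjustment no (year, weekofyear) pair should repeat
has_duplicates = df.duplicated(subset=["year", "weekofyear"]).any()
print(df)
print("duplicates left:", has_duplicates)
```

If the week numbers are ISO weeks, `df["date"].dt.isocalendar()` (pandas 1.1 or later) returns matching `year` and `week` columns, which may make the manual subtraction unnecessary.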

1
votes

A similar question has been asked and answered in the post How to disable trendline in plotly.express.line?, but in your case I'm pretty sure the problem lies in px.line(temp.melt(id_vars="date"), x='date', y='value', color='variable'). It seems you're transforming your data from a wide to a long format, and you're using color='variable' without naming that column explicitly in temp.melt(id_vars="date"). When the color specification does not properly correspond to the structure of your dataset, an extra line like yours can occur. Just take a look at this:

Command 1:

fig = px.line(data_frame=df_long, x='Timestamp', y='value', color='stacked_values')

Plot 1:

[image]

Command 2:

fig = px.line(data_frame=df_long, x='Timestamp', y='value')

Plot 2:

[image]

See the difference? That's why I think there's a mis-specification in your fig=px.line(temp.melt(id_vars="date"), x='date', y='value', color='variable').

So please share your data, or a sample of your data that reproduces the problem, and I'll have a better chance of verifying your problem.
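In the meantime, one way to check whether the long-format columns line up with the px.line arguments is to inspect the melted frame directly. The sketch below uses a made-up wide frame (the column names `series_a` and `series_b` and all values are hypothetical, not your data). When `var_name` and `value_name` are not given, `DataFrame.melt` names the new columns `variable` and `value`.

```python
import pandas as pd

# Hypothetical wide frame: a date index and two value columns
wide = pd.DataFrame(
    {"series_a": [1, 2, 3], "series_b": [4, 5, 6]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)
wide["date"] = wide.index

# Wide to long: one row per (date, original column) pair
long = wide.melt(id_vars="date")
print(long.columns.tolist())

# How many rows each trace (each 'variable' value) will receive
rows_per_trace = long.groupby("variable").size()

# Whether x only moves forward within each trace; a trace whose x
# values jump backwards is drawn with a line back across the plot
x_is_sorted = long.groupby("variable")["date"].apply(
    lambda s: s.is_monotonic_increasing
)
print(rows_per_trace)
print(x_is_sorted)
```

If `x_is_sorted` is False for some trace in your real data, or a trace has more rows than there are distinct dates, that would be a good place to start looking.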