Consistency with date plotting in pandas dataframe

Question

Pandas sometimes decides to plot DataFrames with timeindex in different ways.

I am plotting a pandas timeseries DataFrame using df.plot() and getting different behaviors for different columns as how the dates are shown and don't understand why. I am plotting data for the 18 June, so sometimes the x axis shows the hours 06:00, 08:00,... and sometimes the date/hour in a very confusing way: 06-18 06, 06-18 08, ... Why ?

Impossible to answer without data and code to reproduce this — EdChum
I understand, I think it comes from NaN present for some timeindexes in the second graph. — tcapelle
Then you need to consider what to do with those index values, drop them, fill them etc.. — EdChum

ImportanceOfBeingErnest ImportanceOfBeingErnest · Accepted Answer · 2019-09-11T11:29:04

Let's create a minimal example. The data is equally spaced with exactly 5 hours in between (5h00, 10h00, 15h00).

import pandas as pd
import matplotlib.pyplot as plt

index = pd.to_datetime(["2019-09-11 05:00:00", 
                        "2019-09-11 10:00:30",
                        "2019-09-11 15:00:00"])

pd.DataFrame({"x" : [1,2,4], "y" : [3,4,4]}, index=index).plot()
plt.show()

It will result in this plot:

Now, lets add 30 seconds to one of the datetimes,

index = pd.to_datetime(["2019-09-11 05:00:00",
                        "2019-09-11 10:00:30",  # <-- added 30 seconds here
                        "2019-09-11 15:00:00"])

now the data isn't equally spaced any more, and the result is this:

So in the latter case pandas does not consider it as "ts_plot". "ts" presumably stands for time series, but one could argue that both are time series anyways. But of course the latter case cannot be resampled - so that seems the underlying distinction.
Unfortunately, pandas ties the formatter to this kind of time series, and it cannot be changed manually.

You can get consistent results by putting x_compat=True into the plot function. This will make sure no "ts"-like axes is used independent of the data. It will always result in the second kind of plot.

df.plot(x_compat=True)

The advantage of this is that you can manually change the format of those normal plots via matplotlib.dates formatters and locators.

Consistency with date plotting in pandas dataframe

2 Answers