12
votes

I create a simple pandas dataframe with some random values and a DatetimeIndex like so:

import pandas as pd
from numpy.random import randint
import datetime as dt
import matplotlib.pyplot as plt

# create a random dataframe with datetimeindex
dateRange = pd.date_range('1/1/2011', '3/30/2011', freq='D')
randomInts = randint(1, 50, len(dateRange))
df = pd.DataFrame({'RandomValues' : randomInts}, index=dateRange)

Then I plot it in two different ways:

# plot with pandas own matplotlib wrapper
df.plot()

# plot directly with matplotlib pyplot
plt.plot(df.index, df.RandomValues)

plt.show()

(Do not use both statements at the same time as they plot on the same figure.)

I use Python 3.4 64-bit and matplotlib 1.4. With pandas 0.14, both statements give me the expected plot (they format the x-axis slightly differently, which is okay; note that the data is random, so the plots do not look the same):

pandas 0.14: pandas plot

pandas 0.14: matplotlib plot

However, when using pandas 0.15, the pandas plot looks alright but the matplotlib plot has some strange tick format on the x-axis:

pandas 0.15: pandas plot

pandas 0.15: matplotlib plot

Is there a good reason for this behaviour, and why did it change from pandas 0.14 to 0.15?

2
A workaround is to call to_pydatetimes: plt.plot(df.index.to_pydatetimes(), df.RandomValues). – joris
You probably meant to_pydatetime() (without 's', same typo in your answer below), then it works great. – Dirk
Ah yes, indeed, thanks! Edited it in my answer. – joris

2 Answers

22
votes

Note that this bug was fixed in pandas 0.15.1 (https://github.com/pandas-dev/pandas/pull/8693), and plt.plot(df.index, df.RandomValues) now just works again.


The reason for this change in behaviour is that starting from 0.15, the pandas Index object is no longer a numpy ndarray subclass. But the real reason is that matplotlib does not support the datetime64 dtype.

As a workaround, if you want to use the matplotlib plot function, you can convert the index to Python datetime objects using to_pydatetime:

plt.plot(df.index.to_pydatetime(), df.RandomValues)
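To see what the workaround actually hands to matplotlib, here is a minimal sketch. It assumes a reasonably recent pandas, and the variable names (index, converted) are only illustrative:

```python
import datetime as dt
import pandas as pd

# A short DatetimeIndex, similar to the one in the question.
index = pd.date_range('1/1/2011', '1/5/2011', freq='D')

# to_pydatetime() returns a numpy array of dtype object whose elements
# are plain datetime.datetime instances, which matplotlib can handle.
converted = index.to_pydatetime()

print(type(converted), converted.dtype)
print(type(converted[0]), converted[0])
```

Because the elements are ordinary datetime.datetime objects, matplotlib's usual date handling applies to them.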

A more detailed explanation:

Because Index is no longer an ndarray subclass, matplotlib converts the index to a numpy array with datetime64 dtype. Before, it retained the Index object, whose scalars are returned as Timestamp values; Timestamp is a subclass of datetime.datetime, which matplotlib can handle. The plot function calls np.atleast_1d() on its input, which now returns a datetime64 array, and matplotlib handles that as integers.
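The conversion described above can be checked without matplotlib. This is a rough sketch, assuming pandas 0.15 or later; the variable names are mine, and the integer conversion at the end only illustrates what "handled as integers" could look like, not matplotlib's exact internal code path:

```python
import datetime as dt
import numpy as np
import pandas as pd

index = pd.date_range('1/1/2011', '1/5/2011', freq='D')

# Since pandas 0.15 the Index is not an ndarray subclass.
is_ndarray = isinstance(index, np.ndarray)

# Scalars taken from the index are Timestamps, which subclass datetime.datetime.
scalar_is_datetime = isinstance(index[0], dt.datetime)

# np.atleast_1d() turns the Index into a plain datetime64 array,
# so the Timestamp scalars are lost.
as_array = np.atleast_1d(index)
array_kind = as_array.dtype.kind  # 'M' is numpy's kind code for datetime64

# Interpreting the datetime64 values as raw integers (illustration only).
as_ints = as_array.astype('int64')

print(is_ndarray, scalar_is_datetime, as_array.dtype, as_ints[:2])
```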

I opened an issue about this (as this pattern is possibly used a lot): https://github.com/pydata/pandas/issues/8614

2
votes

With matplotlib 1.5.0 this 'just works':

import pandas as pd
from numpy.random import randint
import datetime as dt
import matplotlib.pyplot as plt

# create a random dataframe with datetimeindex
dateRange = pd.date_range('1/1/2011', '3/30/2011', freq='D')
randomInts = randint(1, 50, len(dateRange))
df = pd.DataFrame({'RandomValues' : randomInts}, index=dateRange)

fig, ax = plt.subplots()
ax.plot('RandomValues', data=df)

demo image