Following this answer's use of DateFormatter, I tried to plot a time series and label its x axis with years using pandas 0.15.0 and matplotlib 1.4.2:

import datetime as dt
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas.io.data as pdio
import scipy as sp

t1 = dt.datetime(1960, 1, 1)
t2 = dt.datetime(2014, 6, 1)
data = pdio.DataReader("GS10", "fred", t1, t2).resample("Q", how=sp.mean)

fig, ax1 = plt.subplots()
ax1.plot(data.index, data.GS10)
ax1.set_xlabel("Year")
ax1.set_ylabel("Rate (%)")
ax1.xaxis.set_major_formatter(mpl.dates.DateFormatter("%Y"))
fig.suptitle("10-yr Treasury Rate", fontsize=14)

fig.savefig('test.eps')

The final line raises OverflowError: Python int too large to convert to C long, with this traceback:

C:\Anaconda3\lib\site-packages\IPython\core\formatters.py:239: FormatterWarning: Exception in image/png formatter: Python int too large to convert to C long
  FormatterWarning,
Traceback (most recent call last):

  File "", line 1, in <module>
    runfile('D:/username/latex_template/new_pandas_example.py', wdir='D:/username/latex_template')
  File "C:\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 580, in runfile
    execfile(filename, namespace)
  File "C:\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 48, in execfile
    exec(compile(open(filename, 'rb').read(), filename, 'exec'), namespace)
  File "D:/username/latex_template/new_pandas_example.py", line 18, in <module>
    fig.savefig('test.eps')
  File "C:\Anaconda3\lib\site-packages\matplotlib\figure.py", line 1470, in savefig
    self.canvas.print_figure(*args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\matplotlib\backend_bases.py", line 2194, in print_figure
    **kwargs)
  File "C:\Anaconda3\lib\site-packages\matplotlib\backends\backend_ps.py", line 992, in print_eps
    return self._print_ps(outfile, 'eps', *args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\matplotlib\backends\backend_ps.py", line 1020, in _print_ps
    **kwargs)
  File "C:\Anaconda3\lib\site-packages\matplotlib\backends\backend_ps.py", line 1110, in _print_figure
    self.figure.draw(renderer)
  File "C:\Anaconda3\lib\site-packages\matplotlib\artist.py", line 59, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\matplotlib\figure.py", line 1079, in draw
    func(*args)
  File "C:\Anaconda3\lib\site-packages\matplotlib\artist.py", line 59, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\matplotlib\axes_base.py", line 2092, in draw
    a.draw(renderer)
  File "C:\Anaconda3\lib\site-packages\matplotlib\artist.py", line 59, in draw_wrapper
    draw(artist, renderer, *args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\matplotlib\axis.py", line 1114, in draw
    ticks_to_draw = self._update_ticks(renderer)
  File "C:\Anaconda3\lib\site-packages\matplotlib\axis.py", line 957, in _update_ticks
    tick_tups = [t for t in self.iter_ticks()]
  File "C:\Anaconda3\lib\site-packages\matplotlib\axis.py", line 957, in <listcomp>
    tick_tups = [t for t in self.iter_ticks()]
  File "C:\Anaconda3\lib\site-packages\matplotlib\axis.py", line 905, in iter_ticks
    for i, val in enumerate(majorLocs)]
  File "C:\Anaconda3\lib\site-packages\matplotlib\axis.py", line 905, in <listcomp>
    for i, val in enumerate(majorLocs)]
  File "C:\Anaconda3\lib\site-packages\matplotlib\dates.py", line 411, in __call__
    dt = num2date(x, self.tz)
  File "C:\Anaconda3\lib\site-packages\matplotlib\dates.py", line 345, in num2date
    return _from_ordinalf(x, tz)
  File "C:\Anaconda3\lib\site-packages\matplotlib\dates.py", line 225, in _from_ordinalf
    dt = datetime.datetime.fromordinal(ix)

OverflowError: Python int too large to convert to C long

Am I using DateFormatter incorrectly here? How can I easily put years (or any time format, since my time series might differ) on the x-axis of a matplotlib figure?

1 Answer

This is a regression in pandas 0.15.0, caused by the refactoring of Index; see https://github.com/matplotlib/matplotlib/issues/3727 and https://github.com/pydata/pandas/issues/8614. It is fixed in pandas 0.15.1.


Short story: matplotlib now sees the pandas index as an array of datetime64[ns] values (which are actually very large int64s) instead of an array of Timestamps (which are a subclass of datetime.datetime and can be handled by matplotlib), as it did with previous versions of pandas. So the underlying reason is that matplotlib handles datetime64 values as ints rather than as dates.
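
To make the size mismatch concrete, here is a minimal standard-library sketch. It assumes, as the traceback suggests, that the formatter ends up passing the tick value to datetime.datetime.fromordinal, which expects a count of days. The helper name try_fromordinal is hypothetical and only there for illustration.

```python
import datetime as dt

# A datetime64[ns] value stores nanoseconds since the Unix epoch.
# Compute that raw integer for 2014-06-01 (the end of the series).
epoch = dt.datetime(1970, 1, 1)
t2 = dt.datetime(2014, 6, 1)
nanoseconds = int((t2 - epoch).total_seconds()) * 10**9

# What fromordinal expects instead: days since 0001-01-01 (counting from 1).
ordinal_days = t2.toordinal()

# Largest ordinal the datetime module can represent (9999-12-31).
max_ordinal = dt.date.max.toordinal()


def try_fromordinal(value):
    """Return the datetime for an ordinal, or the name of the exception raised."""
    try:
        return dt.datetime.fromordinal(value)
    except (OverflowError, ValueError) as exc:
        return type(exc).__name__


print("raw datetime64[ns] integer:", nanoseconds)
print("ordinal day number:        ", ordinal_days)
print("largest valid ordinal:     ", max_ordinal)
print("fromordinal(ordinal_days): ", try_fromordinal(ordinal_days))
print("fromordinal(nanoseconds):  ", try_fromordinal(nanoseconds))
```

The nanosecond count is many orders of magnitude beyond the largest valid ordinal, so interpreting it as a day number cannot succeed; the exact exception message depends on the platform and Python build.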

For pandas 0.15.0 there are two possible workarounds, though upgrading to a newer version is the better option:

  • Register the datetime64 type so that matplotlib also handles it as a date (this assumes numpy is imported as np, pandas as pd, and matplotlib.units as units):

    units.registry[np.datetime64] = pd.tseries.converter.DatetimeConverter()
    
  • Or convert the DatetimeIndex (with datetime64 values) to an array of datetime.datetime values with the to_pydatetime method, and plot this:

    ax1.plot(data.index.to_pydatetime(), data.GS10)
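
As a rough end-to-end sketch of the second workaround, the script below plots a series against datetime.datetime values and formats the ticks as years. The synthetic quarterly data is an assumption standing in for the FRED "GS10" download, so that nothing depends on network access, and the Agg backend is chosen only so the script runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic quarterly series (made-up values) in place of the FRED data.
index = pd.date_range("1960-01-01", "2014-06-01", freq="QS")
data = pd.DataFrame({"GS10": np.linspace(4.0, 2.5, len(index))}, index=index)

# Convert the DatetimeIndex to an array of datetime.datetime objects,
# which matplotlib recognises as dates.
x = data.index.to_pydatetime()

fig, ax1 = plt.subplots()
ax1.plot(x, data["GS10"])
ax1.set_xlabel("Year")
ax1.set_ylabel("Rate (%)")
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
fig.suptitle("10-yr Treasury Rate", fontsize=14)

# Force a draw so the tick labels are generated, then collect them.
fig.canvas.draw()
labels = [tick.get_text() for tick in ax1.get_xticklabels()]
print(labels)
plt.close(fig)
```

Swapping "%Y" for another strftime pattern (for example "%b %Y") changes the label format without touching the rest of the script.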
    

Related question: "Plotting datetimeindex on x-axis with matplotlib creates wrong ticks in pandas 0.15 in contrast to 0.14"