2
votes

I'd like to create a grouped bar chart that shows a customized Date-Time Index - just showing Month and year instead of the full dates. I want the bars to be grouped and not stacked.

I assumed pandas could handle this easily, using:

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

testdata = pd.DataFrame({"A": [1, 2, 3]
                       ,"B": [2, 3, 1]
                       , "C": [2, 3, 1]}  
                       ,index=pd.to_datetime(pd.DatetimeIndex(
                            data=["2019-03-02", "2019-04-01","2019-05-01"])))
ax = testdata.plot.bar()

This creates the plot that I want; I'd just like to change the date into something simpler, like March 2019, April 2019, May 2019.

I assumed using a Custom Date Formatter would work, therefore I tried

ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))

But then my labels are gone completely. And this question implies that pandas and the DateFormatter have a bit of a difficult relationship. Therefore I tried to do it with Matplotlib basics:

fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()

Now the date representation is as expected (although the whitespace could be reduced), but the bars overlap, which doesn't help.

Defining a width and subtracting it from the x values (as is normally suggested) won't help because of the DatetimeIndex I use. I get an error saying that subtracting a float from a DatetimeIndex is unsupported.

fig, ax = plt.subplots()
width = 0.8
ax.bar(testdata.index-width, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index+width, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()

So now I'm running out of ideas and am hoping for input.


2 Answers

2
votes

The reason ax.xaxis.set_major_locator(mdates.MonthLocator()) fails is that, under the hood, pandas plots the bars against range(len(df)) and then renames the ticks accordingly.

You can grab the xticklabels after you plot and reformat them:
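A quick way to check this is to inspect the tick positions after plotting. The following is only a minimal sketch using the testdata frame from the question; the Agg backend line is an assumption added so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

ax = testdata.plot.bar()

# The tick positions are plain integers, one per row, not dates.
tick_positions = [float(t) for t in ax.get_xticks()]
# The dates only survive as the text of the tick labels.
tick_labels = [t.get_text() for t in ax.get_xticklabels()]

print(tick_positions)
print(tick_labels)
```

Since the axis only holds integer positions, a date locator or formatter has no real dates to work with, which is consistent with the labels disappearing.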

ax = testdata.plot.bar()

ticks = [tick.get_text() for tick in ax.get_xticklabels()]
ticks = pd.to_datetime(ticks).strftime('%b %Y')
ax.set_xticklabels(ticks)

which gives the same result as ImportanceOfBeingErnest's answer.

Another, probably better way is to shift the bars of each column. This works better when you have many columns and want to reduce the number of xticks.

fig, ax = plt.subplots()

# define the shift
shift = pd.to_timedelta('1D')

# shift the x positions of each column; this could also be done in a for loop
ax.bar(testdata.index + shift, testdata["A"]) 
ax.bar(testdata.index, testdata["B"])
ax.bar(testdata.index - shift, testdata["C"])
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
plt.show()

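The shifting can also be written as a loop over the columns, with an explicit bar width so that the bars sit side by side instead of overlapping. This is only a sketch: the 8-day bar width is an assumed value picked to fit monthly data, and the Agg backend line is there just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no window is needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

testdata = pd.DataFrame(
    {"A": [1, 2, 3], "B": [2, 3, 1], "C": [2, 3, 1]},
    index=pd.to_datetime(["2019-03-02", "2019-04-01", "2019-05-01"]),
)

fig, ax = plt.subplots()

bar_width_days = 8.0  # assumed value; on a date axis one unit is one day
n_columns = len(testdata.columns)

for i, column in enumerate(testdata.columns):
    # Centre each group on its date: offsets are -1, 0, +1 bar widths here.
    offset = pd.Timedelta(days=(i - (n_columns - 1) / 2) * bar_width_days)
    ax.bar(testdata.index + offset, testdata[column],
           width=bar_width_days, label=column)

ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.legend()
# call plt.show() or fig.savefig(...) to view the result
```

Adding a Timedelta to the DatetimeIndex avoids the unsupported DatetimeIndex-plus-float arithmetic from the question, while the width can stay a plain float because Matplotlib measures it in days on a date axis.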

3
votes

Pandas bar plots are categorical. So maybe you're overthinking this and simply want to use the strings you'd like to see on the axis as the index?

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({"A": [1, 2, 3]
                       ,"B": [2, 3, 1]
                       , "C": [2, 3, 1]}  
                       ,index=pd.to_datetime(pd.DatetimeIndex(
                            data=["2019-03-02", "2019-04-01","2019-05-01"])))

df.index = [d.strftime("%b %Y") for d in df.index]
ax = df.plot.bar()
plt.show()
