4
votes

My goal is to create a stacked bar chart of a multilevel dataframe. The dataframe looks like this:

import pandas as pd
import numpy as np

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two', 'three'])]

s = pd.Series([10,20,10,22,10,24,10,26, 11], index=arrays)

In[1]: s

Out[1]: 
bar  one      10
     two      20
baz  one      10
     two      22
foo  one      10
     two      24
qux  one      10
     two      26
     three    11
dtype: int64

I have two goals:

  1. Create a stacked bar chart in which the values are stacked into 4 bars named bar, baz, foo, qux.

  2. The 4 bars should be ordered by size. In this example, the qux bar has height 10+26+11=47 and should be leftmost, followed by the foo bar with height 10+24=34.


2 Answers

11
votes
  1. Sorting the first level index according to its total sum:

s_sort = s.groupby(level=[0]).sum().sort_values(ascending=False)
s_sort
qux    47
foo    34
baz    32
bar    30
dtype: int64
  2. Reindex using the sorted first-level index values, then unstack and plot:

import matplotlib.pyplot as plt

cmp = plt.get_cmap('jet')  # matplotlib.cm.get_cmap was removed in Matplotlib 3.9
s.reindex(index=s_sort.index, level=0).unstack().plot.bar(stacked=True, cmap=cmp)

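For comparison, here is a minimal sketch of a variant that unstacks first and then orders the rows by their row sums, which avoids the separate reindex step. The helper names `order_by_total` and `draw` are made up for this illustration and do not come from the answer above.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two', 'three'])]
s = pd.Series([10, 20, 10, 22, 10, 24, 10, 26, 11], index=arrays)


def order_by_total(series):
    """Hypothetical helper: one row per outer label, rows sorted by total height."""
    wide = series.unstack()        # columns = inner level, NaN where a label is missing
    totals = wide.sum(axis=1)      # NaN entries are skipped by default
    return wide.loc[totals.sort_values(ascending=False).index]


def draw(frame):
    """Hypothetical helper: draw the stacked bars (requires matplotlib)."""
    return frame.plot.bar(stacked=True, colormap='jet')


frame = order_by_total(s)
print(frame.sum(axis=1))
```

Calling `draw(frame)` should produce the same chart as the reindex approach, with qux leftmost.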

0
votes

One small addition: we can also sort the inner index level by value.

s1 = s.groupby(level=[0]).apply(lambda x: x.groupby(level=[1]).sum().sort_values(ascending=False))
s1

The inner level now stands sorted.

bar  two      20
     one      10
baz  two      22
     one      10
foo  two      24
     one      10
qux  two      26
     three    11
     one      10
dtype: int64

Now we sort by the outer level in the already mentioned way.

s_sort = s1.groupby(level=[0]).sum().sort_values(ascending=False)
s2 = s1.reindex(index=s_sort.index, level=0)
s2

qux  two      26
     three    11
     one      10
foo  two      24
     one      10
baz  two      22
     one      10
bar  two      20
     one      10
dtype: int64

Unfortunately, the plot does not preserve this per-bar order: unstack() turns the inner level into columns, and a stacked bar chart stacks those columns in the same order for every bar.

s2.unstack().plot.bar(stacked=True)

Stacked Bar Chart
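If each bar really should have its largest segment at the bottom, one possible workaround is to skip `unstack()` and draw the segments one by one with matplotlib. The following is only a sketch under that assumption. The helpers `stacked_segments` and `draw_segments` and the choice of the `tab10` colormap are made up for this illustration and are not part of the answer above.

```python
import numpy as np
import pandas as pd

arrays = [np.array(['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux', 'qux']),
          np.array(['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two', 'three'])]
s = pd.Series([10, 20, 10, 22, 10, 24, 10, 26, 11], index=arrays)


def stacked_segments(series):
    """Hypothetical helper: (group, label, bottom, height) tuples,
    tallest group first and largest segment lowest within each bar."""
    totals = series.groupby(level=0).sum().sort_values(ascending=False)
    segments = []
    for group in totals.index:
        bottom = 0
        inner = series.loc[group].sort_values(ascending=False)
        for label, height in inner.items():
            segments.append((group, label, bottom, int(height)))
            bottom += int(height)
    return segments


def draw_segments(segments):
    """Hypothetical helper: draw each segment at its own bottom offset."""
    import matplotlib.pyplot as plt

    labels = sorted({label for _, label, _, _ in segments})
    cmap = plt.get_cmap('tab10')
    colors = {label: cmap(i) for i, label in enumerate(labels)}
    fig, ax = plt.subplots()
    seen = set()
    for group, label, bottom, height in segments:
        # Only the first occurrence of each label gets a legend entry.
        legend_label = label if label not in seen else '_nolegend_'
        ax.bar(group, height, bottom=bottom, color=colors[label], label=legend_label)
        seen.add(label)
    ax.legend()
    return ax


segments = stacked_segments(s)
```

Calling `draw_segments(segments)` draws the bars in the order qux, foo, baz, bar. The colour still identifies the inner label, but the stacking order can now differ from bar to bar.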