I have a dataframe of N columns of values by M dates.
I'm looking to plot a stacked bar chart of the 3 largest values per date.
Test dataframe:
import pandas
import numpy
data = {
    'A': [ 65, 54, 12, 14, 30, numpy.nan ],
    'B': [ 54, 47, 60, 34, 40, 35 ],
    'C': [ 34, 39, 57, 56, 48, numpy.nan ],
    'D': [ 20, 18, 47, 47, 35, 70 ]
}
df = pandas.DataFrame(index=pandas.date_range('2018-01-01', '2018-01-06').date,
                      data=data,
                      dtype=numpy.float64)
               A     B     C     D
2018-01-01  65.0  54.0  34.0  20.0
2018-01-02  54.0  47.0  39.0  18.0
2018-01-03  12.0  60.0  57.0  47.0
2018-01-04  14.0  34.0  56.0  47.0
2018-01-05  30.0  40.0  48.0  35.0
2018-01-06   NaN  35.0   NaN  70.0
Extracting the 3 largest values per row:
I have found nlargest, which I can use to extract the 3 largest columns and their respective values for each row:
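for date, row in df.iterrows():
    # nlargest skips NaN, so a row may yield fewer than 3 entries
    top = row.nlargest(3)
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    s = [f'{c}={v}' for c, v in top.items()]
    print('{}: [ {} ]'.format(date, ', '.join(s)))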
2018-01-01: [ A=65.0, B=54.0, C=34.0 ]
2018-01-02: [ A=54.0, B=47.0, C=39.0 ]
2018-01-03: [ B=60.0, C=57.0, D=47.0 ]
2018-01-04: [ C=56.0, D=47.0, B=34.0 ]
2018-01-05: [ C=48.0, B=40.0, D=35.0 ]
2018-01-06: [ D=70.0, B=35.0 ]
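For comparison, the same selection can probably be expressed without a Python-level loop, which keeps the result in the original rows-by-columns shape. The following is only a sketch: the helper name top_n_per_row is made up for illustration, and it assumes that ties should be broken by column order.

```python
import numpy as np
import pandas as pd


def top_n_per_row(frame, n=3):
    """Keep the n largest values in each row and set the rest to NaN.

    Hypothetical helper, not part of pandas.
    """
    # Rank within each row, 1 = largest. method='first' breaks ties by
    # column order, so at most n cells per row pass the filter.
    # NaN cells get a NaN rank and therefore stay NaN.
    ranks = frame.rank(axis=1, ascending=False, method='first')
    return frame.where(ranks <= n)


data = {
    'A': [65, 54, 12, 14, 30, np.nan],
    'B': [54, 47, 60, 34, 40, 35],
    'C': [34, 39, 57, 56, 48, np.nan],
    'D': [20, 18, 47, 47, 35, 70],
}
df = pd.DataFrame(data,
                  index=pd.date_range('2018-01-01', '2018-01-06').date,
                  dtype=np.float64)

top3 = top_n_per_row(df, 3)
print(top3)
```

Each row of top3 then holds at most 3 non-NaN values, in the same columns as df.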
Plotting the data in a stacked bar chart:
The final step is where I am stuck: I have been unable to take the above data and plot it as a stacked bar chart that looks like the example below.
I'm not even sure if nlargest is the best approach.
Desired output:
Question:
How can I create a stacked bar chart of the N largest columns per row in a dataframe?
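One possible direction, offered as a sketch rather than a confirmed solution: apply nlargest row by row so that everything outside the top N becomes NaN, then hand the result to pandas' built-in stacked bar plot. The function names keep_n_largest and plot_n_largest are hypothetical, and the segments are stacked in column order (A, B, C, D), which may or may not match the desired picture.

```python
import numpy as np
import pandas as pd


def keep_n_largest(frame, n=3):
    """Return a frame of the same shape with only each row's n largest values.

    Hypothetical helper, not part of pandas.
    """
    # Each row yields a short Series; apply() aligns them on the union of
    # their labels, leaving NaN where a column was not in that row's top n.
    kept = frame.apply(lambda row: row.nlargest(n), axis=1)
    # Restore the original column order (and any column that never made it).
    return kept.reindex(columns=frame.columns)


def plot_n_largest(frame, n=3):
    """Draw one stacked bar per row; requires matplotlib to be installed."""
    return keep_n_largest(frame, n).plot.bar(stacked=True)


data = {
    'A': [65, 54, 12, 14, 30, np.nan],
    'B': [54, 47, 60, 34, 40, 35],
    'C': [34, 39, 57, 56, 48, np.nan],
    'D': [20, 18, 47, 47, 35, 70],
}
df = pd.DataFrame(data,
                  index=pd.date_range('2018-01-01', '2018-01-06').date,
                  dtype=np.float64)

top3 = keep_n_largest(df, 3)
print(top3)
```

Calling plot_n_largest(df) returns a matplotlib Axes; in a script, matplotlib.pyplot.show() is still needed to display it.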