
I've clustered a Facebook dataset and labeled each row with the cluster it belongs to, so the data frame (x_df) has a cluster column alongside page_type.

I would like to plot the data as a stacked bar chart, so I grouped it like this:

dfff=x_df.groupby("cluster")["page_type"].value_counts()

and the output looks like this:

cluster  page_type 
0        government    5387
         company       3231
         politician    3149
         tvshow        1679
1        government     563
         company          9
         politician       2
2        company       3255
         politician    2617
         tvshow        1648
         government     930
Name: page_type, dtype: int64

How can I plot this Series as a stacked bar chart with three bars (0, 1, 2), one for each cluster?

1 Answer

import pandas as pd
import matplotlib.pyplot as plt

# dfff is the grouped Series from the question (MultiIndex: cluster, page_type)
# unstack() pivots the inner level (page_type) into columns
dfp = dfff.unstack()

# display(dfp)
# page_type  company  government  politician  tvshow
# cluster
# 0           3231.0      5387.0      3149.0  1679.0
# 1              9.0       563.0         2.0     NaN
# 2           3255.0       930.0      2617.0  1648.0

# plot stacked bar, one bar per cluster
dfp.plot.bar(stacked=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')

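If you want to try this without the original data, here is a minimal self-contained sketch. It rebuilds the grouped Series by hand from the counts printed in the question (the real x_df is not available, so the construction of dfff here is only a stand-in). It also passes fill_value=0 to unstack(), which is my own addition: it stores the missing cluster 1 / tvshow combination as 0 instead of NaN, so the counts stay integers.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Counts copied from the value_counts() output in the question.
# Building the Series by hand is an assumption standing in for
# x_df.groupby("cluster")["page_type"].value_counts().
counts = {
    (0, "government"): 5387, (0, "company"): 3231,
    (0, "politician"): 3149, (0, "tvshow"): 1679,
    (1, "government"): 563, (1, "company"): 9, (1, "politician"): 2,
    (2, "company"): 3255, (2, "politician"): 2617,
    (2, "tvshow"): 1648, (2, "government"): 930,
}
index = pd.MultiIndex.from_tuples(list(counts.keys()),
                                  names=["cluster", "page_type"])
dfff = pd.Series(list(counts.values()), index=index, name="page_type")

# Pivot page_type into columns; the missing (1, tvshow) cell becomes 0
dfp = dfff.unstack(fill_value=0)

# One stacked bar per cluster, one colored segment per page_type
ax = dfp.plot.bar(stacked=True)
ax.set_xlabel("cluster")
ax.set_ylabel("number of pages")
ax.legend(title="page_type", bbox_to_anchor=(1.05, 1), loc="upper left")
ax.figure.tight_layout()
```

In a script, finish with plt.show() or ax.figure.savefig(...) to see the chart; in a notebook the figure is displayed automatically.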

Seaborn look

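import matplotlib.pyplot as plt

# set style parameter
# (Matplotlib 3.6 renamed the 'seaborn' style to 'seaborn-v0_8';
#  on older versions use plt.style.use('seaborn') instead)
plt.style.use('seaborn-v0_8')

# plot stacked bar
dfp.plot.bar(stacked=True)
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')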

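Since the name of the seaborn style sheet depends on the Matplotlib version, one possible way to handle both is to check plt.style.available and pick whichever name exists. This is only a sketch: the candidates list and the fallback to "default" are my own choices, and dfp is rebuilt by hand here (with 0 in place of the NaN) so the snippet runs on its own. plt.style.context applies the style to this one plot only, instead of changing it globally.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for dfp from above, typed in by hand (assumption: NaN replaced by 0)
dfp = pd.DataFrame(
    {
        "company": [3231, 9, 3255],
        "government": [5387, 563, 930],
        "politician": [3149, 2, 2617],
        "tvshow": [1679, 0, 1648],
    },
    index=pd.Index([0, 1, 2], name="cluster"),
)
dfp.columns.name = "page_type"

# Pick the first seaborn style name this Matplotlib install knows about,
# falling back to the default style if neither is present
candidates = ["seaborn-v0_8", "seaborn"]
style = next((s for s in candidates if s in plt.style.available), "default")

# Apply the style only inside this block
with plt.style.context(style):
    ax = dfp.plot.bar(stacked=True)
    ax.legend(bbox_to_anchor=(1.05, 1), loc="upper left")
    ax.figure.tight_layout()
```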