1
votes

I am trying to create a bar chart of degrees awarded by colleges that shows two things: the first bar should show the percentage of total degrees awarded in the past year, and bars 2, 3, and 4 should show the percent difference compared with the degrees awarded 5, 10, and 15 years ago.

I am able to represent these as all starting from 0. But is there a way to align the percent difference to y = top of the current total percentage (bar 1) rather than y = 0?

The graph I have created so far is this (using a basic pandas df.plot(kind='bar')):

[Image: bar_chart_all_y_0]

What I want to do is move the yellow, green, and red bars (which show the change over the 5-, 10-, and 15-year increments) so that they start aligned with the top of the blue bar on the far left of each group.

I have read the following threads:

How do I create a bar chart that starts and ends in a certain range

100% Stacked Bar Chart in MatPlotLib

Waterfall plot python?

but am unsure how to make the precise chart I want to create.

Huge thanks.

2
Do you have a link to a chart that looks like the one you have in mind? I'm having trouble visualizing what you want to do. - Woody Pride
I've manually created this by creating arrays for each group of 4 I want to plot: link. (I am having trouble adding the code I used in an easy-to-read way. In short, I created arrays for each group of 4, computed the starting y value for each bar, and then plotted each bar from that starting y value using ax.bar(x_alignment, bottom=y_start, height=bar_height).) It may be that this is the best way to do it, iterating through each grouping I want to plot, and that there isn't a good way to do it through pandas. - jbachlombardo
OK, I added an answer that I think gets close to what you are trying to achieve. - Woody Pride

2 Answers

1
votes

I'd consider just doing it in Matplotlib.

See: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.bar.html

You will have to write a bit more code but you'll have a lot of flexibility. To make the growth/change bars start where the first bar ends, you can use the bottom argument.

You can see it in use in one of their examples here: https://matplotlib.org/gallery/lines_bars_and_markers/bar_stacked.html#sphx-glr-gallery-lines-bars-and-markers-bar-stacked-py
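As a rough illustration of the bottom argument, here is a minimal sketch for a single college. The numbers and labels are made up, and it assumes the change values are stored as signed differences, so a negative change is drawn downward from the top of the first bar.

```python
import matplotlib.pyplot as plt

# Hypothetical numbers for one college: the current share of degrees and
# the change relative to 5, 10, and 15 years ago (negative = decline).
current = 0.14
changes = {"5yr": 0.04, "10yr": -0.02, "15yr": 0.07}

fig, ax = plt.subplots()

# The first bar starts at 0 and shows the current total.
ax.bar(0, current, width=0.8)

# Each change bar starts at the top of the current bar (bottom=current).
# A negative height is drawn downward from that line.
for i, delta in enumerate(changes.values(), start=1):
    ax.bar(i, delta, width=0.8, bottom=current)

# Label the bars and mark the shared baseline for the change bars.
ax.set_xticks(range(len(changes) + 1))
ax.set_xticklabels(["current", *changes])
ax.axhline(current, color="gray", linewidth=0.5)
```

Call plt.show() (or fig.savefig with a filename of your choice) afterwards to see the result.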

1
votes

Using a basic two-college example, I have done what you wanted by stacking the differences in year-on-year admissions on top of the absolute values of those admissions. If you are doing this for multiple colleges, you would obviously want to do this in a loop so that you do not have to create every bar explicitly. This solution should give you the tools you need to create what you want.

I would add, though, that as a means of conveying information I do not find this graph very helpful. Personally, I would just present the year-on-year admissions and let the eyes figure out how different they are, or add labels that call out the year-on-year difference.

import pandas as pd
from matplotlib import pyplot as plt

df = pd.DataFrame({'college1' : [0.14, 0.1, 0.12, 0.07],
                        'college2' : [0.14, 0.16, 0.18, 0.12]}).T
df.columns = ['today', '5years', '10years', '15years']
width = 0.5

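# plot the today bars
# align='edge' puts each bar's left edge at the given x, so the tick
# positions below land on the bar centers (newer Matplotlib versions
# center bars on x by default)
p1 = plt.bar([1, 4], df['today'], width, align='edge',
             color=['#0247fe', '#175d1e'])

# plot the difference bars: each one spans from the past value up (or down)
# to today's value
p2 = plt.bar([1.5, 4.5], df['today'] - df['5years'], width, align='edge',
             bottom=df['5years'], color=['#6363ff', '#427741'])
p3 = plt.bar([2, 5], df['today'] - df['10years'], width, align='edge',
             bottom=df['10years'], color=['#8d81ff', '#679164'])
p4 = plt.bar([2.5, 5.5], df['today'] - df['15years'], width, align='edge',
             bottom=df['15years'], color=['#ae9fff', '#8cac88'])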

#other controls
plt.xticks([1.25, 1.75, 2.25, 2.75, 4.25, 4.75, 5.25, 5.75],
           ['c1 today', '5yearDiff', '10yearDiff', '15yearDiff',
            'c2 today', '5yearDiff', '10yearDiff', '15yearDiff'],
           rotation='vertical')
plt.ylabel("Admissions")
plt.ylim(0, 0.2)
plt.xlim(0.75, 6)
plt.gcf().set_size_inches(8, 8)
plt.subplots_adjust(bottom=0.2)
plt.show()
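For more than two colleges, the same idea can be put in a loop. The following is only a sketch under a few assumptions: college3 and its numbers are invented for illustration, the group spacing (group_gap) is an arbitrary choice, and align="edge" is passed so that each x value is the left edge of its bar.

```python
import pandas as pd
from matplotlib import pyplot as plt

df = pd.DataFrame(
    {"college1": [0.14, 0.10, 0.12, 0.07],
     "college2": [0.14, 0.16, 0.18, 0.12],
     "college3": [0.09, 0.11, 0.08, 0.10]},  # made-up extra college
    index=["today", "5years", "10years", "15years"],
).T

width = 0.5
group_gap = 1.0  # empty space between college groups (arbitrary)
periods = ["5years", "10years", "15years"]

fig, ax = plt.subplots(figsize=(8, 8))
ticks, labels = [], []

for g, (college, row) in enumerate(df.iterrows()):
    # left edge of this college's group of four bars
    left = 1 + g * (len(df.columns) * width + group_gap)

    # the "today" bar starts at 0
    ax.bar(left, row["today"], width, align="edge", color="C0")
    ticks.append(left + width / 2)
    labels.append(f"{college} today")

    # each difference bar spans from the past value to today's value
    for k, period in enumerate(periods, start=1):
        x = left + k * width
        ax.bar(x, row["today"] - row[period], width,
               bottom=row[period], align="edge", color=f"C{k}")
        ticks.append(x + width / 2)
        labels.append(f"{period} diff")

ax.set_xticks(ticks)
ax.set_xticklabels(labels, rotation="vertical")
ax.set_ylabel("Admissions")
fig.subplots_adjust(bottom=0.2)
```

With these settings the first two groups start at x = 1 and x = 4, matching the hand-written positions above. Call plt.show() to display the figure.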
