15
votes

I'd like to plot a factorplot in seaborn but manually provide the error bars instead of having seaborn calculate them.

I have a pandas dataframe that looks roughly like this:

     model output feature  mean   std
0    first    two       a  9.00  2.00
1    first    one       b  0.00  0.00
2    first    one       c  0.00  0.00
3    first    two       d  0.60  0.05
...
77   third   four       a  0.30  0.02
78   third   four       b  0.30  0.02
79   third   four       c  0.10  0.01

and I'm outputting a plot that looks roughly like this: [image: seaborn bar plots]

I'm using these seaborn commands to generate the plot:

g = sns.factorplot(data=pltdf, x='feature', y='mean', kind='bar',
                   col='output', col_wrap=2, sharey=False, hue='model')
g.set_xticklabels(rotation=90)

However, I can't figure out how to have seaborn use the 'std' column as the error bars. Unfortunately, it would be quite time consuming to recompute the output for the data frame in question.

This is somewhat similar to this question: Plotting errors bars from dataframe using Seaborn FacetGrid

Except I can't figure out how to get it to work with the matplotlib.pyplot.bar function.

Is there a way to do this using seaborn factorplot or FacetGrid combined with matplotlib?

Thanks!

I think the linked question is going to be the best way to go. plt.bar has a yerr parameter that should help. - mwaskom
Thanks @mwaskom, any tips on how to get it working? Currently the following code chokes: g = sns.FacetGrid(data=pltdf, col='output', col_wrap=6, sharey=False, hue='model'); g.map(plt.bar, 'feature', 'mean', yerr='std') - crackedegg
apologies for the messy code, can't seem to get it to format nicely in the comment section. - crackedegg
I think the issue is that the yerr parameter in bar is not a positional one, it's a kwarg. - crackedegg
See the rules for mappable functions. You'll need to write a thin wrapper around plt.bar that accepts yerr as a positional argument. - mwaskom

2 Answers

9
votes

You could do something like

import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import sem
tips = sns.load_dataset("tips")

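# Precompute the bar heights (mean) and the error bar sizes (standard error)
tip_sumstats = (tips.groupby(["day", "sex", "smoker"])
                     .total_bill
                     .agg(["mean", sem])
                     .reset_index())

def errplot(x, y, yerr, **kwargs):
    ax = plt.gca()
    # map_dataframe passes each facet's subset as the "data" keyword
    data = kwargs.pop("data")
    # x, y and yerr are column names, which pandas looks up in data
    data.plot(x=x, y=y, yerr=yerr, kind="bar", ax=ax, **kwargs)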

g = sns.FacetGrid(tip_sumstats, col="sex", row="smoker")
g.map_dataframe(errplot, "day", "mean", "sem")

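
If grouped bars per hue level are also needed, as in the question's hue='model', one possible adaptation is to pivot each facet's data before plotting. This is only a sketch: grouped_errplot and the stand-in dataframe are hypothetical, and it assumes each (output, feature, model) combination appears exactly once, otherwise pivot raises an error.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for pltdf; the values are made up for illustration
pltdf = pd.DataFrame({
    "model":   ["first", "first", "third", "third"] * 2,
    "output":  ["one"] * 4 + ["two"] * 4,
    "feature": ["a", "b", "a", "b"] * 2,
    "mean":    [0.5, 0.3, 0.4, 0.2, 9.0, 0.6, 8.0, 0.7],
    "std":     [0.05, 0.02, 0.04, 0.01, 2.0, 0.05, 1.5, 0.06],
})

def grouped_errplot(x, y, yerr, hue, **kwargs):
    """Draw grouped bars for one facet, with error bars from a precomputed column."""
    data = kwargs.pop("data")
    kwargs.pop("color", None)  # let pandas assign one color per hue level
    # One row per x category, one column per hue level
    heights = data.pivot(index=x, columns=hue, values=y)
    errors = data.pivot(index=x, columns=hue, values=yerr)
    # pandas matches the error columns to the height columns by name
    heights.plot(kind="bar", yerr=errors, ax=plt.gca(), **kwargs)

g = sns.FacetGrid(pltdf, col="output", col_wrap=2, sharey=False)
g.map_dataframe(grouped_errplot, "feature", "mean", "std", "model")
```

Here the hue column is passed as a fourth positional argument instead of FacetGrid's own hue parameter, so each facet is drawn in a single call and pandas handles the side-by-side bar placement.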

0
votes

Here is another approach:

import matplotlib.pyplot as plt
import numpy as np

plt.plot(np.asarray([[0, 0], [1, 1]]).T, np.asarray([[0.3, 0.4], [0.01 , 0.02]]).T)
plt.show()

The x values correspond to the categorical positions of the bar chart (0 is the first category, and so on). The y values give the lower and upper limits of the error bars. Each row of the arrays describes one error bar, so both arrays must be transposed before plotting, because matplotlib draws one line per column. I just find it more readable to write them this way.
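
As a rough sketch of how this could be combined with precomputed means and standard deviations, the limits can be built as mean - std and mean + std and drawn on top of ordinary bars. The numbers below are made up, chosen so the limits match the ones used above.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up values for two categories
means = np.array([0.35, 0.015])
stds = np.array([0.05, 0.005])

positions = np.arange(len(means))  # 0 is the first category, 1 the second
lower, upper = means - stds, means + stds

fig, ax = plt.subplots()
ax.bar(positions, means, color="lightgray")
# Row 0 holds the start of every segment and row 1 the end,
# so each column is one vertical error bar
lines = ax.plot(np.vstack([positions, positions]),
                np.vstack([lower, upper]), color="black")
```

Stacking the arrays this way produces the one-column-per-line layout directly, so no transpose is needed. For symmetric errors like these, passing yerr=stds to ax.bar draws equivalent bars in one call.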

[image: Error bars]