0
votes

I want to add the HDIs (highest density intervals) that I computed (columns hdi_both, hdi_one, and lower_upper in the df below) to the bar plot.

However, I cannot figure out how to add error bars/CIs so that each error bar has custom upper and lower bounds that are independent of the y value (here, the corresponding value in proportion_correct).

For example, the HDI for Exp. 1 with guesses_correct 'both' has a lower bound of 0.000000 and an upper bound of 0.130435, while the proportion_correct is 0.000000.

All the options I have seen require specifying the upper and lower bounds relative to the value on the y axis, which is not what I'm looking for.

Any help will be greatly appreciated.

Thanks,

Ayala

import os
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
 'exp': ['Exp. 1', 'Exp. 1', 'Exp. 2', 'Exp. 2', 'Exp. 3', 'Exp. 3', 'Exp. 4', 'Exp. 4', 'Exp. 5', 'Exp. 5',
 'Collapsed', 'Collapsed'],
 'proportion_correct': [0.0, 0.304347826, 0.058823529000000006, 0.31372549, 0.047619048, 0.333333333, 0.12244898, 0.428571429, 0.12244898, 0.367346939, 0.082901554, 0.35751295299999997],
 'guesses_correct': ['both', 'one', 'both', 'one', 'both', 'one', 'both', 'one', 'both', 'one', 'both', 'one'],
 'hdi_both': [0.0, 0.130434783, 0.0, 0.078431373, 0.0, 0.1, 0.0, 0.08, 0.0, 0.081632653, 0.005181347, 0.051813472],
 'hdi_one': [0.130434783, 0.47826087, 0.156862745, 0.41176470600000004, 0.1, 0.5, 0.16, 0.4, 0.163265306, 0.408163265, 0.21761658, 0.341968912],
 'lower_upper': ['lower', 'upper', 'lower', 'upper', 'lower', 'upper', 'lower', 'upper', 'lower', 'upper', 'lower', 'upper']
})

print(df.head())
Out[4]: 
      exp  proportion_correct guesses_correct  hdi_both   hdi_one lower_upper
0  Exp. 1            0.000000            both  0.000000  0.130435       lower
1  Exp. 1            0.304348             one  0.130435  0.478261       upper
2  Exp. 2            0.058824            both  0.000000  0.156863       lower
3  Exp. 2            0.313725             one  0.078431  0.411765       upper
4  Exp. 3            0.047619            both  0.000000  0.100000       lower
# Make bar plot
sns.barplot(x='exp',
            y='proportion_correct',
            hue='guesses_correct',
            data=df)

plt.ylim([0, 0.5])
plt.xlabel('Experiment')
plt.ylabel('Proportion Correct')
plt.legend(title='Correct guesses', loc='upper right')
plt.axhline(y=0.277777, color='dimgray', linestyle='--')
plt.annotate(' chance\n one', (5.5, 0.27))
plt.axhline(y=0.02777, color='dimgray', linestyle='--')
plt.annotate(' chance\n both', (5.5, 0.02))
# Show the plot
plt.show()

This is the bar plot to which I want to add the HDIs: [image]

2 Answers

1
votes

I ended up plotting vertical lines as error bars. Here is my code in case it helps someone.

df = pd.DataFrame({'exp': ['Exp. 1', 'Exp. 1', 'Exp. 2', 'Exp. 2', 'Exp. 3', 'Exp. 3', 'Exp. 4', 'Exp. 4', 'Exp. 5', 'Exp. 5', 'Collapsed', 'Collapsed'],
                   'proportion_correct': [0.0, 0.304347826, 0.058823529000000006, 0.31372549, 0.047619048, 0.333333333, 0.12244898, 0.428571429, 0.12244898, 0.367346939, 0.082901554, 0.35751295299999997],
                   'guesses_correct': ['both', 'one', 'both', 'one', 'both', 'one', 'both', 'one', 'both', 'one', 'both', 'one'], 
                   'hdi_low': [0.0, 0.130434783, 0.0, 0.156862745, 0.0, 0.1, 0.0, 0.16, 0.0, 0.163265306, 0.005181347, 0.21761658],
                   'hdi_high': [0.130434783, 0.47826087, 0.078431373, 0.41176470600000004, 0.1, 0.5, 0.08, 0.4, 0.081632653, 0.408163265, 0.051813472, 0.341968912]
                  })
df.head()
Out[4]: 
  exp  proportion_correct guesses_correct   hdi_low  hdi_high
0  Exp. 1            0.000000            both  0.000000  0.130435
1  Exp. 1            0.304348             one  0.130435  0.478261
2  Exp. 2            0.058824            both  0.000000  0.078431
3  Exp. 2            0.313725             one  0.156863  0.411765
4  Exp. 3            0.047619            both  0.000000  0.100000

The axvlines and axhlines functions used below were taken from "How to draw vertical lines on a given plot in matplotlib". I omit them here for brevity.

    # Make bar plot
    x_col = 'exp'
    y_col = 'proportion_correct'
    hue_col = 'guesses_correct'
    low_col = 'hdi_low'
    high_col = 'hdi_high'
    plot = sns.barplot(x=x_col,
                y=y_col,
                hue=hue_col,
                data=df)
    plt.ylim([0, 0.55])
    plt.yticks([0, 0.1, 0.2, 0.3, 0.4, 0.5], [0, 0.1, 0.2, 0.3, 0.4, 0.5])
    plt.xlabel('Experiment')
    plt.ylabel('Proportion Correct')
    plt.legend(title='Correct guesses', loc='upper right')
    plt.axhline(y=0.277777, color='dimgray', linestyle='--')
    plt.annotate(' chance\n one', (5.65, 0.27))
    plt.axhline(y=0.02777, color='dimgray', linestyle='--')
    plt.annotate(' chance\n both', (5.65, 0.02))
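    # (low, high) pair for each bar, in row order
    lims_x = list(zip(df[low_col].to_list(), df[high_col].to_list()))
    # x positions of the bar centres (two hue levels per experiment)
    xss = [-0.2, 0.2, 0.8, 1.2, 1.8, 2.2, 2.8, 3.2, 3.8, 4.2, 4.8, 5.2]
    # Flattened bounds: one y position per horizontal cap
    yss = [i for sub in lims_x for i in sub]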
    lims_y = [(-0.3, -0.1), (-0.3, -0.1), (0.1, 0.3), (0.1, 0.3), (0.7, 0.9), (0.7, 0.9), (1.1, 1.3), (1.1, 1.3),
              (1.7, 1.9), (1.7, 1.9), (2.1, 2.3), (2.1, 2.3), (2.7, 2.9), (2.7, 2.9), (3.1, 3.3),  (3.1, 3.3),
              (3.7, 3.9), (3.7, 3.9), (4.1, 4.3), (4.1, 4.3), (4.7, 4.9), (4.7, 4.9), (5.1, 5.3), (5.1, 5.3)]
    for xs, lim in zip(xss, lims_x):
        plot = axvlines(xs, lims=lim, color='black')
    for yx, lim in zip(yss, lims_y):
        plot = axhlines(yx, lims=lim, color='black')
    plt.show()

And this is the resulting plot: [image]
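The same capped lines can probably be drawn without the borrowed helpers, using Matplotlib's built-in vlines and hlines methods on the axes that sns.barplot returns. The following is only a sketch: the names dodged_bar_centers and draw_intervals are hypothetical, and the layout it computes assumes seaborn's default bar arrangement (groups at 0, 1, 2, ... with the hue levels sharing a total width of 0.8), which is what the hand-typed xss list above encodes.

```python
def dodged_bar_centers(n_groups, n_hues, width=0.8):
    """Return the x centre of every bar, ordered group by group, hue by hue.

    Assumption: seaborn's default layout, i.e. groups at 0, 1, 2, ... and
    all hue levels of a group sharing a total width of `width`.
    """
    bar_width = width / n_hues
    return [group + (hue - (n_hues - 1) / 2) * bar_width
            for group in range(n_groups)
            for hue in range(n_hues)]


def draw_intervals(ax, centers, lows, highs, cap=0.1, color="black"):
    """Draw one capped vertical segment per bar, from low to high.

    The bounds are absolute y values, so they do not depend on bar height.
    """
    for x, low, high in zip(centers, lows, highs):
        ax.vlines(x, low, high, colors=color)                    # the interval
        ax.hlines([low, high], x - cap, x + cap, colors=color)   # its two caps
```

With the bar plot stored as ax = sns.barplot(...), calling draw_intervals(ax, dodged_bar_centers(6, 2), df['hdi_low'], df['hdi_high']) before plt.show() should give a comparable figure, provided the rows of df are ordered experiment by experiment with 'both' before 'one', as they are here.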

0
votes

You have calculated the lower and upper bounds of your error bars as absolute values, but Matplotlib expects them as lower and upper errors relative to a particular y-value. It is easy to get these "relative" lengths by subtracting the y-value from the bounds you calculated.

You can then use plt.errorbar() to plot. Note that to use this function, all error values must be non-negative.
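To make the subtraction concrete, here is a minimal sketch of the conversion on its own. The helper name bounds_to_yerr is hypothetical and not part of Matplotlib. It returns the (2, N) array that plt.errorbar accepts as yerr (first row: distance below y, second row: distance above y) and raises an error when a y-value lies outside its interval, since taking absolute values would hide such a case. In the question's data this applies to the Collapsed/'one' row, where the proportion 0.3575 is above the upper bound 0.3420.

```python
import numpy as np


def bounds_to_yerr(y, low, high):
    """Convert absolute interval bounds into offsets relative to y.

    Returns an array of shape (2, N): row 0 holds the distances below y,
    row 1 the distances above y.
    """
    y, low, high = (np.asarray(a, dtype=float) for a in (y, low, high))
    below = y - low
    above = high - y
    if (below < 0).any() or (above < 0).any():
        raise ValueError("some y values lie outside their [low, high] interval")
    return np.vstack([below, above])


# First two rows of the question's data: 'both' and 'one' for Exp. 1.
yerr = bounds_to_yerr(y=[0.0, 0.304347826],
                      low=[0.0, 0.130434783],
                      high=[0.130434783, 0.47826087])
```

The result can be passed directly as plt.errorbar(x, y, yerr=yerr, fmt='none'). For rows where the y-value falls outside the interval, drawing the absolute bounds with plain line segments is probably the safer choice.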

Since you are using a hue= split, you have to iterate through the different levels of hue, and take into account the shift of the bars (by default -0.2 and +0.2 for two levels of hue):

import numpy as np

# Make bar plot
x_col = 'exp'
y_col = 'proportion_correct'
hue_col = 'guesses_correct'
low_col = 'hdi_both'
high_col = 'hdi_one'
sns.barplot(x=x_col,
            y=y_col,
            hue=hue_col,
            data=df)

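# groupby() sorts the hue levels ('both', 'one'), which here matches the bar order
for (h, g), pos in zip(df.groupby(hue_col), [-0.2, 0.2]):
    # Absolute bounds -> distances below/above each bar height, shape (2, N)
    err = g[[low_col, high_col]].subtract(g[y_col], axis=0).abs().T.values
    # Bar centres: group positions 0, 1, 2, ... shifted by the hue offset
    x = np.arange(len(g[x_col].unique())) + pos
    plt.errorbar(x=x, y=g[y_col], yerr=err, fmt='none', capsize=5, ecolor='k')

plt.show()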
