I am trying to make a boxplot of an 18-year record of monthly rainfall and flood frequency. Each x tick is a month, and each month should have two boxplots: one for rainfall and one for flood frequency. So far I have managed to plot these using seaborn (see the code and image below), but I do not know how to create the boxplot with two y-axes, which I need because the two variables have very different scales.

The data looks like this (the largest value of Flood_freq in the dataset is 7, not shown here):

    Group   Rainfall    Flood_freq
0   Jan     115.679997  0
1   Jan     72.929999   0
2   Jan     39.719999   0
3   Jan     46.799999   1
4   Jan     54.989998   0
...
212 Dec     51.599998   0
213 Dec     45.359999   0
214 Dec     10.260000   0
215 Dec     52.709998   0

This is the code I have used:

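import pandas as pd
import seaborn as sns

# reshape to long format: one row per (Group, Data, value)
dd = pd.melt(FBPdf, id_vars=['Group'], value_vars=['Rainfall', 'Flood_freq'], var_name='Data')
sns.boxplot(x='Group', y='value', data=dd, hue='Data')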

Which results in this:

[image: resulting seaborn boxplot]

I have since looked at the seaborn documentation, and it does not seem to support two y-axes (see "Seaborn boxplot with 2 y-axes"). Can anyone suggest alternatives for what I am trying to achieve? The solutions at the link above do not cover my case, which combines two y-axes with grouped boxplots.

Thank you very much in advance!

Is there any particular reason why you want to use seaborn for this? This would be perfectly doable using just matplotlib and numpy. – Thomas Kühn

1 Answer

With some fake data and a little help from this tutorial and this answer, here is a minimal example of how to achieve what you want using only numpy and matplotlib:

from matplotlib import pyplot as plt
import numpy as np

# fake data: 18 years x 12 months, flattened year by year
# (reshape(-1, 12) below gives one column per month; boxplot draws one box per column)
rainfall = np.random.rand(12*18)*300
floods = np.random.rand(12*18)*2

fig, ax1 = plt.subplots()

months = [
    'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
    'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec',
]


ax1.set_xlabel('month')
ax1.set_ylabel('rainfall', color='tab:blue')
res1 = ax1.boxplot(
    rainfall.reshape(-1,12), positions = np.arange(12)-0.25, widths=0.4,
    patch_artist=True,
)
for element in ['boxes', 'whiskers', 'fliers', 'means', 'medians', 'caps']:
    plt.setp(res1[element], color='k')

for patch in res1['boxes']:
    patch.set_facecolor('tab:blue')



ax2 = ax1.twinx()  # instantiate a second axes that shares the same x-axis
ax2.set_ylabel('floods', color='tab:orange')
res2 = ax2.boxplot(
    floods.reshape(-1,12), positions = np.arange(12)+0.25, widths=0.4,
    patch_artist=True,
)
##from https://stackoverflow.com/a/41997865/2454357
for element in ['boxes', 'whiskers', 'fliers', 'means', 'medians', 'caps']:
    plt.setp(res2[element], color='k')

for patch in res2['boxes']:
    patch.set_facecolor('tab:orange')

ax1.set_xlim([-0.55, 11.55])
ax1.set_xticks(np.arange(12))
ax1.set_xticklabels(months)

fig.tight_layout()  # otherwise the right y-label is slightly clipped
plt.show()

The result looks something like this:

[image: result of the above code]

I think with a little fine-tuning this can actually look quite nice.
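To use this with a DataFrame shaped like the one in the question instead of the fake arrays, something along these lines might work. It is only a sketch: the helper name `twin_boxplot`, the `MONTHS` list, and the synthetic stand-in for `FBPdf` are assumptions of mine, and it assumes the `Group` column holds exactly the month abbreviations shown in the question. It collects one array of values per month and passes the list to `boxplot` on each of the two axes.

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
          'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


def twin_boxplot(df, months=MONTHS):
    """Hypothetical helper: Rainfall on the left axis, Flood_freq on the right."""
    # One array of values per month, in calendar order.
    rain = [df.loc[df['Group'] == m, 'Rainfall'].to_numpy() for m in months]
    flood = [df.loc[df['Group'] == m, 'Flood_freq'].to_numpy() for m in months]
    centres = np.arange(len(months))

    fig, ax1 = plt.subplots()
    ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
    specs = [(ax1, rain, -0.2, 'tab:blue', 'Rainfall'),
             (ax2, flood, +0.2, 'tab:orange', 'Flood_freq')]
    for ax, data, shift, colour, label in specs:
        # Shift each variable's boxes left/right of the month tick.
        res = ax.boxplot(data, positions=centres + shift, widths=0.35,
                         patch_artist=True)
        for patch in res['boxes']:
            patch.set_facecolor(colour)
        ax.set_ylabel(label, color=colour)

    # boxplot() puts ticks at the shifted positions, so reset them afterwards.
    ax1.set_xlabel('month')
    ax1.set_xlim(-0.6, len(months) - 0.4)
    ax1.set_xticks(centres)
    ax1.set_xticklabels(months)
    fig.tight_layout()
    return fig, ax1, ax2


# Synthetic stand-in for FBPdf (assumed column names taken from the question).
rng = np.random.default_rng(0)
FBPdf = pd.DataFrame({
    'Group': np.tile(MONTHS, 18),
    'Rainfall': rng.uniform(0, 300, 12 * 18),
    'Flood_freq': rng.integers(0, 8, 12 * 18),
})
fig, ax1, ax2 = twin_boxplot(FBPdf)
```

Call `plt.show()` (or `fig.savefig(...)`) afterwards to see the figure. Passing a list of arrays rather than a reshaped 2D array means the months do not need to have the same number of observations.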