I have generated a grouped boxplot with Seaborn:

sns.boxplot(x="DATE", y="Rate", data=mydata)

I have 15 boxes for 15 different dates, and now I'd like to add one more box to the same plot that shows the overall distribution (that is, all groups combined).

If I simply do this:

sns.boxplot(x="DATE", y="Rate", data=mydata)
sns.boxplot(y=mydata["Rate"])

I can generate a single plot showing all boxes but I cannot arrange my xticklabels properly. Is there a better way to add the combined boxplot? Alternatively, how can I set the xticklabels?

Thank you!

1 Answer

Seaborn's boxplot doesn't seem to allow combining boxplots from 2 separate calls. However, you can use the underlying matplotlib boxplot to achieve the desired combination.

The first parameter to plt.boxplot (or ax.boxplot) is a list in which each entry holds the data for one box. Here the list contains one dataset per day, plus a final entry with all rates for the overall box. The positions= parameter sets the x-position of each box. patch_artist=True draws the boxes as patches that can be filled (by default a box consists of lines only). The color of the median line can be changed via medianprops so that it stays visible against the colors chosen for the boxes.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# create some toy data for 15 days
dates = pd.date_range('2020-10-01', freq='D', periods=15)
df = pd.DataFrame({'DATE': pd.to_datetime(np.random.choice(dates, 500)),
                   'Rate': np.random.uniform(2, 10, 500)})

fig, ax = plt.subplots(figsize=(12, 4))
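# one dataset per day, followed by all rates for the overall box
bp = ax.boxplot([df.loc[df['DATE'] == d, 'Rate'] for d in dates] + [df['Rate']],
                positions=range(len(dates) + 1), patch_artist=True,
                medianprops={'color': 'navy'})
# assign colors as if they were set with seaborn; the boxes are taken from the
# dict returned by boxplot, because ax.artists may be empty in recent matplotlib versions
for box, color in zip(bp['boxes'], sns.color_palette('husl', len(bp['boxes']))):
    box.set_color(color)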
# set the labels for the x-ticks
ax.set_xticklabels([str(d)[:10] for d in dates] + ['overall'], rotation=45)
# optionally add a vertical line to separate the special box
ax.axvline(len(dates) - 0.5, color='black', ls=':')
plt.tight_layout()
plt.show()

[Figure: example plot]

PS: The question "Ordering and Formatting Dates on X-Axis in Seaborn Bar Plot" shows a way to set xticks with dates in a seaborn barplot or boxplot.
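
If staying with seaborn is preferred, a possible alternative is to append a copy of the data whose DATE is replaced by the label 'overall', so that a single sns.boxplot call draws all 16 boxes. This is only a sketch under assumptions: the toy data mirrors the example above, and the names labelled, overall, combined and order are made up for illustration. The dates are converted to strings so that they can share a column with the 'overall' label.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# toy data for 15 days, same shape as in the example above (seeded for repeatability)
rng = np.random.default_rng(0)
dates = pd.date_range('2020-10-01', freq='D', periods=15)
df = pd.DataFrame({'DATE': dates[rng.integers(0, len(dates), 500)],
                   'Rate': rng.uniform(2, 10, 500)})

# convert the dates to string labels so 'overall' can live in the same column
labelled = df.assign(DATE=df['DATE'].dt.strftime('%Y-%m-%d'))
# a second copy of all rows, every row labelled 'overall'
overall = df.assign(DATE='overall')
combined = pd.concat([labelled, overall], ignore_index=True)

# ISO-formatted date strings sort chronologically; 'overall' goes last
order = sorted(labelled['DATE'].unique()) + ['overall']

fig, ax = plt.subplots(figsize=(12, 4))
sns.boxplot(x='DATE', y='Rate', data=combined, order=order, ax=ax)
ax.tick_params(axis='x', labelrotation=45)
# optional dotted line between the last date and the overall box
ax.axvline(len(order) - 1.5, color='black', ls=':')
plt.tight_layout()
```

Calling plt.show() afterwards displays the figure. The tick labels come straight from the DATE column here, so no separate set_xticklabels call is needed. The trade-off is that the data is duplicated in memory, which should not matter for a dataset of this size.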