I am creating a series of boxplots in order to compare different cancer types with each other (based on 5 categories). For plotting I use seaborn/matplotlib. It works fine for most of the cancer types (see image right) however in some the x axis collapses slightly (see image left) or strongly (see image middle) https://i.imgur.com/dxLR4B4.png
Looking into the code how seaborn plots a box/violin plot https://github.com/mwaskom/seaborn/blob/36964d7ffba3683de2117d25f224f8ebef015298/seaborn/categorical.py (line 961)
violin_data = remove_na(group_data[hue_mask])
I realized that this happens when there are too many nans
Is there any possibility to prevent this collapsing by code only I do not want to modify my dataframe (replace the nans by zero)
Below you find my code:
boxp_df=pd.read_csv(pf_in,sep="\t",skip_blank_lines=False)
fig, ax = plt.subplots(figsize=(10, 10))
sns.violinplot(data=boxp_df, ax=ax)
plt.xticks(rotation=-45)
plt.ylabel("label")
plt.tight_layout()
plt.savefig(pf_out)
The output is a per cancer type differently sized plot (depending on if there is any category completely nan) I am expecting each plot to be in the same width.
Update trying to use the order parameter as suggested leads to the following output: https://i.imgur.com/uSm13Qw.png
Maybe this toy example helps ?
|Cat1|Cat2|Cat3|Cat4|Cat5
|3.93| |0.52| |6.01
|3.34| |0.89| |2.89
|3.39| |1.96| |4.63
|1.59| |3.66| |3.75
|2.73| |0.39| |2.87
|0.08| |1.25| |-0.27
Update Apparently, the problem is not the data but the length of the title https://github.com/matplotlib/matplotlib/issues/4413
Therefore I would close the question @Diziet should I delete it or does my issue might help other ones? Sorry for not including the line below in the code example:
ax.set_title("VERY LONG TITLE", fontsize=20)