
I am a bit confused about how to use seaborn.stripplot() to plot multiple columns of data points when the data do not have "categorical" labels.

For example, users can plot "grouped" scatterplots as follows, with the tips dataset:

import numpy as np
import pandas as pd
import matplotlib
import matplotlib.pyplot as plt

import seaborn as sns

tips = sns.load_dataset("tips")   # seaborn example dataset

print(tips)

     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4
5         25.29  4.71    Male     No   Sun  Dinner     4
....      .....      .....      .....

The measurements are grouped by the categorical column day, so we can produce a grouped scatterplot as follows:

sns.stripplot(x="day", y="total_bill", data=tips)

Now, I would like to reproduce this "grouped scatterplot" format with non-categorical data, where each group is a separate column:

df = pd.read_csv("my_data.csv")

df

      total_bill_A   total_bill_B    total_bill_C   total_bill_D      
0     16.99          21.01           15.99          14.50  
1     10.34          21.66           12.99          16.50  
2     21.01          23.50           7.25           17.50   
3     23.68          23.31           9.99           12.50 
4     24.59          23.61           10.00          15.50  
5     25.29          24.71           11.00          19.50   
....               ....

The y-axis here is price, and the x-axis should show each of these columns (total_bill_A, total_bill_B, total_bill_C, and total_bill_D), just as the plot above shows Thursday, Friday, Saturday, and Sunday.

How could I plot something like this with seaborn? Is it possible to do this with seaborn.stripplot()?

1 Answer


You can melt the dataframe into long format, naming the variable and value columns, and then pass those column names to stripplot as follows:

df_strip = pd.melt(df, var_name='total_bill', value_name='price')
sns.stripplot(x="total_bill", y="price", data=df_strip)
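
To make the reshaping step concrete, here is a minimal sketch of what melt does to a frame shaped like the one in the question. The small DataFrame is a hypothetical stand-in for my_data.csv, built from the first three rows shown above, and plot_strips is just an illustrative helper name, not part of seaborn:

```python
import pandas as pd

# Hypothetical stand-in for my_data.csv (first three rows from the question).
df = pd.DataFrame({
    "total_bill_A": [16.99, 10.34, 21.01],
    "total_bill_B": [21.01, 21.66, 23.50],
    "total_bill_C": [15.99, 12.99, 7.25],
    "total_bill_D": [14.50, 16.50, 17.50],
})

# Wide -> long: each value becomes one row, labelled with the name of
# the column it came from. That label column acts as the category.
df_strip = pd.melt(df, var_name="total_bill", value_name="price")

print(df_strip.shape)    # (12, 2): 4 columns x 3 rows, stacked
print(df_strip.head(4))  # three total_bill_A rows, then the first total_bill_B row


def plot_strips(long_df):
    """Illustrative helper: draw one strip per original column.

    Imports seaborn lazily so the reshaping above runs without it.
    """
    import seaborn as sns
    return sns.stripplot(x="total_bill", y="price", data=long_df)
```

Calling plot_strips(df_strip) followed by plt.show() should give one strip per original column. As far as I know, seaborn's categorical plots also accept wide-form data directly, so sns.stripplot(data=df) may give a similar result without melting, but the melted form lets you control the axis labels.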
