I'm learning the very basics of data science and started with regression analysis. I decided to build a linear regression model to examine the linear relationship between two variables (chemical_1 and chemical_2) from this dataset.
I made chemical_1 the predictor (independent variable) and chemical_2 the target (dependent variable), then used scipy.stats.linregress to calculate a regression line.
from scipy import stats
X = df['chemical_1']
Y = df['chemical_2']
slope, intercept, r_value, p_value, slope_std_error = stats.linregress(X,Y)
predict_y = slope * X + intercept
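For reference, here is a minimal self-contained sketch of what the linregress call above returns. It uses synthetic data because the real dataset is not reproduced here; the arrays x and y and the coefficients used to generate them are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in data (an assumption; not the dataset from the question).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# linregress returns the slope, the intercept, the correlation coefficient r,
# the p-value for a zero-slope null hypothesis, and the slope's standard error.
result = stats.linregress(x, y)

# Points on the fitted line, computed the same way as predict_y above.
predicted = result.slope * x + result.intercept
r_squared = result.rvalue ** 2
print(f"slope={result.slope:.3f}, intercept={result.intercept:.3f}, R^2={r_squared:.3f}")
```

The fitted slope and intercept should land close to the values used to generate the data, which is a quick sanity check before plotting.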
I figured out how to plot the regression line with matplotlib.
import matplotlib.pyplot as plt
plt.plot(X, Y, 'o')        # observed data points
plt.plot(X, predict_y)     # fitted regression line
plt.show()
However, I want to plot the regression with Seaborn. The only option I have found so far is the following:
import seaborn as sns
sns.set(color_codes=True)
sns.set(rc={'figure.figsize':(7, 7)})
sns.regplot(x=X, y=Y);
Is there a way to provide Seaborn with my own regression line, predict_y = slope * X + intercept, in order to build the regression plot?
UPD: When using the following solution, proposed by RPyStats, the Y-axis is labeled chemical_1, although it should be chemical_2.
fig, ax = plt.subplots()
sns.set(color_codes=True)
sns.set(rc={'figure.figsize':(8, 8)})
ax = sns.regplot(x=X, y=Y, line_kws={'label':'$y=%3.7s*x+%3.7s$'%(slope, intercept)});
ax.legend()
sns.regplot(x=X, y=Y, fit_reg=False, ax=ax);
sns.regplot(x=X, y=predict_y,scatter=False, ax=ax);
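A likely explanation for the label, sketched under assumptions: pandas keeps a Series' name through scalar arithmetic, so predict_y inherits the name of X, and Seaborn uses that name as the axis label. The series values and coefficients below are made-up placeholders, not the real data or fitted values.

```python
import pandas as pd

# Hypothetical stand-in for df['chemical_1']; the values are made up.
X = pd.Series([1.0, 2.0, 3.0], name="chemical_1")
slope, intercept = 2.0, 0.5  # placeholder coefficients, not fitted values

# Scalar arithmetic on a Series preserves its name.
predict_y = slope * X + intercept
print(predict_y.name)  # name inherited from X

# Renaming the series gives it the target variable's name instead.
predict_y = predict_y.rename("chemical_2")
print(predict_y.name)
```

Passing the renamed series as y should make the axis read chemical_2. Alternatively, calling ax.set_ylabel('chemical_2') after the last plotting call sets the label explicitly.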