1
votes

I have a very simple case of 3 data points and would like to fit a line y = a0 + a1*x through them using np.polyfit or scipy.stats.linregress.

For the subsequent error propagation I need the errors in the slope and the intercept. I am no expert in statistics, but on the scipy side I am only aware of stderr, which is not split into slope and intercept. Polyfit has the possibility to estimate the covariance matrix, but this does not work with only 3 data points.

QtiPlot, for example, reports errors for both slope and intercept:

B (y-intercept) = 9,291335740072202e-12 +/- 2,391260092282606e-13
A (slope) = 2,527075812274368e-12 +/- 6,878180102259077e-13

What would be the appropriate way to calculate these in Python?

EDIT:

np.polyfit(x, y, 1, cov=True)

results in

ValueError: the number of data points must exceed order + 2 for Bayesian estimate the covariance matrix

1
How does the covariance matrix for polyfit "not work"? Please show us your example where it fails. - 9769953
Why aren't you using scipy.stats.linregress? It gives slope, intercept, correlation coefficient, p-value & standard error. - DrBwts

1 Answer

0
votes

scipy.stats.linregress gives you slope, intercept, correlation coefficient, p-value & standard error. The standard error it returns is that of the estimated slope, and it is computed from the distances of the points from the fitted line (the residuals). Have a read through this to clear up the point
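To make the standard-error part concrete, here is a minimal sketch of reading the uncertainties off the linregress result object. The data values are made up for illustration. The `intercept_stderr` attribute only exists in newer SciPy releases (1.6 and later, as far as I know), so the sketch falls back to `None` when it is missing.

```python
import numpy as np
from scipy import stats

# Made-up illustration data: three points, as in the question
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])

res = stats.linregress(x, y)

# stderr is the standard error of the estimated slope
slope_err = res.stderr

# intercept_stderr is only present in newer SciPy versions (assumption: >= 1.6)
intercept_err = getattr(res, "intercept_stderr", None)

print("slope     =", res.slope, "+/-", slope_err)
print("intercept =", res.intercept, "+/-", intercept_err)
```

With only three points there is a single degree of freedom left after fitting two parameters, so these error estimates are themselves quite uncertain.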

An example...

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

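points = np.array([[1, 3], [2, 4], [2, 7]])
# Pass x and y separately; each row of points is one (x, y) pair
x_data, y_data = points[:, 0], points[:, 1]
slope, intercept, r_value, p_value, std_err = stats.linregress(x_data, y_data)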

print("slope = ", slope)
print("intercept = ", intercept)
print("R = ", r_value)
print("p = ", p_value)
print("Standard error = ", std_err)

for xy in points:
    plt.plot(xy[0], xy[1], 'ob')

x = np.linspace(0, 10, 100)
y = slope * x + intercept
plt.plot(x, y, '-r')

plt.grid()
plt.show()
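If separate errors for slope and intercept are needed for error propagation, one option is to compute the parameter covariance matrix by hand from the ordinary least-squares formulas: s^2 = SSR / (n - 2) and cov = s^2 * (A^T A)^-1, where A has a column of ones and a column of x. The sketch below assumes unweighted data with equal, unknown measurement errors; the function name `linear_fit_with_errors` and the sample data are hypothetical.

```python
import numpy as np

def linear_fit_with_errors(x, y):
    """Unweighted least-squares fit of y = a0 + a1*x (hypothetical helper).

    Returns (coef, errs, cov), where coef = [a0, a1], errs holds their
    standard errors and cov is the 2x2 parameter covariance matrix.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if n < 3:
        raise ValueError("need at least 3 points to estimate parameter errors")

    # Design matrix: first column for the intercept, second for the slope
    A = np.column_stack([np.ones(n), x])
    coef, _, _, _ = np.linalg.lstsq(A, y, rcond=None)

    # Residual variance with n - 2 degrees of freedom (two fitted parameters)
    resid = y - A @ coef
    s2 = resid @ resid / (n - 2)

    # Parameter covariance matrix; the diagonal holds the variances
    cov = s2 * np.linalg.inv(A.T @ A)
    errs = np.sqrt(np.diag(cov))
    return coef, errs, cov

# Made-up illustration data
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])

coef, errs, cov = linear_fit_with_errors(x, y)
print("intercept =", coef[0], "+/-", errs[0])
print("slope     =", coef[1], "+/-", errs[1])
print("cov(intercept, slope) =", cov[0, 1])
```

The off-diagonal element of `cov` is worth keeping: slope and intercept are usually correlated, and ignoring that correlation can give misleading results when propagating errors.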