7
votes

I'm using some fairly straightforward SQL code to calculate the regression coefficients (intercept and slope) of some (x, y) data points by least squares. This gives me a nice best-fit line through the data. However, we would like to be able to see the 95% and 5% confidence limits for the line of best fit (the curves below).

[Figure: best-fit line with upper and lower confidence curves (source: curvefit.com)]

What these mean is that the true line has a 95% probability of being below the upper curve and a 95% probability of being above the lower curve. How can I calculate these curves? I have already read Wikipedia and done some googling, but I haven't found mathematical equations I can understand well enough to calculate this.

Edit: here is the essence of what I have right now.

--sample data
create table #lr (x real not null, y real not null)
insert into #lr values (0,1)
insert into #lr values (4,9)
insert into #lr values (2,5)
insert into #lr values (3,7)

declare @slope real
declare @intercept real

--calculate slope and intercept
select 
@slope = ((count(*) * sum(x*y)) - (sum(x)*sum(y)))/
((count(*) * sum(Power(x,2)))-Power(Sum(x),2)),
@intercept = avg(y) - ((count(*) * sum(x*y)) - (sum(x)*sum(y)))/
((count(*) * sum(Power(x,2)))-Power(Sum(x),2)) * avg(x)
from #lr

Thank you in advance.

3 Answers

1
votes

An equation for confidence interval width as f(x) is given here under "Confidence Interval on Fitted Values"

http://www.weibull.com/DOEWeb/confidence_intervals_in_simple_linear_regression.htm
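
For reference, the usual textbook form of that interval for simple linear regression is sketched below. The notation is my own and not necessarily the page's: n is the number of points, x_0 is the x value where the band is evaluated, and t is the Student-t critical value with n - 2 degrees of freedom. For the one-sided 95% curves described in the question, t would be the 0.95 quantile.

```latex
\hat{y}(x_0) \;\pm\; t_{\,n-2}\; s\,
\sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}},
\qquad
s = \sqrt{\frac{\sum_{i=1}^{n} \bigl(y_i - \hat{y}_i\bigr)^2}{n-2}},
\qquad
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2
```

The band is narrowest at x_0 = x̄ and widens as x_0 moves away from the mean, which is why the curves bow outward at both ends.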

The page walks you through an example calculation too.
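
As a rough sketch of how that kind of calculation might look in T-SQL against the `#lr` table from the question: the variable names (`@b0`, `@b1`, `@s`, `@t`, and so on) are my own, and the critical value `@t` is an assumption that has to be looked up by hand in a Student-t table, since T-SQL has no built-in function for it. Treat this as untested illustration rather than a definitive implementation.

```sql
-- Assumes the #lr table from the question exists and holds at least 3 rows.
declare @n float, @xbar float, @ybar float
declare @sxx float, @sxy float, @b0 float, @b1 float, @s float
declare @t float

-- Assumption: Student-t critical value for n - 2 degrees of freedom, taken
-- from a printed table (2.920 is the one-sided 95% value for 2 d.f.).
set @t = 2.920

select @n = count(*), @xbar = avg(x), @ybar = avg(y) from #lr

-- sums of squares about the means
select @sxx = sum((x - @xbar) * (x - @xbar)),
       @sxy = sum((x - @xbar) * (y - @ybar))
from #lr

-- least-squares slope (@b1) and intercept (@b0)
set @b1 = @sxy / @sxx
set @b0 = @ybar - @b1 * @xbar

-- residual standard error: sqrt(SSE / (n - 2))
select @s = sqrt(sum(power(y - (@b0 + @b1 * x), 2)) / (@n - 2))
from #lr

-- fitted value plus lower and upper band at each observed x
select x,
       @b0 + @b1 * x as fit,
       @b0 + @b1 * x - @t * @s * sqrt(1 / @n + power(x - @xbar, 2) / @sxx) as lower_band,
       @b0 + @b1 * x + @t * @s * sqrt(1 / @n + power(x - @xbar, 2) / @sxx) as upper_band
from #lr
order by x
```

Note that the four sample points in the question all lie exactly on y = 2x + 1, so the residual standard error is zero and both bands collapse onto the fitted line; noisier data is needed to see the curves separate. To evaluate the band at x values that are not in the table, the final select could be run against a table of grid points instead.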

0
votes

Try this site and scroll down to the middle. For each point on your best-fit line, you know your Z, your sample size, and your standard deviation.

http://www.stat.yale.edu/Courses/1997-98/101/confint.htm

0
votes

@PowerUser: He needs to use the equations for two-variable setups, not for one-variable setups.

Matt: If I had my old statistics textbook with me, I'd be able to tell you what you want; unfortunately, I don't have it, nor do I have my notes from my high school statistics course. Then again, from what I remember, it may only have covered the confidence interval for the regression line's slope...

Anyway, this page will hopefully be of some help: http://www.stat.yale.edu/Courses/1997-98/101/linregin.htm.