I am trying to find the studentized and PRESS residuals of a multiple regression model using Python. I have the following data:

X1  X2  Y
14  25  301
19  32  327
12  22  246
11  15  187

The fitted model is Y = 80.93 − 5.84 X1 + 11.32 X2, with MSresidual = 574.9. I have written the following code to compute those residuals:

import math
def lin_model(X1, X2):
    Y_hat = 80.93 - 5.84 * X1 + 11.32 * X2
    return Y_hat

MSresiduals = 574.9
X1 = [14, 19, 12, 11]
X2 = [25, 32, 22, 15]
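Y = [301, 327, 246, 187]

hii = 0  # placeholder: the leverage hii of each observation is still unknown
print('Residual Standardized_Residual Studentized_Residual PRESS_Residual')
for x1, x2, y in zip(X1, X2, Y):
    err = y - lin_model(x1, x2)                      # ordinary residual
    sd_r = err / math.sqrt(MSresiduals)              # standardized residual
    st_r = err / math.sqrt(MSresiduals * (1 - hii))  # studentized residual
    press_r = err / (1 - hii)                        # PRESS residual
    print(err, ' ', sd_r, ' ', st_r, ' ', press_r)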

But I do not know the value of hii. hii is a diagonal element of the hat matrix, but I could not figure out how to build the hat matrix and get hii from it. Could someone please help me find hii for the given data, so that I can calculate the studentized and PRESS residuals with the formulas in the code above? Here st_r is the studentized residual and press_r is the PRESS residual. I do not want to use any Python library. Thanks in advance.

1 Answer

There is code for pure-Python matrix multiplication in the question "Matrix Multiplication in python?".

The transpose can be computed with the zip function, as shown in "Matrix Transpose in Python".

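The hat matrix can then be calculated with the functions from the references above as H = X (X^T X)^-1 X^T, where X is the design matrix with a leading column of ones for the intercept. The leverages hii are the diagonal elements of H.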
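As a rough illustration of the formula above, here is a minimal pure-Python sketch that builds the design matrix from the question's data and extracts the diagonal of H. The helper names (transpose, matmul, invert) are made up for this example, and the Gauss-Jordan inversion is just one possible way to invert X^T X without a library; it assumes the model includes an intercept.

```python
# Sketch: leverages h_ii = diag(X (X^T X)^-1 X^T) without any library.
# transpose, matmul and invert are illustrative helper names.

def transpose(m):
    # zip(*m) regroups the rows of m into columns
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols]
            for row in a]

def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    # Augment m with the identity matrix: [m | I]
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot row
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * pv for v, pv in zip(aug[r], aug[col])]
    # The right half of the augmented matrix is now the inverse
    return [row[n:] for row in aug]

X1 = [14, 19, 12, 11]
X2 = [25, 32, 22, 15]

# Design matrix with a leading 1 in each row for the intercept (assumption)
X = [[1, x1, x2] for x1, x2 in zip(X1, X2)]
Xt = transpose(X)
H = matmul(matmul(X, invert(matmul(Xt, X))), Xt)
leverages = [H[i][i] for i in range(len(X))]
print([round(h, 6) for h in leverages])
```

A quick sanity check is that the leverages sum to the number of fitted parameters, which is 3 here (intercept, X1, X2).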

I got these values for the leverages (hii): 0.387681, 0.951288, 0.661433, 0.999597. The resulting PRESS is 1442464.
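With the leverages in hand, the loop from the question can be completed. The sketch below reuses the question's lin_model and MSresiduals and hard-codes the four leverages quoted above; the names rows and press are introduced only for this example. Note that the fourth leverage is very close to 1, so 1 - hii is tiny and any rounding in the fitted coefficients is strongly amplified. The PRESS computed from the rounded coefficients may therefore not match the value quoted above exactly.

```python
import math

def lin_model(x1, x2):
    # Fitted model from the question (rounded coefficients)
    return 80.93 - 5.84 * x1 + 11.32 * x2

MSresiduals = 574.9
X1 = [14, 19, 12, 11]
X2 = [25, 32, 22, 15]
Y = [301, 327, 246, 187]
leverages = [0.387681, 0.951288, 0.661433, 0.999597]  # hii values quoted above

rows = []
print('Residual Studentized_Residual PRESS_Residual')
for x1, x2, y, hii in zip(X1, X2, Y, leverages):
    err = y - lin_model(x1, x2)                      # ordinary residual
    st_r = err / math.sqrt(MSresiduals * (1 - hii))  # studentized residual
    press_r = err / (1 - hii)                        # PRESS residual
    rows.append((err, st_r, press_r))
    print(err, ' ', st_r, ' ', press_r)

# PRESS statistic: sum of squared PRESS residuals
press = sum(press_r ** 2 for _, _, press_r in rows)
print('PRESS =', press)
```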

Note that the model R-squared is good, but the predicted R-squared, [1 - (PRESS / total sum of squares)] * 100, is 0. Also, X1 and X2 are not statistically significant according to their p-values. This is a very limited data set.
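The predicted R-squared formula can be checked directly from the Y values and the PRESS quoted above. In this sketch the raw value comes out negative; clamping it to 0 is an assumption about how the figure was reported (a common convention in statistics packages), not something stated in the answer.

```python
Y = [301, 327, 246, 187]
PRESS = 1442464  # value quoted in the answer

mean_y = sum(Y) / len(Y)
# Total sum of squares around the mean of Y
sst = sum((y - mean_y) ** 2 for y in Y)

raw_pred_r2 = (1 - PRESS / sst) * 100
# Assumption: a negative predicted R-squared is reported as 0
pred_r2 = max(0.0, raw_pred_r2)

print('SST =', sst)
print('raw predicted R-squared (%) =', raw_pred_r2)
print('reported predicted R-squared (%) =', pred_r2)
```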