I'm trying to optimize a multivariate linear regression model lmMod=lm(depend_var~var1+var2+var3+var4....,data=df)
and I'm presently working on the premises of the model: the constant variance of residuals and the absence of auto-correlation. For this I'm using:
Breusch-Pagan test for homo/heteroscedasticity:
lmtest::bptest(lmMod)
Durbin Watson test for auto-correlation:
durbinWatsonTest(lmMod)
I found examples which are testing either one independent variable at a time:
example for Breush-Pagan test – one independent variable: https://datascienceplus.com/how-to-detect-heteroscedasticity-and-rectify-it/
example for Durbin Watson test - one independent variable: http://math.furman.edu/~dcs/courses/math47/R/library/lmtest/html/dwtest.html
or the whole model with several independent variables at a time:
example for Durbin Watson test – multiple independent variable: https://www.rdocumentation.org/packages/car/versions/2.1-6/topics/durbinWatsonTest
Here are the questions:
- Can
durbinWatsonTest()
andbptest()
be fed with a whole multivariate model - If answer to 1 is yes, how is it then possible to determine which variable is causing heteroscedasticity or auto-correlation in the model in order to fix it as each of those tests give only one p-value for the entire multivariate model?
- If answer to 1 is no, the test should be then performed with one dependent variable at a time. But in the case of homoscedasticity, it can only be tested AFTER a particular regression has been modelled. Hence a pattern of homo/heteroscedasticity in an univariate regression model
lmMod_1=lm(depend_var~var1, data=df)
will be different from the pattern of a multivariate regression modellmMod_2=lm(depend_var~var1+var2+var3+var4....,data=df)
Thank very much in advance for your help!