
I'm trying to optimize a multiple linear regression model (one response, several predictors) lmMod=lm(depend_var~var1+var2+var3+var4....,data=df) and I'm currently checking the model's assumptions: constant variance of the residuals and absence of autocorrelation. For this I'm using:

  • Breusch-Pagan test for homo/heteroscedasticity: lmtest::bptest(lmMod) 

  • Durbin-Watson test for autocorrelation: car::durbinWatsonTest(lmMod)
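For concreteness, here is a minimal sketch of both calls on a model with several predictors. The built-in mtcars data and the chosen predictors are only stand-ins for df and var1..var4 (an assumption for illustration). The row order of mtcars has no time meaning, so the Durbin-Watson output is illustrative only.

```r
# Sketch: run both diagnostics on a model with several predictors.
# mtcars and its columns are hypothetical stand-ins for df and var1..var4.
library(lmtest)  # bptest()
library(car)     # durbinWatsonTest()

lmMod <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)

# Breusch-Pagan: by default the squared residuals are regressed on all
# predictors of the model at once, so one statistic and one p-value
# cover the whole model.
bp <- bptest(lmMod)
print(bp)

# Durbin-Watson: computed from the residual series of the fitted model.
# The p-value is bootstrapped, so fix the seed for reproducibility.
set.seed(1)
dw <- durbinWatsonTest(lmMod)
print(dw)
```

Both functions take the fitted lm object directly, whatever the number of predictors on the right-hand side.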

I found examples that test either one independent variable at a time:

example for Breusch-Pagan test – one independent variable: https://datascienceplus.com/how-to-detect-heteroscedasticity-and-rectify-it/

example for Durbin-Watson test – one independent variable: http://math.furman.edu/~dcs/courses/math47/R/library/lmtest/html/dwtest.html

or the whole model with several independent variables at a time:

example for Durbin-Watson test – multiple independent variables: https://www.rdocumentation.org/packages/car/versions/2.1-6/topics/durbinWatsonTest

Here are the questions:

  1. Can durbinWatsonTest() and bptest() be fed a whole model with several independent variables?
  2. If the answer to 1 is yes, how can I determine which variable is causing the heteroscedasticity or autocorrelation in order to fix it, given that each of those tests returns only one p-value for the entire model?
  3. If the answer to 1 is no, the tests would have to be performed with one independent variable at a time. But homoscedasticity can only be tested AFTER a particular regression has been fitted. Hence the pattern of homo/heteroscedasticity in a single-predictor model lmMod_1=lm(depend_var~var1, data=df) will differ from the pattern in a multi-predictor model lmMod_2=lm(depend_var~var1+var2+var3+var4....,data=df).

Thank you very much in advance for your help!

I think this question belongs on Cross Validated. – missuse

1 Answer


Let me try to offer some initial help.

The answer to the first question: yes, you can use the Breusch-Pagan test and the Durbin-Watson test on models with several independent variables. (That said, I have always used dwtest() from lmtest rather than durbinWatsonTest().)
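To illustrate why the number of predictors does not matter for the Durbin-Watson test, here is a small sketch that recomputes the statistic by hand from the residuals of the full model and compares it with what dwtest() reports. The mtcars data and the predictor names are stand-ins chosen for illustration, not the data from the question.

```r
# Sketch: the Durbin-Watson statistic depends only on the residuals of the
# fitted model, so it is defined for any number of predictors.
# mtcars and its columns are hypothetical stand-ins for df and var1..var4.
library(lmtest)  # dwtest()

lmMod <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)

# DW = sum of squared successive residual differences / residual sum of squares
e <- residuals(lmMod)
dw_manual <- sum(diff(e)^2) / sum(e^2)

# Same statistic as reported by dwtest() on the whole model
dw <- dwtest(lmMod)
print(dw)
print(all.equal(unname(dw$statistic), dw_manual))
```

The same logic applies to bptest(): it works on the squared residuals of whatever model was fitted.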

Also note that dwtest() only checks for first-order autocorrelation. Unfortunately, I do not know how to find out which variable is causing the heteroscedasticity or autocorrelation. However, if you run into these problems, one possible solution is to keep the OLS coefficients but use robust standard errors: Newey-West standard errors for autocorrelation, via coeftest(lmMod, vcov = NeweyWest), or heteroscedasticity-consistent standard errors, via coeftest(lmMod, vcov = vcovHC). coeftest() comes from the lmtest package and NeweyWest() and vcovHC() from the sandwich package; loading the AER package attaches both.
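A minimal sketch of the robust-standard-error approach, again using mtcars as a stand-in for the real data (an assumption for illustration). It loads lmtest and sandwich directly instead of going through AER, and spells the argument out as vcov., which is its full name in coeftest().

```r
# Sketch: same OLS coefficients, different standard errors.
# mtcars and its columns are hypothetical stand-ins for df and var1..var4.
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC(), NeweyWest()

lmMod <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)

# Classical standard errors (assume homoscedastic, uncorrelated errors)
ct_classical <- coeftest(lmMod)

# Heteroscedasticity-consistent standard errors (vcovHC defaults to type HC3)
ct_hc <- coeftest(lmMod, vcov. = vcovHC)

# Newey-West standard errors (robust to heteroscedasticity and autocorrelation)
ct_nw <- coeftest(lmMod, vcov. = NeweyWest)

print(ct_classical)
print(ct_hc)
print(ct_nw)
```

The estimates are identical in all three tables; only the standard errors, t values and p-values change, so this treats the symptom for inference rather than identifying the variable responsible.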