I wondered why pdwtest() outputs very different p-values compared to both lmtest's and car's Durbin-Watson tests (dwtest() and dwt(), respectively). The differences are documented below. After that, I provide code I took from plm's source for pdwtest() and modified to try to fix the problem. Could someone have a look at that? The p-values still do not match exactly, but they are very close. I suspect that is due to numeric precision. Also, I am not entirely sure about the p-value for the random effects model, but that is a statistical question, not a programming question (should the intercept be left in for the test?).

EDIT 2019-01-04: the generalized Durbin-Watson statistic of Bhargava et al. (1982) and Baltagi/Wu's LBI statistic are now implemented in the latest version (1.7-0) of plm as pbnftest().
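
For readers on plm 1.7-0 or later, here is a minimal sketch of calling pbnftest() on the fixed effects model used in this question. The argument value test = "lbi" for the Baltagi/Wu statistic is how I understand the plm documentation, so please check ?pbnftest for your installed version.

```r
library(plm)

data("Grunfeld", package = "plm")

# Fixed effects (within) model, same specification as in the question
fe <- plm(inv ~ value + capital, data = Grunfeld, model = "within")

# Generalized Durbin-Watson statistic of Bhargava et al. (default test)
bnf <- pbnftest(fe)

# Baltagi/Wu LBI statistic
lbi <- pbnftest(fe, test = "lbi")

print(bnf)
print(lbi)
```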

I think we have two distinct things going on here:

1) p-value: the p-value seems to be off because an additional intercept is passed to lmtest::dwtest(). My guess is that this in turn leads to a wrong calculation of the degrees of freedom and hence the suspicious p-value.

See the papers mentioned below and http://www.stata.com/manuals14/xtxtregar.pdf

Bhargava, A., L. Franzini, and W. Narendranathan. 1982. Serial correlation and the fixed effects model. Review of Economic Studies 49, pp. 533–549.

Baltagi, B. H., and P. X. Wu. 1999. Unequally spaced panel data regressions with AR(1) disturbances. Econometric Theory 15, pp. 814–823.

Versions: R 3.1.3, plm_1.4-0, lmtest_0.9-34

library(plm)     # panel models and pdwtest()
library(lmtest)  # dwtest()
library(car)     # dwt()

data("Grunfeld")

# Use lm() for pooled OLS and fixed effects
lm_pool <- lm(inv ~ value + capital, data = Grunfeld)
lm_fe   <- lm(inv ~ value + capital + factor(firm), data = Grunfeld)

# Use plm() for pooled OLS and fixed effects
plm_pool <- plm(inv ~ value + capital, data=Grunfeld, model = "pooling")
plm_fe   <- plm(inv ~ value + capital, data=Grunfeld, model = "within")
plm_re   <- plm(inv ~ value + capital, data=Grunfeld, model = "random")

# Are the estimated residuals for the pooled OLS and fixed effects model by plm() and lm() the same? => yes
all(abs(residuals(plm_pool) - residuals(lm_pool)) < 1e-11)
## [1] TRUE
all(abs(residuals(plm_fe)   - residuals(lm_fe))   < 1e-11)
## [1] TRUE

# Results of lmtest's and car's Durbin-Watson tests match
lmtest::dwtest(lm_pool)
##  Durbin-Watson test
## 
## data:  lm_pool
## DW = 0.3582, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0

car::dwt(lm_pool)
##  lag Autocorrelation D-W Statistic p-value
##    1       0.8204959     0.3581853       0
##  Alternative hypothesis: rho != 0

lmtest::dwtest(lm_fe)
##  Durbin-Watson test
## 
## data:  lm_fe
## DW = 1.0789, p-value = 1.561e-13
## alternative hypothesis: true autocorrelation is greater than 0

car::dwt(lm_fe)
##  lag Autocorrelation D-W Statistic p-value
##    1       0.4583415      1.078912       0
##  Alternative hypothesis: rho != 0

# plm's DW statistic matches, but the p-value is very different (plm_pool) or slightly different (plm_fe)
pdwtest(plm_pool)
##  Durbin-Watson test for serial correlation in panel models
## 
## data:  inv ~ value + capital
## DW = 0.3582, p-value = 0.7619
## alternative hypothesis: serial correlation in idiosyncratic errors

pdwtest(plm_fe)
##  Durbin-Watson test for serial correlation in panel models
## 
## data:  inv ~ value + capital
## DW = 1.0789, p-value = 3.184e-11
## alternative hypothesis: serial correlation in idiosyncratic errors
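
The intercept hypothesis from point 1) can be probed without touching plm at all. This is a minimal sketch, assuming (not verified against plm's source here) that the design matrix handed to dwtest() already contains an intercept column, so that the formula adds a second one. The names y, X, dw_dup, dw_fix and dw_ref are made up for this illustration.

```r
library(plm)     # only used here for the Grunfeld data
library(lmtest)

data("Grunfeld", package = "plm")

lm_pool <- lm(inv ~ value + capital, data = Grunfeld)

y <- Grunfeld$inv
X <- model.matrix(lm_pool)  # already contains an "(Intercept)" column

# y ~ X lets the formula add a second intercept on top of the one in X.
# Assumed reconstruction of the problem: depending on the lmtest version
# this may warn, fail, or return an odd p-value, hence the try().
dw_dup <- try(lmtest::dwtest(y ~ X), silent = TRUE)

# y ~ X - 1 drops the formula intercept, so the design matrix is
# numerically the same as the one used by lm_pool.
dw_fix <- lmtest::dwtest(y ~ X - 1)

# Reference: the test run directly on the lm object
dw_ref <- lmtest::dwtest(lm_pool)

print(dw_dup)
print(dw_fix)
print(dw_ref)
```

If the hypothesis holds, dw_fix should agree with dw_ref in both statistic and p-value, while dw_dup should share the statistic but not the p-value.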

1 Answer

'plm' developer here. The strange p-values are worth investigating (note that pdwtest() is just a wrapper around dwtest() from package lmtest); thanks for reporting.

On the econometrics behind this: the Bhargava et al. test is basically what pdwtest() does; the Durbin-Watson test in general is a suboptimal procedure in many respects, so most modern textbooks suggest Breusch-Godfrey instead (see pbgtest() in 'plm' for a panel version). RE is fine because the transformed residuals are "white" under H0. FE results are to be taken with care because of the serial correlation induced by the demeaning, so DW and BG tests are generally inappropriate except for very long panels: see my comments in JStatSoft 27/2, 2008, p. 26. Better alternatives, especially for wide panels, are the tests by Wooldridge: the first-difference test (pwfdtest() in 'plm', also in Stata, see the paper by Drukker) and the test implemented in 'plm' as pwartest(), see again JStatSoft 27/2, 6.4.
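
A minimal sketch of running these alternatives on the Grunfeld specification from the question. The choice order = 1 for pbgtest() is only illustrative, and the formula interface is used for the two Wooldridge tests on the assumption that it is accepted by your plm version; check the help pages if in doubt.

```r
library(plm)

data("Grunfeld", package = "plm")

form <- inv ~ value + capital
fe   <- plm(form, data = Grunfeld, model = "within")

# Panel Breusch-Godfrey test on the fixed effects model
# (order = 1 is an illustrative choice)
bg <- pbgtest(fe, order = 1)

# Wooldridge's first-difference test for serial correlation
wfd <- pwfdtest(form, data = Grunfeld)

# Wooldridge's test for AR(1) errors in fixed effects models
war <- pwartest(form, data = Grunfeld)

print(bg)
print(wfd)
print(war)
```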