Why is lm_robust() HC3 standard error smaller than coeftest() HC0 standard error?

Question

I am using lm_robust of package 'estimatr' for a fixed effect model including HC3 robust standard errors. I had to switch from vcovHC(), because my data sample was just to large to be handled by it.

using following line for the regression:

lm_robust(log(SPREAD) ~ PERIOD, data = dat, fixed_effects = ~ STOCKS + TIME, se_type = "HC3")

The code runs fine, and the coefficients are the same as using fixed effects from package plm. Since I can not use coeftest to estimate HC3 standard errors with the plm output due to a too large data sample, I compared the HC3 estimator of lm_robustwith the HC1 of coeftest(model, vcov= vcovHC(model, type = HC1)) As result the HC3 standarderror of lm_robust is much smaller than HC1 from coeftest.

Does somebody has an explanation, since HC3 should be more restrictive than HC1. I appreciate any recommendations and solutions.

EDIT model used for coeftest:

plm(log(SPREAD) ~ PERIOD, data = dat, index = c("STOCKS", "TIME"), effect = "twoway", method = "within")

could you share your data? also, be nice to know the size of STOCKS, TIME, and the number of observations — paqmo
Yes for sure; data can be downloaded at github.com/dpendi/data size of STOCKS is 288 and TIME has 312 days. therefore 312 * 288 observations. please see my EDIT above for both code lines used in comparison. Since I am interested in using HC3 estimator but are not able due to computational power needed in coeftest. — dpendi
i get 0.005444 for type_se = "HC1 and 0.005463 for type_se = "HC3 using estimatr_0.18.0. Looks ok to me. — paqmo
Yes, using HC1 and HC3 with the same function (lm_robust) it results in a reasonable order. But I compare the HC3 of lm_robust with the HC1 of coeftest() applied for the plm model — dpendi

paqmo paqmo · Accepted Answer · 2020-04-16T15:45:05

It appears that the vcovHC() method for plm automatically estimates cluster-robust standard errors, while for lm_robust(), it does not. Therefore, the HC1 estimation of the standard error for plm will appear inflated compared to lm_robust (of lm for that matter).

Using some toy data:

library(sandwich)
library(tidyverse)
library(plm)
library(estimatr)
library(lmtest)

set.seed(1981)
x <- sin(1:1000)
y <- 1 + x + rnorm(1000)
f <- as.character(sort(rep(sample(1:100), 10)))
t <- as.character(rep(sort(sample(1:10)), 100))

dat <- tibble(y = y, x = x, f = f, t = t)

lm_fit <- lm(y ~ x + f + t, data = dat)
plm_fit <- plm(y ~ x, index = c("f", "t"), model = "within", effect = "twoways", data = dat)
rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, data = dat, se_type = "HC1", return_vcov = TRUE)

sqrt(vcovHC(lm_fit, type = "HC1")[2, 2])
#> [1] 0.04752337
sqrt(vcovHC(plm_fit, type = "HC1"))
#>            x
#> x 0.05036414
#> attr(,"cluster")
#> [1] "group"
sqrt(rb_fit$vcov)
#>            x
#> x 0.04752337

rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, data = dat, se_type = "HC3", return_vcov = TRUE)
sqrt(vcovHC(lm_fit, type = "HC3")[2, 2])
#> [1] 0.05041177
sqrt(vcovHC(plm_fit, type = "HC3"))
#>            x
#> x 0.05042142
#> attr(,"cluster")
#> [1] "group"
sqrt(rb_fit$vcov)
#>            x
#> x 0.05041177

There does not appear to be equivalent cluster-robust standard error types in the two packages. However, the SEs get closer when specifying cluster-robust SEs in lm_robust():

rb_fit <- lm_robust(y ~ x, fixed_effects = ~ f + t, clusters = f, data = dat, se_type = "CR0")
summary(rb_fit)
#> 
#> Call:
#> lm_robust(formula = y ~ x, data = dat, clusters = f, fixed_effects = ~f + 
#>     t, se_type = "CR0")
#> 
#> Standard error type:  CR0 
#> 
#> Coefficients:
#>   Estimate Std. Error t value  Pr(>|t|) CI Lower CI Upper DF
#> x    0.925    0.05034   18.38 1.133e-33   0.8251    1.025 99
#> 
#> Multiple R-squared:  0.3664 ,    Adjusted R-squared:  0.2888
#> Multiple R-squared (proj. model):  0.3101 ,  Adjusted R-squared (proj. model):  0.2256 
#> F-statistic (proj. model): 337.7 on 1 and 99 DF,  p-value: < 2.2e-16
coeftest(plm_fit, vcov. = vcovHC(plm_fit, type = "HC1"))
#> 
#> t test of coefficients:
#> 
#>   Estimate Std. Error t value  Pr(>|t|)    
#> x 0.925009   0.050364  18.366 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

^{Created on 2020-04-16 by the reprex package (v0.3.0)}

Why is lm_robust() HC3 standard error smaller than coeftest() HC0 standard error?

1 Answers