I am replicating a negative binomial regression model in R. When calculating robust standard errors, the output does not match Stata output of standard errors.
The original Stata code is
nbreg displaced eei lcostofwar cfughh roadskm lpopdensity ltkilled, robust nolog
I have attempted both manual calculation and vcovHC
from sandwich
. However, neither produces the same results.
My regression model is as follows:
mod1 <- glm.nb(displaced ~ eei + costofwar_log + cfughh + roadskm + popdensity_log + tkilled_log, data = mod1_df)
With vcovHC
I have tried every option from HC0
to HC5
.
Attempt 1:
cov_m1 <- vcovHC(mod1, type = "HC0", sandwich = T)
se <- sqrt(diag(cov_m1))
Attempt 2:
mod1_rob <- coeftest(mod1, vcovHC = vcov(mod1, type = "HC0"))
The most successful has been HC0
and vcov = sandwich
but no SEs are correct.
Any suggestions?
EDIT
My output is as follows (using HC0
):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 1.3281183 1.5441312 0.8601 0.389730
eei -0.0435529 0.0183359 -2.3753 0.017536 *
costofwar_log 0.2984376 0.1350518 2.2098 0.027119 *
cfughh -0.0380690 0.0130254 -2.9227 0.003470 **
roadskm 0.0020812 0.0010864 1.9156 0.055421 .
popdensity_log -0.4661079 0.1748682 -2.6655 0.007688 **
tkilled_log 1.0949084 0.2159161 5.0710 3.958e-07 ***
The Stata output I am attempting to replicate is:
Estimate Std. Error
(Intercept) 1.328 1.272
eei -0.044 0.015
costofwar_log 0.298 0.123
cfughh -0.038 0.018
roadskm 0.002 0.0001
popdensity_log -0.466 0.208
tkilled_log 1.095 0.209
The dataset is found here and the recoded variables are:
mod1_df <- table %>%
select(displaced, eei_01, costofwar, cfughh, roadskm, popdensity,
tkilled)
mod1_df$popdensity_log <- log(mod1_df$popdensity + 1)
mod1_df$tkilled_log <- log(mod1_df$tkilled + 1)
mod1_df$costofwar_log <- log(mod1_df$costofwar + 1)
mod1_df$eei <- mod1_df$eei_01*100