Difference in Standard Errors Between Python’s linearmodels.PanelOLS and Stata’s xtreg, fe when Using Robust Standard Errors

Question

I reproduced an example from the linearmodels PanelOLS introduction and added robust standard errors to learn how to use the module. Here is the code I used:

from linearmodels.datasets import jobtraining
import statsmodels.api as sm2
data = jobtraining.load()
mi_data = data.set_index(['fcode', 'year'])
mi_data.head()
from linearmodels import PanelOLS
mod = PanelOLS(mi_data.lscrap, sm2.add_constant(mi_data.hrsemp), entity_effects=True)
print(mod.fit(cov_type='robust'))

                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                 lscrap   R-squared:                        0.0528
Estimator:                   PanelOLS   R-squared (Between):             -0.0029
No. Observations:                 140   R-squared (Within):               0.0528
Date:                Tue, May 05 2020   R-squared (Overall):              0.0048
Time:                        10:49:58   Log-likelihood                   -90.459
Cov. Estimator:                Robust                                           
                                        F-statistic:                      5.0751
Entities:                          48   P-value                           0.0267
Avg Obs:                       2.9167   Distribution:                    F(1,91)
Min Obs:                       1.0000                                           
Max Obs:                       3.0000   F-statistic (robust):             8.2299
                                        P-value                           0.0051
Time periods:                       3   Distribution:                    F(1,91)
Avg Obs:                       46.667                                           
Min Obs:                       46.000                                           
Max Obs:                       48.000                                           

                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          0.4982     0.0555     8.9714     0.0000      0.3879      0.6085
hrsemp        -0.0054     0.0019    -2.8688     0.0051     -0.0092     -0.0017
==============================================================================

F-test for Poolability: 17.094
P-value: 0.0000
Distribution: F(47,91)

Included effects: Entity

When I compared the results with the way I usually run a fixed effects regression with robust standard errors in Stata, I saw that the standard errors are very different.

xtset fcode year
xtreg lscrap hrsemp  , fe vce(robust)
Fixed-effects (within) regression               Number of obs      =       140
Group variable: fcode                           Number of groups   =        48

R-sq:  within  = 0.0528                         Obs per group: min =         1
       between = 0.0002                                        avg =       2.9
       overall = 0.0055                                        max =         3

                                                F(1,47)            =      7.93
corr(u_i, Xb)  = -0.0266                        Prob > F           =    0.0071

                                 (Std. Err. adjusted for 48 clusters in fcode)
------------------------------------------------------------------------------
             |               Robust
      lscrap |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      hrsemp |  -.0054186   .0019243    -2.82   0.007    -.0092897   -.0015474
       _cons |   .4981764   .0295415    16.86   0.000     .4387464    .5576063
-------------+----------------------------------------------------------------
     sigma_u |  1.4004191
     sigma_e |  .57268937
         rho |  .85672692   (fraction of variance due to u_i)
------------------------------------------------------------------------------

I don’t understand where the difference is coming from, as the results are (almost) identical without the robust SE. How can I get the same robust SE as in Stata using Python’s linearmodels.PanelOLS?

TiTo TiTo · Accepted Answer · 2020-05-05T16:54:19

White's robust covariance estimator, which Python uses with the cov_type='robust' option, is not robust for fixed effects models. Stata's xtreg, fe vce(robust) clusters on the panel variable (note the "Std. Err. adjusted for 48 clusters in fcode" line in its output), so you should use cov_type='clustered', cluster_entity=True instead. Here is the corresponding manual entry from linearmodels.

Full code:

from linearmodels.datasets import jobtraining
import statsmodels.api as sm2
data = jobtraining.load()
mi_data = data.set_index(['fcode', 'year'])
mi_data.head()
from linearmodels import PanelOLS
mod = PanelOLS(mi_data.lscrap, sm2.add_constant(mi_data.hrsemp), entity_effects=True)
print(mod.fit(cov_type='clustered', cluster_entity=True))

and the corresponding output is close to the Stata one:

                          PanelOLS Estimation Summary                           
================================================================================
Dep. Variable:                 lscrap   R-squared:                        0.0528
Estimator:                   PanelOLS   R-squared (Between):             -0.0029
No. Observations:                 140   R-squared (Within):               0.0528
Date:                Tue, May 05 2020   R-squared (Overall):              0.0048
Time:                        18:53:06   Log-likelihood                   -90.459
Cov. Estimator:                Robust                                           
                                        F-statistic:                      5.0751
Entities:                          48   P-value                           0.0267
Avg Obs:                       2.9167   Distribution:                    F(1,91)
Min Obs:                       1.0000                                           
Max Obs:                       3.0000   F-statistic (robust):             8.2299
                                        P-value                           0.0051
Time periods:                       3   Distribution:                    F(1,91)
Avg Obs:                       46.667                                           
Min Obs:                       46.000                                           
Max Obs:                       48.000                                           

                             Parameter Estimates                              
==============================================================================
            Parameter  Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const          0.4982     0.0555     8.9714     0.0000      0.3879      0.6085
hrsemp        -0.0054     0.0019    -2.8688     0.0051     -0.0092     -0.0017
==============================================================================

F-test for Poolability: 17.094
P-value: 0.0000
Distribution: F(47,91)

Included effects: Entity
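
As a rough illustration of why the two covariance options can give different standard errors, here is a minimal pure-Python sketch for a fixed effects model with a single regressor. It is not how linearmodels or Stata compute things internally: the function name `fe_slope_with_ses` and the toy data are made up for illustration, and no small-sample (degrees-of-freedom) corrections are applied, so the numbers will not match either package exactly. The White version squares each observation's score separately, while the clustered version first sums the scores within each entity, which allows the errors of one firm to be correlated over time.

```python
from collections import defaultdict
from math import sqrt


def fe_slope_with_ses(entity, x, y):
    """Within (fixed effects) slope for y = a_i + beta * x + e.

    Returns (beta, white_se, cluster_se). Illustrative only:
    one regressor, no degrees-of-freedom corrections.
    """
    # Collect the row positions belonging to each entity.
    groups = defaultdict(list)
    for pos, ent in enumerate(entity):
        groups[ent].append(pos)

    # Within transformation: subtract entity means from x and y.
    xd = [0.0] * len(x)
    yd = [0.0] * len(y)
    for rows in groups.values():
        mean_x = sum(x[i] for i in rows) / len(rows)
        mean_y = sum(y[i] for i in rows) / len(rows)
        for i in rows:
            xd[i] = x[i] - mean_x
            yd[i] = y[i] - mean_y

    sxx = sum(v * v for v in xd)
    beta = sum(a * b for a, b in zip(xd, yd)) / sxx

    # Score of each observation: demeaned x times residual.
    scores = [a * (b - beta * a) for a, b in zip(xd, yd)]

    # White (heteroskedasticity-robust): square each score on its own.
    white_meat = sum(s * s for s in scores)
    # Clustered by entity: sum the scores within an entity, then square.
    cluster_meat = sum(sum(scores[i] for i in rows) ** 2
                       for rows in groups.values())

    return beta, sqrt(white_meat) / sxx, sqrt(cluster_meat) / sxx


if __name__ == "__main__":
    # Toy panel: two entities, three periods each (made-up numbers).
    entity = ["A", "A", "A", "B", "B", "B"]
    x = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
    y = [0.0, 1.0, 5.0, 0.0, 1.0, -1.0]
    beta, white_se, cluster_se = fe_slope_with_ses(entity, x, y)
    print(f"beta={beta:.4f} white_se={white_se:.4f} cluster_se={cluster_se:.4f}")
```

On this toy data the clustered standard error comes out larger than the White one, because the scores within each entity share the same sign and therefore add up instead of averaging out.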
