Random Effects Esimation on cluter sample using plm on a unbalanced panel

Question

I normally work within a generalized least squares framework estimating, what Wooldridge's Introductory (2013) calls, Random Effects and Fixed Effects models on longitudinal data indexed by an individual and a time dimension.

I've been using the Feasible GLS estimation in plm(), from the plm package, to estimate the Random Effects Model – what some stats literature term the Mixed Model. The plm() function takes an index argument where I indicate the individual and time indexes. However, I’m now faced with some data where each individual has several measures at each time-point, i.e. what a group-wise structure.

I’ve found out that it’s possible to set up such a model using lmer() from the lme4 package, however I am a bit confused by the differences in jargon, and also the likelihood framework, and I wanted to know if specified the model correctly. I fear I could overlook at more substantial as I am not familiar with the framework and this terminology.

I can replicate my usual plm() model using lmer(), but I am unsure as to how I could add the grouping. I’ve tried to illustrate my question in the following.

I found some data that looks somewhat like my data to illustrate my situation. First some packages that are needed,

install.packages(c("mlmRev", "plm", "lme4", "stargazer"), dependencies = TRUE)

and then the data

data(egsingle, package = "mlmRev")

egsingle is a unbalanced panel consisting of 1721 school children, grouped in 60 schools, across five time points. These data are originally distributed with the HLM software package (Bryk, Raudenbush and Congdon, 1996), but can be found the mlmrev package, for details see ? mlmRev::egsingle

Some light data management

dta <- egsingle
dta$Female <- with(dta, ifelse(female == 'Female', 1, 0))

Here’s a snippet for the data

dta[118:127,c('schoolid','childid','math','year','size','Female')]
#>     schoolid   childid   math year size Female
#> 118     2040 289970511 -1.830 -1.5  502      1
#> 119     2040 289970511 -1.185 -0.5  502      1
#> 120     2040 289970511  0.852  0.5  502      1
#> 121     2040 289970511  0.573  1.5  502      1
#> 122     2040 289970511  1.736  2.5  502      1
#> 123     2040 292772811 -3.144 -1.5  502      0
#> 124     2040 292772811 -2.097 -0.5  502      0
#> 125     2040 292772811 -0.316  0.5  502      0
#> 126     2040 293550291 -2.097 -1.5  502      0
#> 127     2040 293550291 -1.314 -0.5  502      0

Here’s how I would set a random effects model without the schoolid using plm(),

library(plm)
reg.re.plm <- plm(math~Female+size+year, dta, index = c("childid", "year"), model="random")
# summary(reg.re.plm)

I can reproduce these results lme4 like this

require(lme4)
dta$year <- as.factor(dta$year)
reg.re.lmer <- lmer(math~Female+size+year+(1|childid), dta)
# summary(reg.re.lmer)

Now, from reading chapter 2 in Bates (2010) “lme4: Mixed-effects modeling with R” I believe I’ve this is how I would specific the model including the cluster level, schoolid,

reg.re.lmer.in.school <- lmer(math~Female+size+year+(1|childid)+(1|schoolid), dta)
# summary(reg.re.lmer.in.school)

However, when I look at the results I am not too convinced I’ve actually specified it correctly (see below).

In my actual data the repeated measures are within individuals, but I take that I can use this data as example. I would appreciate any advice on how to proceed. Maybe a reference to a worked example with notation/terminology not too far from what is used in Wooldridge (2013). And, how do I work backwards and write up the specification for the reg.re.lmer.in.school model?

# library(stargazer)
stargazer::stargazer(reg.re.plm, reg.re.lmer, reg.re.lmer.in.school, type="text")
#> =====================================================================
#>                                    Dependent variable:               
#>                     -------------------------------------------------
#>                                           math                       
#>                                panel                   linear        
#>                               linear                mixed-effects    
#>                                 (1)                (2)        (3)    
#> ---------------------------------------------------------------------
#> Female                        -0.025              -0.025     0.008   
#>                               (0.046)            (0.047)    (0.042)  
#>                                                                      
#> size                        -0.0004***          -0.0004***  -0.0003  
#>                              (0.0001)            (0.0001)   (0.0002) 
#>                                                                      
#> year-1.5                     0.878***            0.876***   0.866*** 
#>                               (0.059)            (0.059)    (0.059)  
#>                                                                      
#> year-0.5                     1.882***            1.880***   1.870*** 
#>                               (0.059)            (0.058)    (0.058)  
#>                                                                      
#> year0.5                      2.575***            2.574***   2.562*** 
#>                               (0.059)            (0.059)    (0.059)  
#>                                                                      
#> year1.5                      3.149***            3.147***   3.133*** 
#>                               (0.060)            (0.059)    (0.059)  
#>                                                                      
#> year2.5                      3.956***            3.954***   3.939*** 
#>                               (0.060)            (0.060)    (0.060)  
#>                                                                      
#> Constant                     -2.671***          -2.669***  -2.693*** 
#>                               (0.085)            (0.086)    (0.152)  
#>                                                                      
#> ---------------------------------------------------------------------
#> Observations                   7,230              7,230      7,230   
#> R2                             0.735                                 
#> Adjusted R2                    0.735                                 
#> Log Likelihood                                  -8,417.815 -8,284.357
#> Akaike Inf. Crit.                               16,855.630 16,590.720
#> Bayesian Inf. Crit.                             16,924.490 16,666.460
#> F Statistic         2,865.391*** (df = 7; 7222)                      
#> =====================================================================
#> Note:                                     *p<0.1; **p<0.05; ***p<0.01

Eric Fail Eric Fail · Accepted Answer · 2018-03-16T10:52:34

After having studied Robert Long's great answer on stats.stackexchange I have found the the correct specification of the model is a nested design, i.e. (1| schoolid /childid). However due to the way the data is coded (unniqe childid's within schoolid) the crossed design or specification, i.e. (1|childid)+(1|schoolid) what I used above, yields identical results.

Here is an illustration using the same data as above,

data(egsingle, package = "mlmRev")
dta <- egsingle
dta$Female <- with(dta, ifelse(female == 'Female', 1, 0))

require(lme4)
dta$year <- as.factor(dta$year)

Rerunning the crossed design-model, , reg.re.lmer.in.school, for comparison

reg.re.lmer.in.school <- lmer(math~Female+size+year+(1|childid)+(1|schoolid), dta)

here the nested structure

reg.re.lmer.nested <- lmer(math~Female+size+year+(1| schoolid /childid), dta)

and finally the comparison of the two models using the amazing texreg package,

# install.packages(c("texreg"), dependencies = TRUE)
#  require(texreg)
texreg::screenreg(list(reg.re.lmer.in.school, reg.re.lmer.nested), digits = 3)
#> ===============================================================
#>                                    Model 1        Model 2      
#> ---------------------------------------------------------------
#> (Intercept)                           -2.693 ***     -2.693 ***
#>                                       (0.152)        (0.152)   
#> Female                                 0.008          0.008    
#>                                       (0.042)        (0.042)   
#> size                                  -0.000         -0.000    
#>                                       (0.000)        (0.000)   
#> year-1.5                               0.866 ***      0.866 ***
#>                                       (0.059)        (0.059)   
#> year-0.5                               1.870 ***      1.870 ***
#>                                       (0.058)        (0.058)   
#> year0.5                                2.562 ***      2.562 ***
#>                                       (0.059)        (0.059)   
#> year1.5                                3.133 ***      3.133 ***
#>                                       (0.059)        (0.059)   
#> year2.5                                3.939 ***      3.939 ***
#>                                       (0.060)        (0.060)   
#> ---------------------------------------------------------------
#> AIC                                16590.715      16590.715    
#> BIC                                16666.461      16666.461    
#> Log Likelihood                     -8284.357      -8284.357    
#> Num. obs.                           7230           7230        
#> Num. groups: childid                1721                       
#> Num. groups: schoolid                 60             60        
#> Var: childid (Intercept)               0.672                   
#> Var: schoolid (Intercept)              0.180          0.180    
#> Var: Residual                          0.334          0.334    
#> Num. groups: childid:schoolid                      1721        
#> Var: childid:schoolid (Intercept)                     0.672    
#> ===============================================================
#> *** p < 0.001, ** p < 0.01, * p < 0.05

Random Effects Esimation on cluter sample using plm on a unbalanced panel

1 Answers