Mixed model starting values for lme4

Question

I am trying to fit a mixed model using the lmer function from the lme4 package. However, I do not understand what should be input to the start parameter. My purpose is to use a simple linear regression to use the coefficients estimated there as starting values to the mixed model.

Lets say that my model is the following:

linear_model = lm(y ~ x1 + x2 + x3, data = data)
coef = summary(linear_model)$coefficients[- 1, 1] #I remove the intercept
result = lmer(y ~ x1 + x2 + x3 | x1 + x2 + x3, data = data, start = coef)

This example is an oversimplified version of what I am doing since I won't be able to share my data.

Then I get the following kind of error:

Error during wrapup: incorrect number of theta components (!=105) #105 is the value I get from the real regression I am trying to fit.

I have tried many different solutions, trying to provide a list and name those values theta like I saw suggested on some forums.

Also the Github code test whether the length is appropriate but I cant find to what it refers to:

# Assign the start value to theta
if (is.numeric(start)) {
        theta <- start
}

# Check the length of theta
length(theta)!=length(pred$theta)

However I can't find where pred$theta is defined and so I don't understand where that value 105 is coming from.

Any help ?

It will provide a starting value if you don't. Just leave it out. — G. Grothendieck
I want to provide one in order to speed up the computation. The idea was to run a linear regression and use the coefficients obtained as the starting values. — Yohan Obadia
It already does computations internally to get good starting values particularly if your only mixed terms are of the form 1|x. Note that there is a lme4 vignette on performance. — G. Grothendieck
It is rather of the shape y ~ x | x. How does it decides which starting values to use ? — Yohan Obadia
You may be better off trying to play with the stoppiong criteria rather than the starting values. Again, see the vignette. — G. Grothendieck

Ben Bolker Ben Bolker · Accepted Answer · 2016-09-09T04:10:43

A few points:

lmer doesn't in fact fit any of the fixed-effect coefficients explicitly; these are profiled out so that they are solved for implicitly at each step of the nonlinear estimation process. The estimation involves only a nonlinear search over the variance-covariance parameters. This is detailed (rather technically) in one of the lme4 vignettes (eqs. 30-31, p. 15). Thus providing starting values for the fixed-effect coefficients is impossible, and useless ...
glmer does fit fixed-effects coefficients explicitly as part of the nonlinear optimization (as @G.Grothendieck discusses in comments), if nAGQ>0 ...
it's admittedly rather obscure, but the starting values for the theta parameters (the only ones that are explicitly optimized in lmer fits) are 0 for the off-diagonal elements of the Cholesky factor, 1 for the diagonal elements: this is coded here

   ll$theta[] <- is.finite(ll$lower) # initial values of theta are 0 off-diagonal, 1 on

... where you need to know further that, upstream, the values of the lower vector have been coded so that elements of the theta vector corresponding to diagonal elements have a lower bound of 0, off-diagonal elements have a lower bound of -Inf; this is equivalent to starting with an identity matrix for the scaled variance-covariance matrix (i.e., the variance-covariance matrix of the random-effects parameters divided by the residual variance), or a random-effects variance-covariance matrix of (sigma^2 I).

If you have several random effects and big variance-covariance matrices for each, things can get a little hairy. If you want to recover the starting values that lmer will use by default you can use lFormula() as follows:

library(lme4)
ff <- lFormula(Reaction~Days+(Days|Subject),sleepstudy)
(lwr <- ff$reTrms$lower)
## [1]    0 -Inf    0
ifelse(lwr==0,1,0)  ## starting values
## [1] 1 0 1

For this model, we have a single 2x2 random-effects variance-covariance matrix. The theta parameters correspond to the lower-triangle Cholesky factor of this matrix, in column-wise order, so the first and third elements are diagonal, and the second element is off-diagonal.

The fact that you have 105 theta parameters worries me; fitting such a large random-effects model will be extremely slow and take an enormous amount of data to fit reliably. (If you know your model makes sense and you have enough data you might want to look into faster options, such as using Doug Bates's MixedModels package for Julia or possibly glmmTMB, which might scale better than lme4 for problems with large theta vectors ...)
your model formula, y ~ x1 + x2 + x3 | x1 + x2 + x3, seems very odd. I can't figure out any context in which it would make sense to have the same variables as random-effect terms and grouping variables in the same model!

Mixed model starting values for lme4

1 Answers