Correct syntax for covariate in mixed models, or, asterisk vs. slash in lme/lmer

Question

I am doing a regression analysis on ecological data that is body size dependent, i.e. body size should be a covariate. So, I have two categorical variables, and one continuous. Moreover, two random effects (spatial block-structure, experimental unit nested within - both categorical). I am using lme and lmer, and my model looks like this (in lmer-syntax):

dep-var ~ fix-var1(cat) * fix-var2(cat) * covariate(cont) + (1|block/exp-unit)

Someone suggested using a slash / instead of the asterisk *for an ANCOVA model, so the formula would look like

dep-var ~ fix-var1 * fix-var2 / covariate + (1|block/exp-unit)

However, this gives me completely different output, suddenly interactions become significant, and main effects are gone. I have not been able to find details on what the exact differences between the use of these operators are.

Can anybody please enlighten me?

what is your random effect? (you must have a random effect if you're using lme or lmer ...) — Ben Bolker
oh no, looks like this is going to be another tumbleweed badge. did I miss something here? — mluerig
I should be able to get to this soon. I've been traveling. VERY short answer: the slash doesn't make sense. — Ben Bolker
PS you don't get a tumbleweed because there were comments ... sorry :-) — Ben Bolker

Ben Bolker Ben Bolker · Accepted Answer · 2017-12-22T17:00:16

Like many relating to the interpretation of model formulas in R, this question is not specific to mixed models, but applies generally to R's formula-expanding machinery.

A good way to figure out what R is doing when it interprets a formula is to use model.matrix() (which constructs the underlying model matrix that R uses to fit the model) and look at the column names of the output.

Here's an example with a 2x2 factorial design, plus a covariate:

dd <- expand.grid(fv1=c("a","b"),
                  fv2=c("A","B"))
dd$covar <- 1:4

Your first way:

colnames(model.matrix(~fv1*fv2*covar,dd))
## [1] "(Intercept)"     "fv1b"            "fv2B"            "covar"           "fv1b:fv2B"      
## [6] "fv1b:covar"      "fv2B:covar"      "fv1b:fv2B:covar"

There are a total of 8 parameters ((intercept + slope) x 2 levels of fv1 x 2 levels of fv2). The model is parameterized as the intercept of (a,A) ((Intercept)); differences in the intercept according to the factors (b, B) and their interaction; slope of (a,A) (covar); and the differences in the slope according to the factors and their interaction.

What happens if we use / instead?

colnames(model.matrix(~fv1*fv2/covar,dd))
## [1] "(Intercept)"     "fv1b"            "fv2B"            "fv1b:fv2B"       "fv1a:fv2A:covar"
## [6] "fv1b:fv2A:covar" "fv1a:fv2B:covar" "fv1b:fv2B:covar"

The parameterization for the intercepts looks the same, but the parameterization for the slopes estimates a separate slope for each factor combination rather than estimating the slope for (a,A) and the differences of the slopes from that baseline for b, B and their interaction. This is most likely not what you want, unless you want to test the individual slopes against a baseline of zero (rather than testing the differences among slopes across factor combinations).

If you instead specify the model as ~(fv1*fv2)/covar, then both the intercept and the slope parameters get expanded into factor-combination estimates rather than differences-among-factors estimates.

colnames(model.matrix(~(fv1*fv2)/covar,dd))
## [1] "(Intercept)"     "fv1b"           
## [3] "fv2B"            "fv1b:fv2B"      
## [5] "fv1a:fv2A:covar" "fv1b:fv2A:covar"
## [7] "fv1a:fv2B:covar" "fv1b:fv2B:covar"

Correct syntax for covariate in mixed models, or, asterisk vs. slash in lme/lmer

1 Answers