1
votes

I am doing a regression analysis on ecological data that is body size dependent, i.e. body size should be a covariate. So, I have two categorical variables, and one continuous. Moreover, two random effects (spatial block-structure, experimental unit nested within - both categorical). I am using lme and lmer, and my model looks like this (in lmer-syntax):

dep-var ~ fix-var1(cat) * fix-var2(cat) * covariate(cont) + (1|block/exp-unit)

Someone suggested using a slash / instead of the asterisk *for an ANCOVA model, so the formula would look like

dep-var ~ fix-var1 * fix-var2 / covariate + (1|block/exp-unit)

However, this gives me completely different output, suddenly interactions become significant, and main effects are gone. I have not been able to find details on what the exact differences between the use of these operators are.

Can anybody please enlighten me?

1
what is your random effect? (you must have a random effect if you're using lme or lmer ...)Ben Bolker
right, forgot to mention. edited the questionmluerig
oh no, looks like this is going to be another tumbleweed badge. did I miss something here?mluerig
I should be able to get to this soon. I've been traveling. VERY short answer: the slash doesn't make sense.Ben Bolker
PS you don't get a tumbleweed because there were comments ... sorry :-)Ben Bolker

1 Answers

0
votes

Like many relating to the interpretation of model formulas in R, this question is not specific to mixed models, but applies generally to R's formula-expanding machinery.

A good way to figure out what R is doing when it interprets a formula is to use model.matrix() (which constructs the underlying model matrix that R uses to fit the model) and look at the column names of the output.

Here's an example with a 2x2 factorial design, plus a covariate:

dd <- expand.grid(fv1=c("a","b"),
                  fv2=c("A","B"))
dd$covar <- 1:4

Your first way:

colnames(model.matrix(~fv1*fv2*covar,dd))
## [1] "(Intercept)"     "fv1b"            "fv2B"            "covar"           "fv1b:fv2B"      
## [6] "fv1b:covar"      "fv2B:covar"      "fv1b:fv2B:covar"

There are a total of 8 parameters ((intercept + slope) x 2 levels of fv1 x 2 levels of fv2). The model is parameterized as the intercept of (a,A) ((Intercept)); differences in the intercept according to the factors (b, B) and their interaction; slope of (a,A) (covar); and the differences in the slope according to the factors and their interaction.

What happens if we use / instead?

colnames(model.matrix(~fv1*fv2/covar,dd))
## [1] "(Intercept)"     "fv1b"            "fv2B"            "fv1b:fv2B"       "fv1a:fv2A:covar"
## [6] "fv1b:fv2A:covar" "fv1a:fv2B:covar" "fv1b:fv2B:covar"

The parameterization for the intercepts looks the same, but the parameterization for the slopes estimates a separate slope for each factor combination rather than estimating the slope for (a,A) and the differences of the slopes from that baseline for b, B and their interaction. This is most likely not what you want, unless you want to test the individual slopes against a baseline of zero (rather than testing the differences among slopes across factor combinations).

If you instead specify the model as ~(fv1*fv2)/covar, then both the intercept and the slope parameters get expanded into factor-combination estimates rather than differences-among-factors estimates.

colnames(model.matrix(~(fv1*fv2)/covar,dd))
## [1] "(Intercept)"     "fv1b"           
## [3] "fv2B"            "fv1b:fv2B"      
## [5] "fv1a:fv2A:covar" "fv1b:fv2A:covar"
## [7] "fv1a:fv2B:covar" "fv1b:fv2B:covar"