I'm new to linear mixed effects models and I'm trying to use them for hypothesis testing.

In my data (DF) I have two categorical/factor variables: color (red/blue/green) and direction (up/down). I want to see if there are significant differences in scores (numeric values) across these factors and if there is an interaction effect, while accounting for random intercepts and random slopes for each participant.

What is the appropriate lmer formula for doing this?


Here's what I have...

My data is structured like so:

> str(DF)

'data.frame':   4761 obs. of  4 variables:
 $ participant     : Factor w/ 100 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ direction       : Factor w/ 2 levels "down","up": 2 2 2 2 2 2 2 2 2 2 ...
 $ color           : Factor w/ 3 levels "red","blue",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ scores          : num  15 -4 5 25 0 3 16 0 5 0 ...

After some reading, I figured that I could write a model with random slopes and intercepts for participants and one fixed effect like so:

model_1 <- lmer(scores ~ direction + (direction|participant), data = DF) 

This gives me a fixed effect estimate and p-value for direction, which I understand to be a meaningful assessment of the effect of direction on scores while individual differences across participants are accounted for as a random effect.

But how do I add in my second fixed factor, color, and an interaction term whilst still affording each participant a random intercept and slope?

I thought maybe I could do this:

model_2 <- lmer(scores ~ direction * color + (direction|participant) + (color|participant), data = DF) 

But ultimately I really don't know what exactly this formula means. Any guidance would be appreciated.

1 Answer

You can include several random slopes in at least two ways:

  1. What you proposed: Estimate random slopes for both predictors, but don't estimate the correlation between them (i.e. assume the random slopes of different predictors don't correlate):
    scores ~ direction * color + (direction|participant) + (color|participant)

  2. The same but also estimate the correlation between random slopes of different predictors:
    scores ~ direction * color + (direction + color|participant)
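
To make the difference between the two specifications concrete, here is a minimal sketch on simulated data. The real DF is not available here, so the data, seed, and effect sizes below are invented purely for illustration. It fits both formulas and then inspects which random-effect columns and how many covariance parameters each one estimates.

```r
library(lme4)

# Simulated stand-in for DF (all numbers here are made up for illustration)
set.seed(1)
n_part <- 40
DF <- expand.grid(participant = 1:n_part,
                  direction = c("down", "up"),
                  color = c("red", "blue", "green"),
                  rep = 1:6,
                  stringsAsFactors = FALSE)
DF$participant <- factor(DF$participant)
DF$direction <- factor(DF$direction, levels = c("down", "up"))
DF$color <- factor(DF$color, levels = c("red", "blue", "green"))

# Each participant gets their own deviation in each direction-by-color cell
cell <- as.integer(interaction(DF$direction, DF$color))
dev <- matrix(rnorm(n_part * 6, sd = 2), nrow = n_part)
DF$scores <- 5 + 2 * (DF$direction == "up") +
  dev[cbind(as.integer(DF$participant), cell)] +
  rnorm(nrow(DF), sd = 4)

# Option 1: separate terms, no correlation between direction and color slopes
m_sep <- lmer(scores ~ direction * color +
                (direction | participant) + (color | participant),
              data = DF)

# Option 2: one term, all correlations estimated
m_joint <- lmer(scores ~ direction * color +
                  (direction + color | participant),
                data = DF)

# Column names of each random-effects term
cn_sep <- getME(m_sep, "cnms")
cn_joint <- getME(m_joint, "cnms")
print(cn_sep)
print(cn_joint)

# Number of variance/covariance parameters in each model
n_theta_sep <- length(getME(m_sep, "theta"))
n_theta_joint <- length(getME(m_joint, "theta"))
print(c(separate = n_theta_sep, joint = n_theta_joint))

VarCorr(m_joint)
```

On data like these, lme4 may print singular-fit or convergence messages; they do not stop the code from running. In the output for the first model, participant appears as two separate terms and `(Intercept)` is listed in both of them, because `(direction|participant)` is shorthand for `(1 + direction|participant)`.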

Please note two things:

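First, in both cases, random intercepts for "participant" are included, as are the correlations between each random slope and the random intercept. This probably makes sense unless you have theoretical reasons to the contrary. Be aware, though, that (direction|participant) is shorthand for (1 + direction|participant), so the first formula contains the participant intercept twice, once in each term, and in general only the sum of those two intercept variances is identified by the data. See this useful summary if you want to avoid the correlation between random intercepts and slopes.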
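
As a hedged illustration of what dropping the intercept-slope correlation can look like in lme4 (not necessarily the approach the linked summary takes), the sketch below fits one model with and one without that correlation. It uses a hand-built 0/1 column called dir_up, a name invented here, because splitting a term with `0 +` or `||` behaves as expected only for numeric predictors, not for factors. The data are simulated and all values are made up.

```r
library(lme4)

# Simulated data with a true random intercept and a true random slope
set.seed(2)
n_part <- 40
DF <- expand.grid(participant = 1:n_part, dir_up = c(0, 1), rep = 1:10)
DF$participant <- factor(DF$participant)
DF$direction <- factor(ifelse(DF$dir_up == 1, "up", "down"),
                       levels = c("down", "up"))
b0 <- rnorm(n_part, sd = 3)   # participant-specific intercept deviations
b1 <- rnorm(n_part, sd = 2)   # participant-specific slope deviations
p <- as.integer(DF$participant)
DF$scores <- 5 + b0[p] + (2 + b1[p]) * DF$dir_up + rnorm(nrow(DF), sd = 4)

# Intercept and slope allowed to correlate (same structure as direction|participant)
m_corr <- lmer(scores ~ direction + (1 + dir_up | participant), data = DF)

# Intercept and slope forced to be uncorrelated: two separate terms
m_nocorr <- lmer(scores ~ direction +
                   (1 | participant) + (0 + dir_up | participant),
                 data = DF)

VarCorr(m_corr)     # reports a correlation between intercept and slope
VarCorr(m_nocorr)   # two variances, no correlation

n_theta_corr <- length(getME(m_corr, "theta"))
n_theta_nocorr <- length(getME(m_nocorr, "theta"))
```

For a numeric column such as dir_up, `(1 + dir_up || participant)` is a shorter way of writing the second model.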

Second, neither formula includes a random slope for the interaction term! If the interaction effect is actually what you are interested in, you should at least try to fit a model with random slopes for it, so as to avoid potential bias in the test of the fixed interaction effect. Here, again, you can choose to allow or avoid correlations between the interaction term's random slopes and the other random slopes:

  Without correlation:
    scores ~ direction * color + (direction|participant) + (color|participant) + (direction:color|participant)

  With correlation:
    scores ~ direction * color + (direction * color|participant)
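
A minimal sketch of the "with correlation" version, again on simulated data (the data, seed, and effect sizes are invented, since the real DF is not available). It shows how many random-effect columns and covariance parameters the full specification asks lme4 to estimate, and checks whether the fit ended up on the boundary.

```r
library(lme4)

# Simulated stand-in for DF (all numbers here are made up for illustration)
set.seed(3)
n_part <- 40
DF <- expand.grid(participant = 1:n_part,
                  direction = c("down", "up"),
                  color = c("red", "blue", "green"),
                  rep = 1:6,
                  stringsAsFactors = FALSE)
DF$participant <- factor(DF$participant)
DF$direction <- factor(DF$direction, levels = c("down", "up"))
DF$color <- factor(DF$color, levels = c("red", "blue", "green"))

# Participant-specific deviation in every direction-by-color cell, which
# implies random intercepts, random slopes, and random interaction slopes
cell <- as.integer(interaction(DF$direction, DF$color))
dev <- matrix(rnorm(n_part * 6, sd = 2), nrow = n_part)
DF$scores <- 5 + 2 * (DF$direction == "up") +
  dev[cbind(as.integer(DF$participant), cell)] +
  rnorm(nrow(DF), sd = 4)

# Random intercept, both main-effect slopes, and the interaction slopes,
# with all correlations among them estimated
m_full <- lmer(scores ~ direction * color +
                 (direction * color | participant),
               data = DF)

re_cols <- getME(m_full, "cnms")[[1]]        # random-effect columns per participant
n_theta <- length(getME(m_full, "theta"))    # variance/covariance parameters
singular <- isSingular(m_full)               # TRUE if some variance is estimated as ~0

print(re_cols)
print(n_theta)
print(singular)
```

With a 2 x 3 design the full term has six columns per participant, so this model estimates many more covariance parameters than the additive ones. Singular-fit or convergence messages are common for a structure this rich and are a sign that the data may not support all of it.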

If you have no theoretical basis to decide between models with or without correlations between the random slopes, I suggest you do both, compare them with anova() and choose the one that fits your data better.
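
The comparison could be sketched as follows, on simulated data with invented values. Because `(direction|participant) + (color|participant)` would put the participant intercept in both terms, this sketch writes the uncorrelated model with hand-built 0/1 columns (dir_up, is_blue, is_green, names made up here) so that the intercept appears only once and the two models are cleanly nested. `refit = FALSE` keeps the REML fits, which is reasonable here because both models share the same fixed effects.

```r
library(lme4)

# Simulated stand-in for DF (all numbers here are made up for illustration)
set.seed(4)
n_part <- 40
DF <- expand.grid(participant = 1:n_part,
                  direction = c("down", "up"),
                  color = c("red", "blue", "green"),
                  rep = 1:6,
                  stringsAsFactors = FALSE)
DF$participant <- factor(DF$participant)
DF$direction <- factor(DF$direction, levels = c("down", "up"))
DF$color <- factor(DF$color, levels = c("red", "blue", "green"))
cell <- as.integer(interaction(DF$direction, DF$color))
dev <- matrix(rnorm(n_part * 6, sd = 2), nrow = n_part)
DF$scores <- 5 + 2 * (DF$direction == "up") +
  dev[cbind(as.integer(DF$participant), cell)] +
  rnorm(nrow(DF), sd = 4)

# Hand-built indicator columns (hypothetical names) for the random slopes
DF$dir_up <- as.numeric(DF$direction == "up")
DF$is_blue <- as.numeric(DF$color == "blue")
DF$is_green <- as.numeric(DF$color == "green")

# No correlation between the direction block and the color slopes
m_nocor <- lmer(scores ~ direction * color +
                  (1 + dir_up | participant) +
                  (0 + is_blue + is_green | participant),
                data = DF)

# All correlations estimated; same structure as (direction + color | participant)
m_cor <- lmer(scores ~ direction * color +
                (1 + dir_up + is_blue + is_green | participant),
              data = DF)

# Likelihood-ratio comparison of the two random-effects structures
cmp <- anova(m_nocor, m_cor, refit = FALSE)
print(cmp)

df_nocor <- attr(logLik(m_nocor), "df")
df_cor <- attr(logLik(m_cor), "df")
```

The table reports AIC, BIC, and a likelihood-ratio test; the model with the extra correlations is worth keeping only if it improves the fit enough to justify its additional parameters.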