How to understand/expand r formula with matrix term on RHS

Question

I have been doing piecewise mixed-effects growth curve regression in lme4 using the knots() function posted at https://stats.stackexchange.com/questions/13361/fitting-piecewise-linear-curves-with-lmer to create knots and then calling lmer like this:

> df$knot<-knots(df$time,seq(1.5,3.5,.5)

> lmer(outcome ~ predictor*knot + (1+knot|id), data=df)

This works fine. str(df$knot) shows that it's a matrix:

num [1:1492895, 1:5] 0 0 0 0 0 0 0 0 0 0 ...
- attr(*, "dimnames")=List of 2
 ..$ : NULL
 ..$ : chr [1:5] "1.5" "2" "2.5" "3" ...

For speed, I'd like to fit these models using Julia (via JuliaCall) instead, but Julia doesn't grok a matrix RHS like R does.

So, my question: how does R lm/lmer understand a matrix on the RHS of a formula, and can I "expand" the matrix into regular vector columns in the data frame instead so Julia won't complain?

I tried:

> df$knot1<-df$knot[,1]

> ...

> df$knot5<-df$knot[,5]

and then using the formula

outcome ~ predictor*(knot1+knot2+knot3+knot4+knot5) + (1+knot1+knot2+knot3+knot4+knot5|id)

Is that right? What does R do with the RHS matrix predictor?

If you remove knot from the random effects I think it will run pretty fast with lmer. lmer(outcome ~ predictor*knot + (1|id), data=df) . See vignette("lmerperf") and additionally follow the tips there. — G. Grothendieck
That's true, but then I wouldn't be fitting a growth model. (Or at least not the kind I'm trying to, with random slopes) — Alan Schwartz
You would still have random intercepts and knots would still be used in the fixed effects but at any rate the point is that with this change the model should run quite a bit faster. — G. Grothendieck
Thanks. I'd still like to understand what R's doing so I can try to do it "manually". — Alan Schwartz
If fm is the result of running lmer then model.matrix(fm) and model.matrix(fm, "random") will give you the model matrices. — G. Grothendieck

Alan Schwartz Alan Schwartz · Accepted Answer · 2019-02-07T03:46:26

Based on running test models with smaller data sets, it does appear that expanding the matrix into five dummy variables does work as expected, with R and Julia producing the same results (but with Julia running in 6 minutes and lme4 in about an hour for the same random intercept and slope model on a 5% subset of my data).

How to understand/expand r formula with matrix term on RHS

1 Answers