mgcv: Fix smoothing parameter in GAMM and validity of model nesting

Question

I am currently trying to model the interaction between a covariate x (Age) and a gender 0-1 variable with gamm from the mgcv package. After specifying a main model (lets call it M0) with a smooth term for each gender, I would like to test the simpler hypothesis that the difference between genders is linear (and not arbitrarily smooth). I have the following two questions:

When trying to nest the models properly I would like to get out the smoothing parameter for the 0-gender smooth from M0, and use it in the simpler model specification. But I get the following error message:

Error in gamm.setup(gp, pterms = pTerms, data = mf, knots = knots, parametric.only = FALSE, :

gamm can not handle linked smoothing parameters (probably from use of `id' or adaptive smooths)

Second question is the more stupid one. Are the models even nested when I go from a smooth for each gender to a 0-gender smooth and a linear difference to 1-gender?

Below follows an example. I simulated some random data, so the data does not display the behaviour of my actual data, but the problems remain the same.

library(mgcv) 

### Simulate random data
x <- rnorm(100, mean = 10, sd = 1.5)
y <- rnorm(100, mean = 1, sd = 0.025)

id <- sample(1:10, size = 100, replace = T)
id <- as.factor(id)

gender <- sample(c(0,1), size = 100, replace = T)


### Specify main model, M0
ctrl <- list(niterEM=0,optimMethod="L-BFGS-B", msMaxIter = 100)
M0 <- gamm(y~s(x, by = as.factor(gender)) + gender, 
           random=list(id=~1+x), control=ctrl)

### Specify model with linear difference between gender0 and gender1
M1 <- gamm(y~s(x) + gender:x + gender, 
           random=list(id=~1+x), control=ctrl)

### Testing
anova(M0$lme, M1$lme)

### Problems:
sp0 <- M0$gam$sp[1]
M1 <- gamm(y~s(x, sp = sp0) + gender:x + gender, 
           random=list(id=~1+x), control=ctrl)

Error in gamm.setup(gp, pterms = pTerms, data = mf, knots = knots, parametric.only = FALSE, :

gamm can not handle linked smoothing parameters (probably from use of `id' or adaptive smooths)

Any thoughts? Thanks in advance.

Zheyuan Li Zheyuan Li · Accepted Answer · 2016-11-09T12:41:40

Regarding the `gamm` error

This is a very interesting thing... Well, I should explain the logic first.

In principle it is illegal to fix a smoothing parameter in gamm, because gamm will treat the wiggly components of the smooth as random effects, the variance of which to be estimated by lme (as you have Gaussian data). If you try to fix the smoothing parameter, that is essentially saying that you want to fix the variance of the random effect. Well, lme does not allow you to do this (and I doubt whether such attempt is legal in mixed modelling, either.)

Therefore gamm would disable any possible constraints on smoothing parameters, including:

lower bound of smoothing parameter, via min.sp;
linked smooth sharing the same smoothing parameter, via id in s();
directly specified smoothing parameter via sp, via s().

The first two are perfectly checked. gamm has no min.sp argument like gam; even if you specify it via ..., there is no chance it would get used (as later it's NULL that's passed to gam.setup during gamm.setup, so your specified min.sp is ignored). Specification of id will also be detected by the error message you saw, but of course you did not specify id so the above error is not reporting the correct issue here, hence a bug.

The third one, actually has not been directly checked via gamm. Ideally, as soon as the gamm / gam formula has been interpreted (by interpret.gam), sp should be reset to -1 if it is not readily is, and a warning message about this should be issued. However, this part is missing. Therefore, at the moment the best thing you can do is just not specifying sp.

Do you have valid model nesting?

Now let's turn to your concern on nesting. Nesting is related to basis set up rather than choice of smoothing parameter. As long as you have the same set of basis (same basis type, same number and / or location of "knots"), the model matrix will be the same. For example, in your model M0 and M1, you have the same configuration of the s() with mgcv default bs = 'tp', k = 10. So the design matrix for s() is the same in your two models. The by = factor(gender) just replicates this s() to all levels of gender in your main model M0. Perhaps it is not easily seen, but actually your M1 is nested in M0.

Let's consider a small example to verify this. For simplicity, I will not use s(x) from mgcv, but use poly(x, degree = 2) (imagining it is s(x)). Let's generate some toy data first:

x <- 1:10
f <- gl(2, 5, labels = c("M", "F"))

Since f is not an ordered factor, s(x, by = factor(f)) generates design matrix by replicating s(x) for all levels of f:

## original design matrix for `s(x)`
X0 <- poly(x, 2)
## design matrix for `f`, without contrasting
Xf <- model.matrix(~ f + 0)
## design matrix for level `M`
X1 <- X0 * Xf[, 1]
## design matrix for level `F`
X2 <- X0 * Xf[, 2]
## design matrix for `s(x, by = f)` "please, imagine it as `poly`"
X <- cbind(X1, X2)

#                1           2          1           2
# [1,] -0.49543369  0.52223297 0.00000000  0.00000000
# [2,] -0.38533732  0.17407766 0.00000000  0.00000000
# [3,] -0.27524094 -0.08703883 0.00000000  0.00000000
# [4,] -0.16514456 -0.26111648 0.00000000  0.00000000
# [5,] -0.05504819 -0.34815531 0.00000000  0.00000000
# [6,]  0.00000000  0.00000000 0.05504819 -0.34815531
# [7,]  0.00000000  0.00000000 0.16514456 -0.26111648
# [8,]  0.00000000  0.00000000 0.27524094 -0.08703883
# [9,]  0.00000000  0.00000000 0.38533732  0.17407766
#[10,]  0.00000000  0.00000000 0.49543369  0.52223297

Your second model M1, has only a smooth term s(x), the design matrix of which is X0.

Here is how we can see that your M1 is nested in M0:

As X1 + X2 = X0, design matrix of s(x) and s(x, by = f) have the same column span, thus s(x) is nested in s(x, by = f);
since x is nested in s(x), x:f is nested in s(x, by = f).

What I would do

Although you models are already nicely nested, the main model M0 does not have a comparable interpretation with your M1. Your main model M0 will end up with an independent smooth for each level, while your M1 focus on the difference between two groups.

It would be good if we could control M0 to admit a form of "reference smooth + difference smooth". Then, if the difference smooth turns out a line, without actually fitting M1 we already know there is no evidence for non-linear difference between groups.

In mgcv, difference smooth will be constructed, if your factor is ordered. So I suggest you fit your main model by:

gender1 <- ordered(gender)  ## create an ordered factor
s(x) + s(x, by = gender1) + gender

If estimation result shows the difference smooth s(x, by = gender1) as a line, you know you can instead fit a simpler model

s(x) + gender:x + gender

even without using F-test.

Note, it is very important to have an ordered factor by in order to construct "difference" smooth. If you do

s(x) + s(x, by = gender) + gender  ## note, it is "gender" in "by"

s(x) and s(x, by = gender) are completely linearly dependent. The resulting model matrix will be rank-deficient.

Some follow-up

I forgot to include in my example that I at first compared the same model parametrized as s(x, by = as.factor(gender)) and s(x) + s(x, by = gender) by AIC (recall gender is 0-1 numerical variable). These models are statistically equivalent, but the smoothing parameters are obviously estimated differently in the cases, and the AIC thus differ a bit.

Oh, yes. Your gender is binary so a numerical by is also a good idea for constructing a difference smooth. But do this with care. Numerical by does not yield a centred smooth. Therefore, s(x) + s(x, by = gender) will implicitly have an intercept column, confounded with the model intercept. You should go with s(x) + s(x, by = gender) - 1.