Regarding the gamm
error
This is a very interesting thing... Well, I should explain the logic first.
In principle it is illegal to fix a smoothing parameter in gamm
, because gamm
will treat the wiggly components of the smooth as random effects, the variance of which to be estimated by lme
(as you have Gaussian data). If you try to fix the smoothing parameter, that is essentially saying that you want to fix the variance of the random effect. Well, lme
does not allow you to do this (and I doubt whether such attempt is legal in mixed modelling, either.)
Therefore gamm
would disable any possible constraints on smoothing parameters, including:
- lower bound of smoothing parameter, via
min.sp
;
- linked smooth sharing the same smoothing parameter, via
id
in s()
;
- directly specified smoothing parameter via
sp
, via s()
.
The first two are perfectly checked. gamm
has no min.sp
argument like gam
; even if you specify it via ...
, there is no chance it would get used (as later it's NULL
that's passed to gam.setup
during gamm.setup
, so your specified min.sp
is ignored). Specification of id
will also be detected by the error message you saw, but of course you did not specify id
so the above error is not reporting the correct issue here, hence a bug.
The third one, actually has not been directly checked via gamm
. Ideally, as soon as the gamm
/ gam
formula has been interpreted (by interpret.gam
), sp
should be reset to -1
if it is not readily is, and a warning message about this should be issued. However, this part is missing. Therefore, at the moment the best thing you can do is just not specifying sp
.
Do you have valid model nesting?
Now let's turn to your concern on nesting. Nesting is related to basis set up rather than choice of smoothing parameter. As long as you have the same set of basis (same basis type, same number and / or location of "knots"), the model matrix will be the same. For example, in your model M0
and M1
, you have the same configuration of the s()
with mgcv
default bs = 'tp', k = 10
. So the design matrix for s()
is the same in your two models. The by = factor(gender)
just replicates this s()
to all levels of gender
in your main model M0
. Perhaps it is not easily seen, but actually your M1
is nested in M0
.
Let's consider a small example to verify this. For simplicity, I will not use s(x)
from mgcv
, but use poly(x, degree = 2)
(imagining it is s(x)
). Let's generate some toy data first:
x <- 1:10
f <- gl(2, 5, labels = c("M", "F"))
Since f
is not an ordered factor, s(x, by = factor(f))
generates design matrix by replicating s(x)
for all levels of f
:
## original design matrix for `s(x)`
X0 <- poly(x, 2)
## design matrix for `f`, without contrasting
Xf <- model.matrix(~ f + 0)
## design matrix for level `M`
X1 <- X0 * Xf[, 1]
## design matrix for level `F`
X2 <- X0 * Xf[, 2]
## design matrix for `s(x, by = f)` "please, imagine it as `poly`"
X <- cbind(X1, X2)
# 1 2 1 2
# [1,] -0.49543369 0.52223297 0.00000000 0.00000000
# [2,] -0.38533732 0.17407766 0.00000000 0.00000000
# [3,] -0.27524094 -0.08703883 0.00000000 0.00000000
# [4,] -0.16514456 -0.26111648 0.00000000 0.00000000
# [5,] -0.05504819 -0.34815531 0.00000000 0.00000000
# [6,] 0.00000000 0.00000000 0.05504819 -0.34815531
# [7,] 0.00000000 0.00000000 0.16514456 -0.26111648
# [8,] 0.00000000 0.00000000 0.27524094 -0.08703883
# [9,] 0.00000000 0.00000000 0.38533732 0.17407766
#[10,] 0.00000000 0.00000000 0.49543369 0.52223297
Your second model M1
, has only a smooth term s(x)
, the design matrix of which is X0
.
Here is how we can see that your M1
is nested in M0
:
- As
X1 + X2 = X0
, design matrix of s(x)
and s(x, by = f)
have the same column span, thus s(x)
is nested in s(x, by = f)
;
- since
x
is nested in s(x)
, x:f
is nested in s(x, by = f)
.
What I would do
Although you models are already nicely nested, the main model M0
does not have a comparable interpretation with your M1
. Your main model M0
will end up with an independent smooth for each level, while your M1
focus on the difference between two groups.
It would be good if we could control M0
to admit a form of "reference smooth + difference smooth". Then, if the difference smooth turns out a line, without actually fitting M1
we already know there is no evidence for non-linear difference between groups.
In mgcv
, difference smooth will be constructed, if your factor is ordered. So I suggest you fit your main model by:
gender1 <- ordered(gender) ## create an ordered factor
s(x) + s(x, by = gender1) + gender
If estimation result shows the difference smooth s(x, by = gender1)
as a line, you know you can instead fit a simpler model
s(x) + gender:x + gender
even without using F-test
.
Note, it is very important to have an ordered factor by
in order to construct "difference" smooth. If you do
s(x) + s(x, by = gender) + gender ## note, it is "gender" in "by"
s(x)
and s(x, by = gender)
are completely linearly dependent. The resulting model matrix will be rank-deficient.
Some follow-up
I forgot to include in my example that I at first compared the same model parametrized as s(x, by = as.factor(gender))
and s(x) + s(x, by = gender)
by AIC (recall gender
is 0-1 numerical variable). These models are statistically equivalent, but the smoothing parameters are obviously estimated differently in the cases, and the AIC thus differ a bit.
Oh, yes. Your gender
is binary so a numerical by
is also a good idea for constructing a difference smooth. But do this with care. Numerical by
does not yield a centred smooth. Therefore, s(x) + s(x, by = gender)
will implicitly have an intercept column, confounded with the model intercept. You should go with s(x) + s(x, by = gender) - 1
.