Explanation
library(mgcv)
#Loading required package: nlme
#This is mgcv 1.8-24. For overview type 'help("mgcv-package")'.
f1 <- ~ s(x, bs = 'cr', k = -1)
f2 <- ~ mgcv::s(x, bs = 'cr', k = -1)
OK <- mgcv:::interpret.gam0(f1)$smooth.spec
FAIL <- mgcv:::interpret.gam0(f2)$smooth.spec
str(OK)
# $ :List of 10
# ..$ term : chr "x"
# ..$ bs.dim : num -1
# ..$ fixed : logi FALSE
# ..$ dim : int 1
# ..$ p.order: logi NA
# ..$ by : chr "NA"
# ..$ label : chr "s(x)"
# ..$ xt : NULL
# ..$ id : NULL
# ..$ sp : NULL
# ..- attr(*, "class")= chr "cr.smooth.spec"
str(FAIL)
# list()
The 4th line of the source code of interpret.gam0 reveals the issue:
head(mgcv:::interpret.gam0)
#1 function (gf, textra = NULL, extra.special = NULL)
#2 {
#3 p.env <- environment(gf)
#4 tf <- terms.formula(gf, specials = c("s", "te", "ti", "t2",
#5 extra.special))
#6 terms <- attr(tf, "term.labels")
Since "mgcv::s" is not among the specials, terms.formula does not recognise it as a smooth term, which is why smooth.spec comes back empty. But mgcv does leave room for a workaround: pass "mgcv::s" via the argument extra.special:
FIX <- mgcv:::interpret.gam0(f2, extra.special = "mgcv::s")$smooth.spec
all.equal(FIX, OK)
# [1] TRUE
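The name matching can also be looked at in isolation. The following is a minimal sketch using only base R's terms(); the formulas and the object names (tf, tf_plain, tf_extra) are made up for illustration. The result for the plain s(x) case is stable, but how the namespace-qualified spelling is matched may depend on the R version, so those two results are only printed and not asserted.

```r
# terms() does not evaluate s(); it only matches function names against 'specials'.
tf <- terms(y ~ s(x) + z, specials = "s")
attr(tf, "term.labels")   # labels of the model terms
attr(tf, "specials")$s    # position of s(x) among the variables (response counted)

# The namespace-qualified spelling, without and with the extra special.
# Inspect the output on your own installation; it may differ between R versions.
tf_plain <- terms(~ mgcv::s(x), specials = "s")
tf_extra <- terms(~ mgcv::s(x), specials = c("s", "mgcv::s"))
str(attr(tf_plain, "specials"))
str(attr(tf_extra, "specials"))
```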
The catch is that extra.special is not exposed to the user by the high-level routines:
head(mgcv::gam, n = 10)
#1 function (formula, family = gaussian(), data = list(), weights = NULL,
#2 subset = NULL, na.action, offset = NULL, method = "GCV.Cp",
#3 optimizer = c("outer", "newton"), control = list(), scale = 0,
#4 select = FALSE, knots = NULL, sp = NULL, min.sp = NULL, H = NULL,
#5 gamma = 1, fit = TRUE, paraPen = NULL, G = NULL, in.out = NULL,
#6 drop.unused.levels = TRUE, drop.intercept = NULL, ...)
#7 {
#8 control <- do.call("gam.control", control)
#9 if (is.null(G)) {
#10 gp <- interpret.gam(formula) ## <- default to extra.special = NULL
I agree with Ben Bolker. It is a good exercise to dig out what happens inside, but it is an over-reaction to treat this as a bug that needs fixing.
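If formulas with the qualified notation have to be accepted anyway (say, they are generated by other code), one option that leaves mgcv untouched is to strip the prefix before the formula reaches gam. The helper below is a hypothetical sketch, not part of mgcv; strip_ns and its pkg argument are names invented here, and it only handles the simple pkg::fun(...) pattern.

```r
# Hypothetical helper: rewrite calls such as mgcv::s(...) to s(...) inside an
# expression or formula. Attributes of a formula (class, environment) are kept
# because elements are replaced in place.
strip_ns <- function(expr, pkg = "mgcv") {
  if (!is.call(expr)) return(expr)
  fn <- expr[[1]]
  # Is the function part itself a call of the form pkg::name ?
  if (is.call(fn) && identical(fn[[1]], as.name("::")) &&
      identical(fn[[2]], as.name(pkg))) {
    expr[[1]] <- fn[[3]]
  }
  # Recurse into the arguments; skip NULLs, since assigning NULL would drop them.
  for (i in seq_along(expr)[-1]) {
    if (!is.null(expr[[i]])) expr[[i]] <- strip_ns(expr[[i]], pkg)
  }
  expr
}

f2 <- ~ mgcv::s(x, bs = 'cr', k = -1)
g <- strip_ns(f2)
g
# Intended use (not run here): mgcv::gam(strip_ns(form), data = dat)
```

The sketch does not cope with empty arguments such as x[, 1], which is one small illustration of why formula manipulation is easy to get wrong.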
More insight: s, te, etc. in mgcv do not work by the same logic as stats::poly and splines::bs.
- When you do, for example, X <- splines::bs(x, df = 10, degree = 3), it evaluates x and creates a design matrix X directly.
- When you do s(x, bs = 'cr', k = 10), x is not evaluated; the call is only parsed into a smooth specification.
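The contrast in the two bullets above can be checked directly. This is a small sketch with made-up data; the point is that s() still returns a specification after x has been removed, whereas bs() needs the values of x.

```r
library(splines)
library(mgcv)

x <- seq(0, 1, length.out = 50)
X <- splines::bs(x, df = 10, degree = 3)  # evaluates x, returns a basis matrix
dim(X)

rm(x)                                     # x no longer exists
spec <- mgcv::s(x, bs = "cr", k = 10)     # still fine: x is only recorded by name
class(spec)
spec$term
```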
Smooth construction in mgcv takes several stages:
- parsing / interpretation by mgcv::interpret.gam, which generates a profile for a smoother;
- initial construction by mgcv::smooth.construct, which sets up basis / design matrix and penalty matrix (mostly done at C-level);
- secondary construction by mgcv::smoothCon, which picks up "by" variable (duplicating smooth for factor "by", for example), linear functional terms, null space penalty (if you use select = TRUE), penalty rescaling, centering constraint, etc;
- final integration by mgcv:::gam.setup, which combines all smoothers together, returning a model matrix, etc.
So, it is a far more complicated process.
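The first stages can be stepped through by hand. The sketch below uses simulated data (dat and the seed are arbitrary) and goes from the parsed specification to a constructed smooth via smoothCon, which, as far as I know, calls smooth.construct internally. The final gam.setup stage is left out because it is not an exported function.

```r
library(mgcv)

set.seed(1)
dat <- data.frame(x = runif(100))

# Stage 1: parsing. No data is needed and y does not have to exist.
spec <- interpret.gam(y ~ s(x, bs = "cr", k = 10))$smooth.spec[[1]]
class(spec)

# Stages 2 and 3: construction. The data is required here.
sm <- smoothCon(spec, data = dat, knots = NULL)[[1]]
dim(sm$X)        # design matrix of the smooth
dim(sm$S[[1]])   # its penalty matrix
```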
... s() for anything other than a smooth term in a gam() formula, that would be pretty crazy ... – Ben Bolker
... s in other modelling packages and, therefore, wanted to be clear. I guess that as soon as gam from mgcv is called, the call would probably parse everything within the model call interpreted as syntax from mgcv. So the mgcv:: in front of s is not necessary. Still, using this notation should not result in an error to be consistent with the general notation. But since I know I should not use this notation in this specific case I will follow the doctor's advice ;-). – Manuel Bickel
... s() lurking ahead of mgcv::s on the search path will interfere, manipulating formulas is one of the very hard things you can attempt to do with R. – Gavin Simpson