
I am trying to extract the coefficients from a model trained with the makeStackedLearner function (mlr package) when the meta-learner is a GLM or similar. The coefficients are useful because they show which base models contribute the most to the final predictions.

I have also asked this question in the mlr GitHub issues (https://github.com/mlr-org/mlr/issues/2598).

library(mlr)
data(BostonHousing, package = "mlbench")
tsk = makeRegrTask(data = BostonHousing, target = "medv")
base = c("regr.rpart", "regr.svm")
lrns = lapply(base, makeLearner)
m = makeStackedLearner(base.learners = lrns,
                       predict.type = "response", method = "compress")
tmp = train(m, tsk)

Where can I find the regression coefficients of the super.model? The trained object only exposes these elements:

> names(tmp$learner.model)
[1] "method"        "base.learners" "super.model"   "pred.train" 

1 Answer


Thanks to the mlr package developers, this has now been resolved. The easiest approach is to set super.learner = "regr.glm" (together with method = "stack.cv"), so that the final super learner is a plain GLM that is easy to interpret:

> library(mlr)
> data(BostonHousing, package = "mlbench")
> BostonHousing$chas = as.numeric(BostonHousing$chas)
> tsk = makeRegrTask(data = BostonHousing, target = "medv")
> base = c("regr.rpart", "regr.svm", "regr.ranger")
> lrns = lapply(base, makeLearner)
> m = makeStackedLearner(base.learners = lrns,
+                        predict.type = "response", method = "stack.cv", super.learner = "regr.glm")
> tmp = train(m, tsk)
> tmp$learner.model$super.model$learner.model

Call:  stats::glm(formula = f, family = family, data = d, control = ctrl, 
    model = FALSE)

Coefficients:
(Intercept)   regr.rpart     regr.svm  regr.ranger  
   -2.49071     -0.05411      0.28542      0.88404  

Degrees of Freedom: 505 Total (i.e. Null);  502 Residual
Null Deviance:      42720 
Residual Deviance: 5506     AIC: 2654
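
Since the printout above shows that the super model is an ordinary stats::glm fit, the usual base R accessors should work on it. The following is a minimal sketch, not the package's official way of doing this: it assumes the mlr, mlbench, rpart and e1071 packages are installed, uses only two base learners to keep it short, and the set.seed() call and the variable names fit and cf are my own additions. Because stack.cv resamples the data, the numbers will differ from the printout above.

```r
library(mlr)

# stack.cv uses cross-validation internally, so fix the seed
# to get repeatable coefficients (assumption: any seed will do)
set.seed(1)

data(BostonHousing, package = "mlbench")
tsk = makeRegrTask(data = BostonHousing, target = "medv")
lrns = lapply(c("regr.rpart", "regr.svm"), makeLearner)
m = makeStackedLearner(base.learners = lrns,
                       predict.type = "response",
                       method = "stack.cv",
                       super.learner = "regr.glm")
tmp = train(m, tsk)

# The fitted meta-learner, as printed in the answer above
fit = tmp$learner.model$super.model$learner.model

# Named numeric vector: intercept plus one weight per base learner
cf = coef(fit)
print(cf)

# Estimates with standard errors and p-values
print(summary(fit)$coefficients)
```

Having the coefficients as a named vector makes it easy to sort or plot them when comparing how much each base learner contributes.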