How can I measure the prediction performance of an mlr3 model on a test dataset? For example, suppose I create a k-nearest-neighbours model using mlr3 like so:
library("mlr3")
library("mlr3learners")
# get data and split into training and test
aq <- na.omit(airquality)
train <- sample(nrow(aq), round(.7*nrow(aq))) # split 70-30
aqTrain <- aq[train, ]
aqTest <- aq[-train, ]
# create model
aqT <- TaskRegr$new(id = "knn", backend = aqTrain, target = "Ozone")
aqL <- lrn("regr.kknn")
aqMod <- aqL$train(aqT)
I can measure the mean squared error of the model's predictions on the training data with something like:
prediction <- aqL$predict(aqT)
measure <- msr("regr.mse")
prediction$score(measure)
But how do I incorporate the test data into this? That is, how do I measure the performance of predictions on the test data?
In the previous version, mlr, I could get the predictions for the test dataset and then measure performance with, say, the MSE or R-squared, like so:
pred <- predict(aqMod, newdata = aqTest)
performance(pred, measures = list(mse, rsq))
Any suggestions as to how I can do this in mlr3?
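One way to score predictions on held-out data, as the comments on this question hint, is the learner's `predict_newdata()` method. The following is a minimal sketch under a few assumptions rather than a definitive answer: the `set.seed()` call and the names `testPred` and `scores` are added here for illustration, and the kknn package is assumed to be installed. Because `aqTest` still contains the `Ozone` column, the resulting prediction carries the true values and can be scored directly.

```r
library("mlr3")
library("mlr3learners")

set.seed(1)  # added for reproducibility; not part of the original code

# same 70-30 split as in the question
aq <- na.omit(airquality)
train <- sample(nrow(aq), round(0.7 * nrow(aq)))
aqTrain <- aq[train, ]
aqTest <- aq[-train, ]

# train on the training rows only
aqT <- TaskRegr$new(id = "knn", backend = aqTrain, target = "Ozone")
aqL <- lrn("regr.kknn")
aqL$train(aqT)

# predict on the unseen rows; the target column in aqTest supplies the truth
testPred <- aqL$predict_newdata(aqTest)

# score with several measures at once (MSE and R-squared)
scores <- testPred$score(msrs(c("regr.mse", "regr.rsq")))
print(scores)
```

This mirrors the old mlr pattern of `predict(..., newdata = ...)` followed by `performance()`: `msrs()` builds a list of measures and `$score()` returns a named numeric vector with one entry per measure.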
testRows are the indices of the rows in the training data. However, this produces an error: Error: DataBackend did not return the queried rows correctly: 33 requested, 17 received – Electrino
newdata in the same way as for mlr. – Lars Kotthoff
newdata. – Michel
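A plausible reading of the DataBackend error quoted above is that the task was built from `aqTrain` alone, so its backend holds only the training rows and most of the requested test row IDs do not exist in it. If you prefer selecting rows by ID over passing new data, a sketch of an alternative is to build the task on the full dataset and pass `row_ids` to both `$train()` and `$predict()`. The names `fullTask`, `trainRows`, `learner`, and `rowPred` are hypothetical, the seed is added for reproducibility, and the kknn package is assumed to be installed.

```r
library("mlr3")
library("mlr3learners")

set.seed(1)  # added for reproducibility

# one task that contains ALL rows, so every row ID is known to the backend
aq <- na.omit(airquality)
fullTask <- TaskRegr$new(id = "knn_full", backend = aq, target = "Ozone")

# split the task's own row IDs 70-30
trainRows <- sample(fullTask$row_ids, round(0.7 * fullTask$nrow))
testRows <- setdiff(fullTask$row_ids, trainRows)

# train on the training IDs, predict on the held-out IDs
learner <- lrn("regr.kknn")
learner$train(fullTask, row_ids = trainRows)
rowPred <- learner$predict(fullTask, row_ids = testRows)

# test-set mean squared error
print(rowPred$score(msr("regr.mse")))
```

Drawing the split from `fullTask$row_ids` rather than from data frame indices keeps the IDs consistent with whatever the backend actually stores.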