I am trying to learn how caret works by following Max Kuhn's Applied Predictive Modeling book, but I was not able to understand how caret's confusionMatrix function works.
I fit a glmnet model to the training data set (training[, fullSet]), which has 8190 rows and 1073 columns, as follows:
glmnGrid <- expand.grid(alpha = c(0, .1, .2, .4, .6, .8, 1),
                        lambda = seq(.01, .2, length = 40))
ctrl <- trainControl(method = "cv",
                     number = 10,
                     summaryFunction = twoClassSummary,
                     classProbs = TRUE,
                     index = list(TrainSet = pre2008),
                     savePredictions = TRUE)
glmnFit <- train(x = training[, fullSet],
                 y = training$Class,
                 method = "glmnet",
                 tuneGrid = glmnGrid,
                 preProc = c("center", "scale"),
                 metric = "ROC",
                 trControl = ctrl)
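For reference, the number of resamples and the hold-out size implied by an `index` list like the one above can be counted in base R. This is only a sketch on hypothetical toy indices (`n_rows`, `toy_pre2008`, `toy_index` and `toy_holdout` are made-up names, not the grant data). It assumes what the trainControl documentation describes: when `indexOut` is not given, the rows missing from an `index` element are held out for that resample.

```r
# Hypothetical toy setup: 20 rows, the first 14 play the role of `pre2008`.
n_rows <- 20
toy_pre2008 <- 1:14

# Same shape as the `index` argument above: one list element per resample.
toy_index <- list(TrainSet = toy_pre2008)

# Assumption (from the trainControl docs): with no `indexOut`, rows not
# listed in an `index` element form that resample's hold-out set.
toy_holdout <- lapply(toy_index, function(idx) setdiff(seq_len(n_rows), idx))

n_resamples <- length(toy_index)        # resamples defined by the list
holdout_sizes <- lengths(toy_holdout)   # hold-out rows per resample

print(n_resamples)
print(holdout_sizes)
```

On the real data, the analogous quantities would be `length(pre2008)` and `nrow(training) - length(pre2008)`.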
Then I computed the confusion matrix from the fit:
glmnetCM <- confusionMatrix(glmnFit, norm = "none")
When I looked at the confusion matrix, I got the following result:
              Reference
Prediction     successful unsuccessful
  successful          507          208
  unsuccessful         63          779
However, I don't understand why the confusion table contains only 1557 observations (507 + 208 + 63 + 779 = 1557). The documentation for caret's confusionMatrix.train says that "When train is used for tuning a model, it tracks the confusion matrix cell entries for the hold-out samples." Since the training data set has 8190 rows and I used 10-fold CV, I expected the confusion matrix to be based on 819 data points (8190 / 10 = 819), but it is not.
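The arithmetic can be checked directly in base R. This is only a sketch: the matrix `cm` is re-entered by hand from the printed table (it is not extracted from `glmnetCM`), and `expected_fold` is the hold-out size I was expecting under 10-fold CV.

```r
# Cell counts copied by hand from the printed table (filled column by column).
cm <- matrix(c(507, 63, 208, 779), nrow = 2,
             dimnames = list(Prediction = c("successful", "unsuccessful"),
                             Reference  = c("successful", "unsuccessful")))

total_cells   <- sum(cm)     # observations counted in the table
expected_fold <- 8190 / 10   # hold-out size I expected for one of 10 folds

print(total_cells)     # 1557
print(expected_fold)   # 819
```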
Clearly I don't fully understand how caret's trainControl or train works. Can somebody explain what I misunderstood?
Thanks so much for your help.
Young-Jin Lee