I obtained the following code from this Stack Overflow question. caret train() predicts very different then predict.glm()
The following code is producing an error. I am using caret 6.0-52.
library(car); library(caret); library(e1071)
#data import and preparation
data(Chile)
chile <- na.omit(Chile) #remove "na's"
chile <- chile[chile$vote == "Y" | chile$vote == "N" , ] #only "Y" and "N" required
chile$vote <- factor(chile$vote) #required to remove unwanted levels
chile$income <- factor(chile$income) # treat income as a factor
tc <- trainControl("cv", 2, savePredictions=T, classProbs=TRUE,
summaryFunction=twoClassSummary) #"cv" = cross-validation, 10-fold
fit <- train(chile$vote ~ chile$sex +
chile$education +
chile$statusquo ,
data = chile ,
method = "glm" ,
family = binomial ,
metric = "ROC",
trControl = tc)
Running this code produces the following error.
Something is wrong; all the ROC metric values are missing:
ROC Sens Spec
Min. : NA Min. :0.9354 Min. :0.9187
1st Qu.: NA 1st Qu.:0.9354 1st Qu.:0.9187
Median : NA Median :0.9354 Median :0.9187
Mean :NaN Mean :0.9354 Mean :0.9187
3rd Qu.: NA 3rd Qu.:0.9354 3rd Qu.:0.9187
Max. : NA Max. :0.9354 Max. :0.9187
NA's :1
Error in train.default(x, y, weights = w, ...) : Stopping
In addition: Warning message:
In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, :
There were missing values in resampled performance measures.
Would anyone know what the issue is or can reproduce/ not reproduce this error. I've seen other answers to this error message that says this has to do with not having representation of classes in each cross validation fold but this isn't the issue as the number of folds is set to 2.