4
votes
> cv.ctrl <- trainControl(method = "repeatedcv", repeats = 3,
+                         summaryFunction = twoClassSummary,
+                         classProbs = TRUE)
> 
> set.seed(35)
> glm.tune.1 <- train(y ~ bool_3,
+                     data = train.batch,
+                     method = "glm",
+                     metric = "ROC",
+                     trControl = cv.ctrl)
Error in evalSummaryFunction(y, trControl, classLevels, metric, method) : 
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression


 > str(train.batch)
'data.frame':   128046 obs. of  42 variables:
 $ offer               : int  1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 1194044 ...
 $ avgPrice            : num  2.68 2.68 2.68 2.68 2.68 ...
 ...
 $ bool_3              : int  0 0 0 0 0 0 0 1 0 0 ...
 $ y                   : num  0 1 0 0 0 1 1 1 1 0 ...

Since the cv.ctrl has classProbs set to TRUE, I do not understand why this error message appears.

Can someone advise?

2

2 Answers

7
votes

Apparently this error is due to the fact that my y is not a Factor.

Following code works fine:

library(caret)
library(mlbench)
data(Sonar)

ctrl <- trainControl(method = "cv", 
                     summaryFunction = twoClassSummary, 
                     classProbs = TRUE)
set.seed(1)
gbmTune <- train(Class ~ ., data = Sonar,
                 method = "gbm",
                 metric = "ROC",
                 verbose = FALSE,                    
                 trControl = ctrl)

Then doing:

Sonar$Class = as.numeric(Sonar$Class)

and same code throws the error:

> gbmTune <- train(Class ~ ., data = Sonar,
+                  method = "gbm",
+                  metric = "ROC",
+                  verbose = FALSE,                    
+                  trControl = ctrl)
Error in evalSummaryFunction(y, trControl, classLevels, metric, method) : 
  train()'s use of ROC codes requires class probabilities. See the classProbs option of trainControl()
In addition: Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression

But caret train documentation says:

y   a numeric or factor vector containing the outcome for each sample.
1
votes

If you changed the values in y to "YES" and "NO" instead of 1 and zero respectively, then the code will run.

y=ifelse(train.batch$y==0,"No","Yes")