All the accuracy values are missing with ranger and classProbs = TRUE

Question

library(dplyr)
library(caret)
library(doParallel)

cl <- makeCluster(3, outfile = '')
registerDoParallel(cl)
set.seed(2019)
fit1 <- train(x = X_train %>% head(1000) %>% as.matrix(),
              y = y_train %>% head(1000),
              method = 'ranger', 
              verbose = TRUE,
              trControl = trainControl(method = 'oob',
                                       verboseIter = TRUE,
                                       allowParallel = TRUE,
                                       classProbs = TRUE),
              tuneGrid = expand.grid(mtry = 2:3,
                                     min.node.size = 1, 
                                     splitrule = 'gini'),
              num.tree = 100,
              metric = 'Accuracy',
              importance = 'permutation')
stopCluster(cl)

The code above results in the error:

Aggregating results
Something is wrong; all the Accuracy metric values are missing:
    Accuracy       Kappa
 Min.   : NA   Min.   : NA
 1st Qu.: NA   1st Qu.: NA
 Median : NA   Median : NA
 Mean   :NaN   Mean   :NaN
 3rd Qu.: NA   3rd Qu.: NA
 Max.   : NA   Max.   : NA
 NA's   :2     NA's   :2
ERROR: Stopping

I've already searched SO for this error and found that there are many possible reasons behind it. Unfortunately, I didn't find anything applicable to my case. Here, the issue seems to be with classProbs = TRUE: when I remove it, so that the default value of FALSE is used, the model is trained successfully. However, I don't see why it would be a problem, since according to the documentation classProbs is:

a logical; should class probabilities be computed for classification models (along with predicted values) in each resample?

Data sample:

X_train <- structure(list(V5 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), V1 = c(41.5, 
5.3, 44.9, 58.7, 67.9, 56.9, 3.7, 43.4, 38.6, 34.2, 42.3, 29.1, 
27.6, 44.2, 55.6, 53.7, 48, 58.4, 54, 7.1, 35.9, 36, 61.2, 24.1, 
20.3, 10.8, 13, 69.4, 71.5, 45.6, 34.4, 17.1, 30.1, 68.9, 25.1, 
37.4, 55.5, 58.9, 49.8, 47.2, 29.5, 19.9, 24.1, 27, 33.3, 41.9, 
33.2, 27.9, 48.4, 41.2), V2 = c(33.1, 35.4, 66.2, 1.8, 5, -0.9, 
32.8, 35.8, 36, 4, 65.5, 64, 61, 68.9, 69.3, 59.7, 29.8, 24.4, 
62.7, 12.2, 6, -1.2, 63.5, 7.5, 22.9, 40.5, 47.3, 1.6, -1.5, 
33.3, 53.3, 23.7, 2.7, 61, 2.4, 13.5, 8.1, 55.1, 29.6, 36.8, 
26.8, 26, 30.8, 53.8, 10.6, 1.9, 10.2, 29.1, 51.4, 33.1), V3 = c(0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0), V4 = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 
1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -50L))
y_train <- structure(c(2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 
2L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 
1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L), .Label = c("plus", "minus"), class = "factor")
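
One thing worth ruling out when classProbs = TRUE is involved: caret requires the outcome's factor levels to be valid R variable names, because the probability columns are named after them. Below is a minimal base-R check, using a short stand-in vector with the same two levels as y_train. The helper name levels_are_valid is hypothetical, not part of caret.

```r
# Hypothetical helper: TRUE when every factor level is already a syntactically
# valid R name, i.e. make.names() would leave it unchanged.
levels_are_valid <- function(y) {
  lv <- levels(y)
  all(lv == make.names(lv))
}

# Stand-in outcome with the same levels as y_train in the question
y_demo <- factor(c("minus", "plus", "minus"), levels = c("plus", "minus"))
levels_are_valid(y_demo)

# Numeric-looking levels such as "0"/"1" fail the check
y_bad <- factor(c(0, 1, 1))
levels_are_valid(y_bad)
make.names(levels(y_bad))   # what the levels would need to become
```

The sample data uses the levels "plus" and "minus", which pass this check, so invalid level names are unlikely to be the cause here.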

Maybe you should post this in the Data Science section. That way there's a greater chance that viewers will share your assumptions about what packages are loaded by default. — IRTFM

IRTFM · Accepted Answer · 2019-04-08T01:14:12

Based on the responses to https://stats.stackexchange.com/questions/23763/is-there-a-way-to-disable-the-parameter-tuning-grid-feature-in-caret I tried following the advice to set the trainControl "method" to "none", which allowed successful execution. The second answer implied that random forest methods should not use complicated grids. (I also set the 'mtry' parameter to a single value, but I'm not sure that was necessary.) (I had earlier tried removing the parallel cluster, which had no effect on the errors.) You can add features back now that you have code that doesn't throw errors.

# x/y interface only: no formula argument is needed when x and y are supplied
fit1 <- train(x = X_train[, 2:3],   # columns V1 and V2
              y = factor(y_train),
              method = 'ranger',
              verbose = TRUE,
              trControl = trainControl(method = "none"),  # no resampling
              tuneGrid = expand.grid(mtry = 2,
                                     min.node.size = 1,
                                     splitrule = 'gini'),
              num.trees = 100,
              metric = 'Accuracy',
              importance = 'permutation')
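
As one way of adding features back, the sketch below re-enables classProbs = TRUE on top of method = "none" and asks the fitted model for class probabilities. It has not been run against the asker's full data: toy and y_toy are made-up stand-in data, and fit2 and probs are illustrative names. With method = "none" caret fits a single model and reports no resampled Accuracy, so any performance estimate would have to be computed separately.

```r
library(caret)   # also needs the ranger package installed

set.seed(2019)
# Made-up stand-in data (not the asker's): two numeric predictors, two classes
n <- 200
toy <- data.frame(V1 = runif(n, 0, 70), V2 = runif(n, 0, 70))
y_toy <- factor(ifelse(toy$V1 + rnorm(n, sd = 10) > 35, "plus", "minus"),
                levels = c("plus", "minus"))

fit2 <- train(x = toy,
              y = y_toy,
              method = 'ranger',
              trControl = trainControl(method = "none",   # single fit, no resampling
                                       classProbs = TRUE),
              tuneGrid = expand.grid(mtry = 2,
                                     min.node.size = 1,
                                     splitrule = 'gini'),
              num.trees = 100)

# One column of predicted probabilities per class level
probs <- predict(fit2, newdata = head(toy), type = "prob")
probs
```

If this runs cleanly, other options from the original call (importance, the parallel backend) can be reintroduced one at a time to see which one brings the error back.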
