I want to incorporate the function confusionMatrix() from the caret package into the code that builds shuffle100, to produce confusion matrices from subsets (data frames) of a master list produced from classification tree models. My aim is to produce confusion matrix statistics such as classification accuracy, the kappa metric, etc. (desired output below). I am sorry to ask such a simple question, but I cannot figure this out. If anyone can help, many thanks in advance.
Reproducible dummy data can be found at this address:
Code to produce a nested list of classification tree model predictions and confusion matrices
library(caret)
library(e1071)
library(rpart)
set.seed(1235)
shuffle100 <- lapply(seq(10), function(n) { # produce 10 different shuffled data frames
  subset <- my_data[sample(nrow(my_data), 80), ] # shuffle 80 rows in the data frame
  subset_idx <- sample(1:nrow(subset), replace = FALSE)
  subset <- subset[subset_idx, ]
  # partition the data frame into 70 % training and 30 % test subsets
  subset_resampled_idx <- createDataPartition(subset_idx, times = 1, p = 0.7, list = FALSE)
  subset_resampled <- subset[subset_resampled_idx, ] # 70 % training data
  ct_mod <- rpart(Family ~ ., data = subset_resampled, method = "class",
                  control = rpart.control(cp = 0.005)) # 10 ct models
  ct_pred <- predict(ct_mod, newdata = subset[, 2:13])
  confusionMatrix(ct_pred, norm$Family) # 10 confusion matrices
})
Error messages
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
Called from: sort.list(y)
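The exact cause cannot be confirmed without the data, but two things in the final lines of the function look suspect. For a classification tree, predict() on an rpart model returns a matrix of class probabilities unless type = "class" is given, whereas confusionMatrix() expects a factor of predicted classes. In addition, norm is never defined in the code shown, so norm$Family may not be the reference labels that belong to the predicted rows. A minimal sketch of one way to line these up follows. It assumes Family is a two-level factor, and the my_data built here is a hypothetical stand-in for the real data, not the dummy data referred to in the question.

```r
library(caret)
library(e1071)
library(rpart)

set.seed(1235)

# Hypothetical stand-in for my_data: a two-level factor Family
# plus 12 numeric predictors (random noise, for illustration only)
my_data <- data.frame(
  Family = factor(sample(c("G8", "V4"), 200, replace = TRUE)),
  matrix(rnorm(200 * 12), ncol = 12)
)

shuffle100 <- lapply(seq(10), function(n) {
  # draw 80 rows at random (sampling without replacement already shuffles them)
  subset <- my_data[sample(nrow(my_data), 80), ]

  # split on the outcome so both classes appear in the training and test sets
  train_idx <- createDataPartition(subset$Family, times = 1, p = 0.7, list = FALSE)
  train <- subset[train_idx, ]
  test <- subset[-train_idx, ]

  ct_mod <- rpart(Family ~ ., data = train, method = "class",
                  control = rpart.control(cp = 0.005))

  # type = "class" returns a factor of predicted labels, not probabilities
  ct_pred <- predict(ct_mod, newdata = test, type = "class")

  # compare the predictions with the true labels of the same rows
  confusionMatrix(ct_pred, test$Family)
})

shuffle100[[1]]
```

This sketch evaluates each tree on the held-out 30 % only. To tabulate all 80 rows instead, as the desired outcome suggests, the last two calls could use newdata = subset and subset$Family, at the cost of scoring the model partly on its own training rows.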
Desired outcome
Confusion Matrix and Statistics

          Reference
Prediction G8 V4
        G8 42 12
        V4  8 18

               Accuracy : 0.75
                 95% CI : (0.6406, 0.8401)
    No Information Rate : 0.625
    P-Value [Acc > NIR] : 0.01244

                  Kappa : 0.4521
 Mcnemar's Test P-Value : 0.50233

            Sensitivity : 0.8400
            Specificity : 0.6000
         Pos Pred Value : 0.7778
         Neg Pred Value : 0.6923
             Prevalence : 0.6250
         Detection Rate : 0.5250
   Detection Prevalence : 0.6750
      Balanced Accuracy : 0.7200

       'Positive' Class : G8
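Once every list element is a confusionMatrix object, individual statistics such as accuracy and kappa can be read from its overall and byClass components. The sketch below rebuilds the desired table above by hand, purely for illustration, and then collects the statistics across a list. Here cm_list is a hypothetical stand-in for a list of confusion matrices such as the one the lapply() call is meant to return.

```r
library(caret)
library(e1071)

# The 2 x 2 counts from the desired outcome, entered column by column
tab <- as.table(matrix(
  c(42, 8, 12, 18), nrow = 2,
  dimnames = list(Prediction = c("G8", "V4"), Reference = c("G8", "V4"))
))

cm <- confusionMatrix(tab)

# Single statistics from one confusion matrix
cm$overall[["Accuracy"]]
cm$overall[["Kappa"]]
cm$byClass[["Sensitivity"]]

# Hypothetical stand-in for a list of confusion matrices
cm_list <- list(cm, cm)

# One row per confusion matrix, one column per statistic
stats <- t(sapply(cm_list, function(x) x$overall[c("Accuracy", "Kappa")]))
stats
```

Collecting the values into a matrix like this makes it straightforward to summarise them afterwards, for example with colMeans(stats).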