I was trying to use xgboost to classify the iris data, but I get this error:
"Error in frankv(predicted) : x is a list, 'cols' can not be 0-length In addition: Warning message: In train.default(x_train, y_train, trControl = ctrl, tuneGrid = xgbgrid, : cannnot compute class probabilities for regression"
I am using the following code. Any help or explanation will be highly appreciated.
data(iris)
library(caret)
library(dplyr)
library(xgboost)
set.seed(123)
index <- createDataPartition(iris$Species, p=0.8, list = FALSE)
trainData <- iris[index,]
testData <- iris[-index,]
x_train = xgb.DMatrix(as.matrix(trainData %>% select(-Species)))
y_train = as.numeric(trainData$Species)
#### Generic control parameters
ctrl <- trainControl(method="repeatedcv",
                     number=10,
                     repeats=5,
                     savePredictions=TRUE,
                     classProbs=TRUE,
                     summaryFunction = twoClassSummary)
xgbgrid <- expand.grid(nrounds = 10,
                       max_depth = 5,
                       eta = 0.05,
                       gamma = 0.01,
                       colsample_bytree = 0.75,
                       min_child_weight = 0,
                       subsample = 0.5,
                       objective = "binary:logitraw",
                       eval_metric = "error")
set.seed(123)
xgb_model = train(x_train,
                  y_train,
                  trControl = ctrl,
                  tuneGrid = xgbgrid,
                  method = "xgbTree")
The first problem is y_train = as.numeric(trainData$Species): the outcome should be a factor, not numeric. Also, using the twoClassSummary function will not be appropriate since Species has three levels. Use multiClassSummary instead. Fix these two and you're good to go. Functions in this comment may not be in the correct case (lower/upper). – NelsonGon
Use as.factor, not as.factor(as.numeric()), although Species is already a factor in the iris data set, negating the need for that. I ran it without issues. I didn't use your tune grid and also stopped the training as it would take a lot of time, but it was going to work anyway. – NelsonGon
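For reference, here is a minimal sketch of the two fixes suggested in the comments, applied to the code from the question. It is untested against any particular caret/xgboost version, so treat it as a starting point. Two things in it are my own assumptions and not from the comments: the predictors are passed as plain numeric matrices instead of an xgb.DMatrix, and objective and eval_metric are dropped from the grid, since as far as I know caret expects tuneGrid to hold only the seven xgbTree tuning parameters and picks a multiclass objective itself when the outcome is a factor with more than two levels. multiClassSummary may also need the MLmetrics package installed when classProbs = TRUE.

```r
library(caret)
library(xgboost)

data(iris)
set.seed(123)
index <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
trainData <- iris[index, ]
testData  <- iris[-index, ]

# Assumption: plain numeric matrices instead of xgb.DMatrix, to keep
# caret's resampling simple.
predictors <- setdiff(names(iris), "Species")
x_train <- as.matrix(trainData[, predictors])
x_test  <- as.matrix(testData[, predictors])

# Fix 1: keep the outcome as a factor so caret treats this as classification.
y_train <- trainData$Species

# Fix 2: Species has three levels, so use multiClassSummary
# instead of twoClassSummary.
ctrl <- trainControl(method = "repeatedcv",
                     number = 10,
                     repeats = 5,
                     savePredictions = TRUE,
                     classProbs = TRUE,
                     summaryFunction = multiClassSummary)

# Assumption: only the xgbTree tuning parameters go in the grid;
# objective and eval_metric from the question are left out.
xgbgrid <- expand.grid(nrounds = 10,
                       max_depth = 5,
                       eta = 0.05,
                       gamma = 0.01,
                       colsample_bytree = 0.75,
                       min_child_weight = 0,
                       subsample = 0.5)

set.seed(123)
xgb_model <- train(x_train,
                   y_train,
                   trControl = ctrl,
                   tuneGrid = xgbgrid,
                   method = "xgbTree")

# Predict on the held-out rows and compare with the true species.
pred <- predict(xgb_model, newdata = x_test)
confusionMatrix(pred, testData$Species)
```

With a single-row grid like this one, the 10 x 5 repeated CV should finish quickly; the long run time mentioned in the comments comes from letting caret search its default grid.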