I am trying to train a random forest model on my training data, but I get an error.

The error message is as follows:

Error in train.default(x, y, weights = w, ...) : 
At least one of the class levels is not a valid R variable name; This will cause errors when class probabilities
are generated because the variables names will be converted to  X1, X2, X3, X4, X5, X6, X7 . Please use factor 
levels that can be used as valid R variable names  (see ?make.names for help).

Below is my code:

rf.ctrl <- trainControl(method = "repeatedcv",
                    number = 10,
                    repeats = 10,
                    classProbs = TRUE,
                    summaryFunction = twoClassSummary)


set.seed(256)

# train the classification model with random forest
rf.model <- train(as.factor(response) ~ .,data = trainvals,
              method = "rf",
              trControl = rf.ctrl,
              tuneLength = 10,
              metric = "ROC")

The structure of trainvals is: [screenshot not included]

The class levels of response are 1, 2, 3, 4, 5, 6, and 7.

1 Answer

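The error comes from the levels of the outcome, not from the types of the other columns. With classProbs = TRUE, caret uses the class levels as column names for the predicted probabilities, and 1 to 7 are not valid R variable names. Convert response to a factor whose levels are valid names, for example with make.names, and leave the numeric predictors as they are. Note also that twoClassSummary only supports two-class outcomes, so a seven-level response will need a multiclass summary function such as multiClassSummary.

trainvals$response <- factor(make.names(trainvals$response))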
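
The error message itself refers to the factor levels of the outcome, so it can help to check them directly. Below is a minimal base-R sketch on made-up data. The data frame toy and its column x1 are hypothetical stand-ins, not the asker's trainvals. It shows that the levels 1 to 7 are not valid R names, and that make.names turns them into valid ones while the predictor stays numeric.

```r
# Toy stand-in for trainvals (hypothetical data, not the original)
toy <- data.frame(
  response = c(1, 2, 3, 4, 5, 6, 7, 1, 2, 3),
  x1 = c(0.5, 1.2, 3.3, 2.1, 0.9, 4.4, 2.8, 1.1, 0.3, 3.9)
)

# A level is a valid R name only if make.names() leaves it unchanged
old_levels <- levels(factor(toy$response))
print(old_levels == make.names(old_levels))  # FALSE for every level

# Recode only the outcome; the numeric predictor is left untouched
toy$response <- factor(make.names(toy$response))
print(levels(toy$response))  # "X1" "X2" "X3" "X4" "X5" "X6" "X7"
```

Once the outcome is recoded this way, the formula passed to train can presumably be written as response ~ . instead of as.factor(response) ~ ., since response is already a factor.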