I am trying to train and test a random forest regression model. My response variable is numeric, and I have 23 other variables that are a mix of numeric and character types. I am using the following block of code:
library(e1071)
library(dplyr)
library(class)
library(caret)
library(kernlab)
data=read.csv(choose.files())
set.seed(1)
mydata=data
n=dim(mydata)[1]
p=dim(mydata)[2]-1
x=mydata[,-3]
y=mydata[,3]
n_train=35
n_test=9
random_order=sample(n)
test_index=random_order[1:n_test]
train_index=random_order[-(1:n_test)]
y_train=y[train_index]
y_test=y[test_index]
x_train=x[train_index,]
x_test=x[test_index,]
traindata=data.frame(x=x_train,y=(y_train))
testdata = data.frame(x=x_test,y=(y_test))
fitControl <- trainControl(
  ## 10-fold CV
  method = "repeatedcv", classProbs = TRUE,
  number = 10,
  ## repeated ten times
  repeats = 10)
set.seed(1)
newrf = train(y ~ ., data = traindata, method = "rf",
              trControl = fitControl)
newrf
bestmodel_rf= newrf$finalModel
ypredcaret=predict(bestmodel_rf, newdata = testdata)
table(predict=ypredcaret, truth=y_test)
plot(newrf)
bestmodel_rf
I am getting the following warning (the same message appears twice):
Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression
Warning message:
In train.default(x, y, weights = w, ...) :
  cannnot compute class probabilities for regression
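For reference, the warning appears to come from `classProbs=TRUE` in `trainControl`: class probabilities only exist for classification, and the response here is numeric. Below is a minimal sketch of the same workflow with that argument left at its default. The built-in `mtcars` data and its `mpg` column are stand-ins chosen only for illustration (they are not the data from the question), and the fold and repeat counts are reduced so the example runs quickly. It also predicts through the `train` object rather than `finalModel`, and uses `postResample` in place of `table`, since a confusion table is not meaningful for continuous predictions.

```r
library(caret)  # method = "rf" also needs the randomForest package installed

# Stand-in data (assumption): mtcars, with mpg as the numeric response
set.seed(1)
test_index <- sample(nrow(mtcars), 7)
traindata  <- mtcars[-test_index, ]
testdata   <- mtcars[test_index, ]

# classProbs is left at its default (FALSE) because this is regression
fitControl <- trainControl(method = "repeatedcv", number = 5, repeats = 2)

set.seed(1)
newrf <- train(mpg ~ ., data = traindata, method = "rf",
               trControl = fitControl)

# Predict with the train object so the formula handling is reapplied to new data
ypred <- predict(newrf, newdata = testdata)

# Regression metrics such as RMSE and R-squared instead of a confusion table
postResample(pred = ypred, obs = testdata$mpg)
```

With real data, the 10-fold, 10-repeat settings from the question can be kept as they are; only `classProbs` needs to go.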