I am an R beginner and I have to do 5- or 10-fold cross-validation for a random forest model. My problem is that I have to do the CV manually, not with a package. What I want to do is:

1. Build k folds from my training data.
2. Choose my tuning parameters, for example trees = c(200, 400, 600).
3. Fit my model on k-1 folds and predict on the holdout (validation) fold.
4. Evaluate the predictions on the holdout fold and save the value.
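Step 1 can be sketched in base R as below. This is only an illustration under assumptions: it uses the built-in `iris` data, `k = 5`, and a fixed seed, and the names `fold_id`, `train_idx`, and `valid_idx` are made up for the example.

```r
set.seed(1)  # assumption: fixed seed so the fold assignment is reproducible

k <- 5
n <- nrow(iris)

# Repeat the labels 1..k up to length n, then shuffle them, so every fold
# gets (almost) the same number of rows
fold_id <- sample(rep(seq_len(k), length.out = n))

for (i in seq_len(k)) {
  train_idx <- which(fold_id != i)  # rows in the k-1 training folds
  valid_idx <- which(fold_id == i)  # rows in the held-out fold
  cat("fold", i, ": train =", length(train_idx),
      ", validation =", length(valid_idx), "\n")
}
```

Shuffling a repeated `1..k` vector, rather than drawing fold labels with `replace = TRUE`, keeps the fold sizes balanced.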
My evaluation metric should be AUC. I understand the theory, but I have trouble implementing it in R. Do you have any ideas for my code? Thanks so much!
It is a classification problem, so as an alternative I think the iris data set would work here too.
I am stuck because I don't see how to fit the model on k-1 folds and predict the values on each validation set. Do I set i = 1, i = 2, and so on? This is what I have so far, and I am not sure it is right:
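For a binary outcome, the AUC of one fold can be computed without any package through the rank-sum (Mann-Whitney) identity. The helper below is a hypothetical sketch, not part of any library; it assumes `labels` is a 0/1 vector and `scores` holds the predicted probability of class 1.

```r
# Hypothetical helper: AUC from the rank-sum (Mann-Whitney) identity.
# labels: 0/1 vector (1 = positive class)
# scores: predicted probability (or any score) for the positive class
auc_manual <- function(labels, scores) {
  r <- rank(scores)            # average ranks, so tied scores count as 1/2
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Small worked example: 3 of the 4 positive/negative pairs are ordered correctly
auc_manual(c(0, 0, 1, 1), c(0.1, 0.4, 0.35, 0.8))  # 0.75
```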
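```r
library(randomForest)
library(verification)  # provides roc.area()
set.seed(42)
# AUC needs a binary outcome, so keep two of the three iris species
training <- droplevels(subset(iris, Species != "setosa"))
training.x <- training[, 1:4]
training.y <- training[, 5]
folds <- sample(rep(1:5, length.out = nrow(training)))  # balanced fold labels
# randomForest() has no tuneGrid argument; mtry cannot exceed the 4 predictors
myGrid <- expand.grid(ntree = c(500, 1000, 2000), mtry = c(2, 4))
err.vect <- numeric(5)
for (i in 1:5) {
  # Fit on the k-1 training folds, using the first grid row only
  newrf <- randomForest(x = training.x[folds != i, ], y = training.y[folds != i],
                        ntree = myGrid$ntree[1], mtry = myGrid$mtry[1])
  # Class probabilities for the held-out fold; column 2 is the second level
  new.pr <- predict(newrf, training.x[folds == i, ], type = "prob")[, 2]
  # roc.area() takes 0/1 observations and returns the AUC in $A
  err.vect[i] <- roc.area(as.numeric(training.y[folds == i]) - 1, new.pr)$A
  print(paste("AUC for fold", i, ":", err.vect[i]))
}
```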
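To cover the whole tuning grid, one option is an outer loop over the grid rows around the fold loop, averaging the fold AUCs per parameter combination. The sketch below is one possible arrangement under assumptions: it keeps only two iris species so that AUC is defined, uses the `ntree` values 200/400/600 with `mtry` 2 and 4, and relies on a hypothetical rank-based `auc_manual` helper instead of a package function.

```r
library(randomForest)

set.seed(42)
# Assumption: two iris species only, so the outcome is binary
dat <- droplevels(subset(iris, Species != "setosa"))
x <- dat[, 1:4]
y <- dat$Species

k <- 5
fold_id <- sample(rep(seq_len(k), length.out = nrow(dat)))

# Hypothetical helper: rank-based AUC; labels are 0/1, scores are P(class 1)
auc_manual <- function(labels, scores) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

grid <- expand.grid(ntree = c(200, 400, 600), mtry = c(2, 4))
grid$mean_auc <- NA_real_

for (g in seq_len(nrow(grid))) {        # one pass per parameter combination
  fold_auc <- numeric(k)
  for (i in seq_len(k)) {               # one pass per held-out fold
    train <- fold_id != i
    fit <- randomForest(x = x[train, ], y = y[train],
                        ntree = grid$ntree[g], mtry = grid$mtry[g])
    # Probability of the second factor level on the held-out rows
    prob <- predict(fit, x[!train, ], type = "prob")[, 2]
    fold_auc[i] <- auc_manual(as.numeric(y[!train]) - 1, prob)
  }
  grid$mean_auc[g] <- mean(fold_auc)    # cross-validated AUC for this row
}

print(grid)
best <- grid[which.max(grid$mean_auc), ]  # combination with the highest mean AUC
```

Using the same `fold_id` for every grid row means all parameter combinations are compared on identical splits.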
[...] `mtcars`), and try to explain where you're stuck. – Gregor Thomas