I need to do an ANOVA on a random forest model. The code I use for glm and gam models doesn't work with my rf model. What code should I use to make it work?

I am using the sdm package in R to build my rf model. The model runs fine, but I can't call the anova function on the results.

MRF <- sdm(presence ~ ., data = dM, methods = 'rf', replication = 'sub', test.percent = 20)

anova(MRF)

Error in UseMethod("anova") : no applicable method for 'anova' applied to an object of class "sdmModels"

I have also tried extracting the underlying randomForest object and calling anova on that:

m <- MRF@models$presence$rf$`1`@object

anova(m)

Error in UseMethod("anova") : no applicable method for 'anova' applied to an object of class "c('randomForest.formula', 'randomForest')"

1 Answer

There is no anova method for random forest objects, which is why both calls fail. You can calculate variable importance instead to see which predictors matter most, similar to what a Type II ANOVA tells you for a logistic regression model:

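library(caret)

# Note: with classProbs = TRUE, presence must be a two-level factor whose
# levels are valid R names (e.g. "absent"/"present"), not 0/1.
trControl <- trainControl(method = "cv",
                          number = 5,  # 5-fold cross-validation
                          search = "grid",
                          classProbs = TRUE,
                          savePredictions = TRUE,
                          summaryFunction = twoClassSummary)

# mtry should not exceed the number of predictors in dM; adjust as needed
tuneGrid <- expand.grid(mtry = 18)

rf_model <- train(presence ~ .,
                  data = dM,
                  method = "rf",
                  metric = "ROC",        # the metric twoClassSummary reports
                  tuneGrid = tuneGrid,   # the grid defined above
                  trControl = trControl,
                  importance = TRUE,     # passed on to randomForest()
                  maxnodes = 5,
                  nodesize = 14,
                  ntree = 100)

# Importance of each predictor, scaled to 0-100 by default
varImp(rf_model)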
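
If you would rather not refit with caret, the randomForest package has its own importance() function. The sketch below is only an illustration under some assumptions: dM_demo and its columns (temp, rain, noise) are made-up stand-ins for your dM, and the forest is fitted directly with randomForest() rather than through sdm. The same importance() call should also work on the m object you extracted from the sdm fit, although which measures it returns depends on how sdm fitted the forest (permutation importance is only available if the forest was grown with importance = TRUE).

```r
library(randomForest)

set.seed(42)
n <- 200

# Hypothetical stand-in for dM: two informative predictors and one pure-noise predictor
dM_demo <- data.frame(
  temp  = rnorm(n),
  rain  = rnorm(n),
  noise = rnorm(n)
)
dM_demo$presence <- factor(
  ifelse(dM_demo$temp + 0.5 * dM_demo$rain + rnorm(n, sd = 0.5) > 0,
         "present", "absent")
)

# importance = TRUE asks randomForest to compute permutation importance as well
rf_demo <- randomForest(presence ~ ., data = dM_demo,
                        ntree = 200, importance = TRUE)

# One row per predictor; for classification the columns are the class levels,
# MeanDecreaseAccuracy (permutation) and MeanDecreaseGini
imp <- importance(rf_demo)
print(imp[order(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE), ])

# On the object from the question (not run here):
# importance(m)   # m <- MRF@models$presence$rf$`1`@object
```

Unlike an ANOVA table, these scores rank predictors but do not come with p-values, so treat them as a relative ordering rather than a significance test.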