I'm new to h2o and I'm having difficulty with this package in R. I'm using a training set and a test set of 5100 and 2300 observations respectively, with 18917 variables and a binary target (0, 1). I ran a random forest:

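# Convert the R data frames to H2O frames
train_h20 <- as.h2o(train)
test_h20 <- as.h2o(test)

# Words holds the predictor columns; column 18918 is the binary target
forest <- h2o.randomForest(x = Words,
                           y = 18918,
                           training_frame = train_h20,
                           ntrees = 250,
                           validation_frame = test_h20,
                           seed = 8675309)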

I know I can plot logloss, MSE, and so on as the number of trees changes. But is there a way to plot an image of the model itself, that is, the final ensembled tree used for the final predictions?

Also, another question: in the randomForest package I could use the varImp function, which returned the class-specific measures (computed as mean decrease in accuracy) as well as the absolute importance. I interpreted these as class-relative measures of variable importance.

[Image: varImp matrix, randomForest package]

In the h2o package I can only find the absolute importance measure. Is there something similar?

1
Best to ask your two questions as two questions. (See meta.stackexchange.com/a/39224/167713 for why.) There is no random forest plot for H2O; I've seen the advice to read the raw POJO output! (Either way, for anything but a toy model you are not going to want to do this.) - Darren Cook

1 Answer

There is no single final tree at the end of a random forest built with the randomForest package in R. To make the final prediction, a random forest uses voting. Voting means that, for any data point, the forest computes a share of votes for each class. For example, for Class 0:

(number of trees that predict the data point as Class 0) / (total number of trees in the forest)

For Class 1 it is the same as for Class 0:

(number of trees that predict the data point as Class 1) / (total number of trees in the forest)

The class with the larger share is the prediction.
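The voting rule can be sketched in a few lines of base R. This is only an illustration: the matrix tree_votes is made-up data (rows are observations, columns are trees), not output from any real forest, and the 0.5 cutoff assumes a plain majority vote on a binary target.

```r
# Hypothetical per-tree class predictions (made-up numbers):
# one row per observation, one column per tree.
tree_votes <- matrix(
  c(0, 1, 1, 0, 1,
    0, 0, 1, 0, 0,
    1, 1, 1, 0, 1),
  nrow = 3, byrow = TRUE
)

# Share of trees voting for each class, per observation.
share_class0 <- rowMeans(tree_votes == 0)
share_class1 <- rowMeans(tree_votes == 1)

# Majority vote: predict Class 1 when more than half of the trees say so.
predicted <- ifelse(share_class1 > 0.5, 1, 0)

print(data.frame(share_class0, share_class1, predicted))
```

Every tree contributes one vote, so there is nothing like a single merged tree to draw; the prediction only exists as this tally.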

However, you can fit and plot a single conditional inference tree with ctree from the party package:

library("party")
x <- ctree(Class ~ ., data = data)
plot(x)
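If the goal is to look inside the forest rather than fit a separate tree, the randomForest package can return the individual trees as tables. The sketch below assumes the randomForest package is installed and uses the built-in iris data purely as a stand-in for the real training set; the seed and the ntree value are arbitrary choices.

```r
library(randomForest)

set.seed(1)  # arbitrary seed, only for repeatability

# Small forest on the built-in iris data (a stand-in for the real data).
rf <- randomForest(Species ~ ., data = iris, ntree = 25)

# Extract the first tree as a data frame: one row per node, with the
# child node ids, split variable, split point, status, and prediction.
first_tree <- getTree(rf, k = 1, labelVar = TRUE)

print(head(first_tree))
```

Each value of k gives a different tree, which is another way to see that the forest has no single final tree to plot.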