
The following code:

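library(randomForest)
library(rpart)  # provides the car.test.frame data set
z.auto <- randomForest(Mileage ~ Weight,
                       data = car.test.frame,
                       ntree = 1,
                       nodesize = 15)
# labelVar = TRUE shows variable names instead of column indices
tree <- getTree(z.auto, k = 1, labelVar = TRUE)
tree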

Gives this as text output:

   left daughter right daughter split var split point status prediction
1              2              3    Weight      2567.5     -3   24.45000
2              0              0      <NA>         0.0     -1   30.66667
3              4              5    Weight      3087.5     -3   22.37778
4              6              7    Weight      2747.5     -3   24.00000
5              8              9    Weight      3637.5     -3   19.94444
6              0              0      <NA>         0.0     -1   25.20000
7             10             11    Weight      2770.0     -3   23.29412
8              0              0      <NA>         0.0     -1   21.18182
9              0              0      <NA>         0.0     -1   18.00000
10             0              0      <NA>         0.0     -1   22.50000
11             0              0      <NA>         0.0     -1   23.72727

From this data I can see the logic of an individual tree.
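With randomForest itself, the same idea seems to extend to a whole forest by calling getTree() once per tree and stacking the results. A minimal sketch, assuming a forest of 5 trees; the `tree` and `node` columns and the file name `all_trees.csv` are my own additions for illustration:

```r
library(randomForest)
library(rpart)  # provides the car.test.frame data set

set.seed(1)  # tree structure is random; fix the seed for repeatability
fit <- randomForest(Mileage ~ Weight,
                    data = car.test.frame,
                    ntree = 5,
                    nodesize = 15)

# Extract every tree and tag each row with its tree index and node number
all_trees <- do.call(rbind, lapply(seq_len(fit$ntree), function(k) {
  tr <- getTree(fit, k = k, labelVar = TRUE)
  cbind(tree = k, node = seq_len(nrow(tr)), tr)
}))

# One long table covering the whole forest, written as CSV
write.csv(all_trees, "all_trees.csv", row.names = FALSE)
```

The node number is kept as an explicit column because the daughter columns refer to row positions within a single tree, which would otherwise be lost after stacking.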

How do I get the equivalent, much longer table that describes every tree in a random forest built with h2o?

I like h2o because it cleanly uses all the cores and runs at a pretty good clip on my system. It is a nice tool. It is, however, a library separate from R, so I am unsure how to access various parts of my data.

How do I get something like the printed output above, as a CSV file, from an h2o random forest?


1 Answer


H2O doesn't currently have a function that displays a table like that. You can, however, export the random forest model as a POJO (a plain Java source file) using the h2o.download_pojo() function and then inspect the individual rules of each tree manually.
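A rough sketch of that export, assuming a local H2O cluster can be started from R and reusing the car.test.frame data from the question. The settings ntrees = 5 and min_rows = 15 are illustrative choices only (min_rows is used here as an approximate stand-in for nodesize, not an exact equivalent), and the output directory is a temporary one:

```r
library(h2o)
library(rpart)  # provides the car.test.frame data set

h2o.init()  # starts (or connects to) a local H2O cluster

# Copy the R data frame into the H2O cluster
cars_hex <- as.h2o(car.test.frame)

# Train a small random forest on the same formula as the question
rf <- h2o.randomForest(x = "Weight",
                       y = "Mileage",
                       training_frame = cars_hex,
                       ntrees = 5,
                       min_rows = 15)

# Write the model out as Java source into a directory of our choosing
out_dir <- tempdir()
h2o.download_pojo(rf, path = out_dir)

# The generated .java file contains the split rules of every tree
list.files(out_dir, pattern = "\\.java$")
```

The Java file is meant for scoring rather than reading, so turning its rules into a table like the one in the question would still require manual inspection or custom parsing.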

H2O also accepts feature requests.