Prediction value in sample tree made by H2O random forest

Question

I am currently training a random forest with the H2O package and have plotted a sample tree for presentation purposes. The prediction value shown at each node is not quite the same as the proportion of positive-class instances among all instances in that node.

I would like to know how H2O calculates the prediction value, ideally as a formula I can use to derive it. I know that a random forest averages the predictions of its trees, but how is the prediction calculated at each node of each tree?

Any help would be appreciated.

TomKraljevic · Accepted Answer · 2019-03-26T14:26:14

See Algorithm 15.1 in The Elements of Statistical Learning:

https://web.stanford.edu/~hastie/Papers/ESLII.pdf
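
As a rough illustration of the final step of Algorithm 15.1 (not H2O's actual code), the sketch below averages per-tree outputs for regression and takes a majority vote for classification. The function names and the sample numbers are made up for illustration, and `leaf_value` assumes the common regression-tree convention that a leaf predicts the mean response of the training rows that reach it. Because each tree is grown on a bootstrap sample, those rows are generally not the full set of instances that would fall in the node, which is one possible reason a plotted node value differs from the class proportion computed over all instances.

```python
from collections import Counter
from statistics import mean


def leaf_value(responses_in_leaf):
    """Mean response of the (sampled) training rows that reach a leaf.

    Assumption: standard regression-tree convention, not taken from H2O.
    """
    return mean(responses_in_leaf)


def forest_regression(tree_predictions):
    """Regression forest output: average of the per-tree predictions."""
    return mean(tree_predictions)


def forest_majority_vote(tree_classes):
    """Classification forest output: the class predicted by the most trees."""
    return Counter(tree_classes).most_common(1)[0][0]


# Hypothetical outputs of three trees for one observation.
per_tree = [0.2, 0.4, 0.9]
print(forest_regression(per_tree))
print(forest_majority_vote([1, 0, 1]))
```

To see exactly which value H2O stores in each node, the source files linked in this answer remain the authoritative reference.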

And then see the code for the implementation of the model training process in H2O-3:

https://github.com/h2oai/h2o-3/blob/master/h2o-algos/src/main/java/hex/tree/drf/DRF.java

Finally, the best way to understand how the generated model is used to produce scores is to read the genmodel MOJO implementation, which you can find here (try using a Java debugger to single-step through a call to score0()):

https://github.com/h2oai/h2o-3/blob/master/h2o-genmodel/src/main/java/hex/genmodel/algos/drf/DrfMojoModel.java
