0
votes

I am training a random forest with the H2O package and have plotted a sample tree for presentation purposes. The prediction value shown at each node is not quite the same as the proportion of positive-class instances among all instances in that node.

I am wondering how H2O calculates this prediction value, and I need a formula to derive it. I know that a random forest averages the predictions of its trees, but how is the prediction calculated at each node of each tree?

Any help would be appreciated.

2
Anyway, by node I mean leaf node 😊 – mohammad

2 Answers

0
votes

See Algorithm 15.1 in The Elements of Statistical Learning:

And then see the code for the implementation of the model training process in H2O-3:

Finally, the best way to understand how the generated model actually produces scores is the genmodel MOJO implementation, which you can find here (try using a Java debugger to single-step through a call to score0()):

0
votes

I found a solution that makes the prediction value in the sample tree equal to the exact positive-class rate of the training data set. You just need to pass the following arguments to h2o.randomForest(): sample_rate = 1, calibrate_model = TRUE, calibration_frame = train
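
To illustrate why sample_rate = 1 can matter here, below is a minimal sketch in plain Python (it does not call H2O). It assumes that a leaf's value is the mean of the 0/1 labels of the rows that particular tree sampled, and that rows are sampled without replacement. The leaf contents and the helper names `leaf_value` and `sampled_leaf_value` are hypothetical, and pretending the same ten rows share a leaf in every tree is a simplification.

```python
import random


def leaf_value(labels):
    """Fraction of positive-class rows among the given 0/1 labels."""
    return sum(labels) / len(labels)


def sampled_leaf_value(labels, sample_rate, rng):
    """Leaf value computed only from the rows one tree sampled.

    Assumption: rows are drawn without replacement at the given rate.
    """
    if sample_rate >= 1.0:
        return leaf_value(labels)
    k = max(1, round(sample_rate * len(labels)))
    return leaf_value(rng.sample(labels, k))


# Hypothetical leaf: 10 training rows, 4 of them positive.
leaf_labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
rng = random.Random(0)

# Proportion over all training rows in the leaf.
full = leaf_value(leaf_labels)  # 0.4

# Values the same rows would get in 5 trees that each see a 63.2% sample.
per_tree = [sampled_leaf_value(leaf_labels, 0.632, rng) for _ in range(5)]

# The forest prediction is the average of the per-tree leaf values.
forest = sum(per_tree) / len(per_tree)

# With sample_rate = 1 every tree sees every row, so the leaf value
# equals the training proportion.
no_sampling = sampled_leaf_value(leaf_labels, 1.0, rng)

print("all rows:", full)
print("per tree:", per_tree)
print("forest average:", forest)
print("sample_rate = 1:", no_sampling)
```

With a 63.2% sample each tree sees only 6 of the 10 rows, so its leaf value is some multiple of 1/6 and cannot equal 0.4. That is one plausible reason the plotted node values differ from the proportions computed over the whole training set.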