I'm performing a gridsearch for GBM in h2o for a continuous outcome with continuous predictors. I'm using cross validation for training and then predict on a test set.
I'm using the function .predict_leaf_node_assignment:
best_gbm.predict_leaf_node_assignment(test_frame_h2o) (where best_gbm is the best gbm model I got from gridsearch)
and get the following table where we can see the leaf node assignments per tree T1, T2, T3 etc.
Question 1:
How can I get the values of T1, T2, T3 etc. per leaf in the below table and not the location of the leaf?
Question 2:
If there is a way to get the values for T1, T2, T3 etc. what do they actually reflect? Is the T1 the first prediction and then T2, T3, T4 are the corrections? Or T1 is the prediction and then T2 is T1 corrected etc.?
Thanks.
Edit: I tried to download mojo in python as explained in this page so that I can look into the different trees. http://docs.h2o.ai/h2o/latest-stable/h2o-docs/productionizing.html?highlight=mojo
In "Step 2: Compile and run the MOJO" the 2nd part of this step is given only in R: "Create your main program in the experiment folder by creating a new file called main.java (for example, using “vim main.java”). Include the following contents. Note that this file references the GBM model created above using R."
Can I do this in python? I have tried to copy for example the command "import java.io.*" in the jupyter notebook but it throws an error (ModuleNotFoundError: No module named 'java').