
I am using random forest in H2O, but I don't understand the meaning of the parameters in the returned result. This is my original data: [image: original data]

I expected to see results like this (I set the number of trees to 3 and the response column to "Play"):

tree1:
Wind = false: yes {no=0, yes=6}
Wind = true
|   Temperature > 77.500: no {no=2, yes=0}
|   Temperature ≤ 77.500: yes {no=1, yes=5}

tree2:
Humidity > 92.500: no {no=3, yes=0}
Humidity ≤ 92.500: yes {no=2, yes=9}

tree3:
Wind = false: yes {no=0, yes=6}
Wind = true
|   Temperature > 77.500: no {no=2, yes=0}
|   Temperature ≤ 77.500: yes {no=1, yes=5}

Instead, I got a model that contains many parameters but no such results. This is my code and the result I got:

    DRFParametersV3 drfParams = new DRFParametersV3();
    drfParams.trainingFrame = H2oApi.stringToFrameKey("train");
    drfParams.validationFrame = H2oApi.stringToFrameKey("test");
    drfParams.ntrees=3;
    System.out.println("drfParams: " + drfParams);

    ColSpecifierV3 responseColumn = new ColSpecifierV3();
    responseColumn.columnName = ATT_LABEL_GOLF;
    drfParams.responseColumn = responseColumn;
    System.out.println("About to train DRF. . .");

    DRFV3 drfBody = h2o.train_drf(drfParams);
    System.out.println("drfBody: " + drfBody);

    JobV3 job = h2o.waitForJobCompletion(drfBody.job.key);
    System.out.println("DRF build done.");

    ModelKeyV3 modelKey = (ModelKeyV3)job.dest;
    ModelsV3 models = h2o.model(modelKey);
    System.out.println("models: " + models);
    System.out.println("models size: " + models.models.length);

    DRFModelV3 model = (DRFModelV3)models.models[0];
    System.out.println("new DRF model: " + model);

The resulting "DRFModelV3" is very confusing. Where is the "forest" built by H2O? [image: DRFModelV3 output]

This question is very similar to this one: stackoverflow.com/questions/37017165/…. You can also take a look at this blog post: aichamp.wordpress.com/2017/09/27/… - Lauren
But I don't need to plot the tree, so I don't want to use a plotting utility class from H2O. I need to build the "tree" in Java myself, so I need the DRF result data generated by H2O. I still don't understand how to get the actual tree data out of the "model" in H2O. - liyuhui
Do you have an example in Java? - liyuhui

1 Answer


One option is to download the MOJO, load it, and call the function _computeGraph on the MOJO object. Take a look at the H2O GitHub repo to learn from the code.

Please also take a look at the H2O documentation on POJOs and MOJOs.

Here is some additional code that might help: https://github.com/h2oai/h2o-3/blob/43f8ab952a69a8bc9484bd0ffac909b6e3e820ca/h2o-algos/src/test/java/hex/XValPredictionsCheck.java#L59-L69
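
Since the question asks for a tree built in Java rather than a plot, here is a minimal sketch of a structure that could hold the split data and print it in the format shown in the question. It does not call H2O at all: `TreeSketch`, `Node`, `leaf`, `split` and `render` are hypothetical names invented for illustration, and the conditions and counts are copied by hand from the expected output for tree1, not produced by H2O. Filling the nodes from the graph you get out of the MOJO is left out, because the exact fields depend on the H2O version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the H2O API.
final class TreeSketch {

    /** One node: a branch condition plus either children or a leaf prediction. */
    static final class Node {
        final String condition;   // e.g. "Wind = true"; null for the root
        final String prediction;  // predicted class for a leaf, null otherwise
        final int no;             // rows with Play = no (meaningful for leaves only)
        final int yes;            // rows with Play = yes (meaningful for leaves only)
        final List<Node> children = new ArrayList<>();

        Node(String condition, String prediction, int no, int yes) {
            this.condition = condition;
            this.prediction = prediction;
            this.no = no;
            this.yes = yes;
        }
    }

    /** Creates a leaf reached via the given condition. */
    static Node leaf(String condition, String prediction, int no, int yes) {
        return new Node(condition, prediction, no, yes);
    }

    /** Creates an inner node; pass null as the condition for the root. */
    static Node split(String condition, Node... children) {
        Node node = new Node(condition, null, 0, 0);
        for (Node child : children) {
            node.children.add(child);
        }
        return node;
    }

    /** Renders the branches below the root, one line per branch. */
    static String render(Node root) {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    private static void render(Node node, int depth, StringBuilder out) {
        for (Node child : node.children) {
            for (int i = 0; i < depth; i++) {
                out.append("|   ");
            }
            out.append(child.condition);
            if (child.children.isEmpty()) {
                // Leaf: append the predicted class and the class counts.
                out.append(": ").append(child.prediction)
                   .append(" {no=").append(child.no)
                   .append(", yes=").append(child.yes).append("}");
            }
            out.append('\n');
            render(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Values copied by hand from the question's expected tree1.
        Node tree1 = split(null,
            leaf("Wind = false", "yes", 0, 6),
            split("Wind = true",
                leaf("Temperature > 77.500", "no", 2, 0),
                leaf("Temperature <= 77.500", "yes", 1, 5)));
        System.out.println("tree1:");
        System.out.print(render(tree1));
    }
}
```

Keeping the structure separate from the H2O objects means the printing code stays the same whichever way you end up extracting the splits.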