I tried to use AutoML for a binary classification task with a 100-hour runtime. It appears to just be building a large number of GBM models and never getting to other types (40 built so far).
Is there a way to set the maximum number of GBM models?
There is an order in which AutoML builds the models (the GBMs are first in line). The length of the GBM model-building process depends on how much time you set via `max_runtime_secs`. If you plan to run it for 100 hours, a good portion of that will be spent in the GBM hyperparameter space, so I am not surprised that your first 40 models are GBMs. In other words, this is expected behavior.
If you want variety in your models as they are training, you can run a single AutoML job for a smaller `max_runtime_secs` (say 2 hours), and then run the AutoML process again on that same project (49 more times at 2 hours each, or some combination that adds up to 100 hours). If you use the same `project_name` each time you start an AutoML job, a full new set of models (GBMs, RFs, DNNs, GLMs) should be added to the existing AutoML leaderboard.
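A minimal sketch of that repeated-run approach. The budget arithmetic below is runnable; the H2O calls themselves need a live cluster and a training frame, so they are shown in comments (`H2OAutoML`, `project_name`, `max_runtime_secs`, and `seed` are real parameters, but the project name, frame, and response column are placeholders):

```python
# Sketch: split one 100-hour AutoML budget into 2-hour runs that all
# accumulate onto a single leaderboard via a shared project_name.

def split_budget(total_secs, chunk_secs):
    """Number of AutoML runs needed to spend total_secs in chunk_secs pieces."""
    return total_secs // chunk_secs

TOTAL = 100 * 3600   # 100 hours, in seconds
CHUNK = 2 * 3600     # 2 hours per AutoML run
n_runs = split_budget(TOTAL, CHUNK)  # 50 runs of 2 hours each

# The H2O part (requires a running cluster and a frame `train`):
#
# import h2o
# from h2o.automl import H2OAutoML
# h2o.init()
# for _ in range(n_runs):
#     aml = H2OAutoML(max_runtime_secs=CHUNK,
#                     project_name="my_100h_project",  # same name every run
#                     seed=1)
#     aml.train(y="response", training_frame=train)
# print(aml.leaderboard)  # one leaderboard, accumulated across all runs
```

Because each run goes through the full model sequence, the leaderboard picks up GBMs, RFs, DNNs, and GLMs early instead of spending the first long stretch entirely on GBMs.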
As Erin said, if you run AutoML multiple times with the same `project_name`, the results will accumulate into a single leaderboard, and the hyperparameter searches will accumulate into the same grid objects. However, AutoML will still run through the same sequence of model builds, so it will do a GBM hyperparameter search again before it gets to the DL model builds.
It sounds like your GBM hyperparameter search isn't converging because the `stopping_tolerance` is too small for your dataset. There was a bug in pre-release versions of the bindings that forced the `stopping_tolerance` to 0.001 instead of letting AutoML set it higher when it calculated that such a tight tolerance was inappropriate for a small dataset. Which version of H2O-3 are you using?
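To illustrate why 0.001 can be far too tight, here is a sketch of a size-aware tolerance of roughly `1/sqrt(nrows)`. This is illustrative, not necessarily H2O's exact internal formula; `stopping_tolerance` itself is a real `H2OAutoML` parameter you can set explicitly:

```python
import math

# A size-aware stopping tolerance: roughly 1/sqrt(nrows), so small
# datasets get a looser (larger) tolerance than a fixed 0.001.
# (H2O's exact formula may differ; this is a sketch of the idea.)

def default_stopping_tolerance(nrows):
    return 1.0 / math.sqrt(nrows)

# For a 1,000-row dataset this gives ~0.0316 -- over 30x looser than
# 0.001, so early stopping can actually trigger instead of grinding on.
tol = default_stopping_tolerance(1000)

# Passing it explicitly (real parameter; needs a live H2O cluster):
# aml = H2OAutoML(max_runtime_secs=7200, stopping_tolerance=tol, seed=1)
```

If you are on an affected pre-release version, setting `stopping_tolerance` yourself works around the bug.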
A bit about stopping criteria:
The stopping criteria such as `max_models`, `stopping_rounds`, and `stopping_tolerance` apply to the overall AutoML process as well as to the hyperparameter searches and the individual model builds. At the beginning of the run, `max_runtime_secs` is used to compute the end time for the entire process, and then at each stage the remaining overall time is computed and passed down to the model build or hyperparameter-search subtask.
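The time-budgeting scheme described above can be sketched in a few lines. The names here are illustrative, not H2O internals: the point is that the end time is fixed once from `max_runtime_secs`, and every subtask gets whatever time remains.

```python
import time

# Sketch: a fixed end time is computed once from max_runtime_secs,
# and each subtask (a GBM grid search, then DL builds, ...) receives
# only the remaining budget. Later stages get whatever time is left.

def remaining_secs(end_time):
    return max(0.0, end_time - time.time())

def run_automl(max_runtime_secs, subtasks):
    end_time = time.time() + max_runtime_secs  # computed once, up front
    for task in subtasks:
        budget = remaining_secs(end_time)
        if budget <= 0:
            break  # overall budget exhausted; remaining stages are skipped
        task(budget)  # subtask must finish within its remaining budget
```

This is why a long-running GBM search eats into the time available for everything queued behind it: each later subtask sees a smaller `budget`.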
The `Run Time 558:10:56.131` that you posted is really weird. I don't see that sort of output in the `AutoML.java` code, nor in the Python or R bindings. At first glance it looks like this is coming from outside of H2O... Do you have any sense of what the real time was for this run?
We should be able to figure out what's going on if you do the following: set the `seed` parameter for your AutoML run so that we get repeatable results.