I tried to use AutoML for a binary classification task with a 100-hour runtime. It appears to just be building a large number of GBM models and never getting to other types (40 built so far).
Is there a way to set the maximum number of GBM models?
There is an order in which AutoML builds the models (the GBMs are first in line). The length of the GBM model-building process depends on how much time you set via `max_runtime_secs`. If you plan to run it for 100 hours, a good portion of that will be spent in the GBM hyperparameter space, so I am not surprised that your first 40 models are GBMs. In other words, this is expected behavior.
If you want variety in your models as they are training, you can run a single AutoML job for a smaller `max_runtime_secs` (say 2 hours), and then run the AutoML process again on that same project (49 more times at 2 hours each, or some combination that adds up to 100 hours). If you use the same `project_name` each time you start an AutoML job, a full new set of models (GBMs, RFs, DNNs, GLMs) should be added to the existing AutoML leaderboard.
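A minimal sketch of that repeated-run approach. The budget arithmetic below is runnable; the H2O calls themselves need a live cluster and a training frame, so they are shown in comments (`H2OAutoML`, `project_name`, `max_runtime_secs`, and `seed` are real parameters, but the project name, frame, and response column are placeholders):

```python
# Sketch: split one 100-hour AutoML budget into 2-hour runs that all
# accumulate onto a single leaderboard via a shared project_name.

def split_budget(total_secs, chunk_secs):
    """Number of AutoML runs needed to spend total_secs in chunk_secs pieces."""
    return total_secs // chunk_secs

TOTAL = 100 * 3600   # 100 hours, in seconds
CHUNK = 2 * 3600     # 2 hours per AutoML run
n_runs = split_budget(TOTAL, CHUNK)  # 50 runs of 2 hours each

# The H2O part (requires a running cluster and a frame `train`):
#
# import h2o
# from h2o.automl import H2OAutoML
# h2o.init()
# for _ in range(n_runs):
#     aml = H2OAutoML(max_runtime_secs=CHUNK,
#                     project_name="my_100h_project",  # same name every run
#                     seed=1)
#     aml.train(y="response", training_frame=train)
# print(aml.leaderboard)  # one leaderboard, accumulated across all runs
```

Because each run goes through the full model sequence, the leaderboard picks up GBMs, RFs, DNNs, and GLMs early instead of spending the first long stretch entirely on GBMs.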
As Erin said, if you run AutoML multiple times with the same `project_name`, the results will accumulate into a single leaderboard, and the hyperparameter searches will accumulate into the same grid objects. However, AutoML will still run through the same sequence of model builds, so it will do a GBM hyperparameter search again before it gets to the DL model builds.
It sounds like your GBM hyperparameter search isn't converging because the `stopping_tolerance` is too small for your dataset. There was a bug in pre-release versions of the bindings that forced the `stopping_tolerance` to 0.001 instead of letting AutoML set it higher when it calculated that such a tight tolerance was inappropriate for a small dataset. Which version of H2O-3 are you using?
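To illustrate why 0.001 can be far too tight, here is a sketch of a size-aware tolerance of roughly `1/sqrt(nrows)`. This is illustrative, not necessarily H2O's exact internal formula; `stopping_tolerance` itself is a real `H2OAutoML` parameter you can set explicitly:

```python
import math

# A size-aware stopping tolerance: roughly 1/sqrt(nrows), so small
# datasets get a looser (larger) tolerance than a fixed 0.001.
# (H2O's exact formula may differ; this is a sketch of the idea.)

def default_stopping_tolerance(nrows):
    return 1.0 / math.sqrt(nrows)

# For a 1,000-row dataset this gives ~0.0316 -- over 30x looser than
# 0.001, so early stopping can actually trigger instead of grinding on.
tol = default_stopping_tolerance(1000)

# Passing it explicitly (real parameter; needs a live H2O cluster):
# aml = H2OAutoML(max_runtime_secs=7200, stopping_tolerance=tol, seed=1)
```

If you are on an affected pre-release version, setting `stopping_tolerance` yourself works around the bug.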
A bit about stopping criteria:
The stopping criteria such as `max_models`, `stopping_rounds`, and `stopping_tolerance` apply to the overall AutoML process as well as to the hyperparameter searches and the individual model builds. At the beginning of the run, `max_runtime_secs` is used to compute the end time for the entire process, and then at each stage the remaining overall time is computed and passed down to the model build or hyperparameter-search subtask.
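The time-budgeting scheme described above can be sketched in a few lines. The names here are illustrative, not H2O internals: the point is that the end time is fixed once from `max_runtime_secs`, and every subtask gets whatever time remains.

```python
import time

# Sketch: a fixed end time is computed once from max_runtime_secs,
# and each subtask (a GBM grid search, then DL builds, ...) receives
# only the remaining budget. Later stages get whatever time is left.

def remaining_secs(end_time):
    return max(0.0, end_time - time.time())

def run_automl(max_runtime_secs, subtasks):
    end_time = time.time() + max_runtime_secs  # computed once, up front
    for task in subtasks:
        budget = remaining_secs(end_time)
        if budget <= 0:
            break  # overall budget exhausted; remaining stages are skipped
        task(budget)  # subtask must finish within its remaining budget
```

This is why a long-running GBM search eats into the time available for everything queued behind it: each later subtask sees a smaller `budget`.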
The `Run Time 558:10:56.131` that you posted is really weird. I don't see that sort of output in the `AutoML.java` code, nor in the Python or R bindings. At first glance it looks like this is coming from outside of H2O... Do you have any sense of what the real time was for this run?
We should be able to figure out what's going on if you do the following: set the `seed` parameter for your AutoML run so that we get repeatable results.