0
votes

I am running H2O AutoML on a binary classification data set with 3000 observations and a 10% default rate. The AUC of the best model is very low (0.6 on the test data). How can I improve it?

1
With 2 computers - Wonka
Adding more CPUs will not improve model performance. - Erin LeDell
@Adata If you found my answer sufficient, can you please accept the answer below? - Erin LeDell

1 Answer

3
votes

The AutoML algorithm has tried its best on the data you gave it; however, there are a few things you can try:

  • You can run the AutoML process for longer than you are currently running it by increasing max_runtime_secs.
  • It sounds like you have imbalanced data in your binary classification problem (the minority class is 10%), so you could try setting balance_classes to True.
  • You can do some manual feature engineering on your data to transform existing features or create additional features.
  • The best solution is to collect more training data (though that may not be possible).
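
The first two suggestions can be sketched in Python as follows. This is a minimal sketch, not a tuned setup: the helper names `build_automl_kwargs` and `run_automl`, the CSV path, the target column name, the one-hour runtime, and the 80/20 split are all assumptions for illustration. The `h2o` import is kept inside `run_automl` so the settings can be inspected without starting a cluster.

```python
def build_automl_kwargs(max_runtime_secs=3600, seed=1):
    """Collect the H2OAutoML settings discussed above (values are illustrative)."""
    return {
        "max_runtime_secs": max_runtime_secs,  # let AutoML search for longer
        "balance_classes": True,               # resample the 10% minority class
        "sort_metric": "AUC",                  # rank the leaderboard by AUC
        "seed": seed,
    }


def run_automl(csv_path, target, **overrides):
    """Train AutoML on a CSV file and return (leaderboard, test AUC).

    csv_path and target are hypothetical inputs supplied by the caller.
    """
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    frame = h2o.import_file(csv_path)
    # The target must be a factor for H2O to treat this as classification.
    frame[target] = frame[target].asfactor()
    train, test = frame.split_frame(ratios=[0.8], seed=1)

    aml = H2OAutoML(**{**build_automl_kwargs(), **overrides})
    aml.train(y=target, training_frame=train)

    # Score the leader on the held-out split.
    test_auc = aml.leader.model_performance(test).auc()
    return aml.leaderboard, test_auc
```

A call might look like `run_automl("loans.csv", "default", max_runtime_secs=7200)`, where both the file name and the column name are placeholders for your own data.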