I am running H2O AutoML on a binary classification dataset with 3000 observations and a 10% default rate. The AUC of the best model is very low (0.6 on the test data). How can I maximize it?
1 Answer
The AutoML algorithm has tried its best on the data you gave it; however, there are a few things you can try:
- You can run the AutoML process for longer than you are currently running it by increasing `max_runtime_secs`.
- It sounds like you have imbalanced data in your binary classification problem (the minority class is 10%), so you could try setting `balance_classes` to True.
- You can do some manual feature engineering on your data to transform existing features or create additional features.
- The best solution is to collect more training data (though that may not be possible).
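
The first two suggestions could be applied roughly as follows. This is a minimal sketch, not a tuned setup: the file paths, the target column name `default`, the one-hour runtime, and the helper names `automl_settings` and `run_automl` are all assumptions made for illustration. `sort_metric="AUC"` is added so the leaderboard is ranked by the metric you care about.

```python
def automl_settings(max_runtime_secs=3600, balance_classes=True, seed=1):
    """Collect the AutoML arguments discussed above in one place."""
    return {
        "max_runtime_secs": max_runtime_secs,  # give AutoML more time to search
        "balance_classes": balance_classes,    # resample to offset the 10% minority class
        "sort_metric": "AUC",                  # rank the leaderboard by AUC
        "seed": seed,                          # fixed seed so runs are easier to compare
    }


def run_automl(train_path, test_path, target="default", **overrides):
    """Train AutoML and return the leader's AUC on the test frame.

    train_path, test_path and target are hypothetical placeholders;
    substitute your own files and column name.
    """
    # Imported here so the settings helper can be used without h2o installed.
    import h2o
    from h2o.automl import H2OAutoML

    h2o.init()
    train = h2o.import_file(train_path)
    test = h2o.import_file(test_path)

    # The target must be a factor for H2O to treat this as classification.
    train[target] = train[target].asfactor()
    test[target] = test[target].asfactor()

    features = [c for c in train.columns if c != target]

    aml = H2OAutoML(**automl_settings(**overrides))
    aml.train(x=features, y=target, training_frame=train)

    # Evaluate the best model on the held-out test data.
    return aml.leader.model_performance(test).auc()
```

For example, `run_automl("train.csv", "test.csv", max_runtime_secs=7200)` would double the assumed runtime. With only 3000 rows, a longer run may not help much, so compare the test AUC with and without `balance_classes` before spending more time.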