
I am a newbie to H2O DAI, and I think it's wonderful. I've run several experiments with small sample CSV data, and most of the time I see that GLM and GBM are used.

Can we see the full list of all algorithms provided with H2O DAI?

I see the algorithms provided with open-source H2O here, but is the list the same for H2O DAI?

One more question: is there any way I can manually choose which algorithm to use?


1 Answer


Please note that H2O-3 is a separate open-source product and is not the same as H2O.ai's DAI product.

The best way to find the answers to all your questions is to look at the Driverless AI documentation: http://docs.h2o.ai/driverless-ai/latest-stable/docs/userguide/index.html

For your convenience, I will post the answers to your questions here. For anyone coming across this question later on, though, I would highly recommend just looking at the docs, since what I state now could quickly become outdated.

Can we see the full list of all algorithms provided with H2O DAI? (answered in the FAQ)

Which algorithms are used in Driverless AI?

Features are engineered with a proprietary stack of Kaggle-winning statistical approaches including some of the most sophisticated target encoding and likelihood estimates based on groupings, aggregations and joins, but we also employ linear models, neural nets, clustering and dimensionality reduction models and many traditional approaches such as one-hot encoding etc.

On top of the engineered features, sophisticated models are fitted, including, but not limited to: XGBoost (both original XGBoost and 'lossguide' (LightGBM) mode), GLM, TensorFlow (including a TensorFlow NLP recipe based on CNN Deeplearning models), and RuleFit. More will continue to be added later.

In general, GBMs are the best single-shot algorithms. Since 2006, boosting methods have proven to be the most accurate for noisy predictive modeling tasks outside of pattern recognition in images and sound (https://www.cs.cornell.edu/~caruana/ctp/ct.papers/caruana.icml06.pdf). The advent of XGBoost and Kaggle only cemented this position.

Is there any way I can choose which algorithm to use manually? (answered in the Expert Settings section)

To a certain extent, yes: you can select which algorithms you want by using the expert settings described in the documentation linked above.
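If you would rather script this than click through the UI, the idea can be sketched as a set of per-algorithm toggles rendered as TOML-style override lines. This is only a rough illustration: the setting names below (`enable_xgboost`, `enable_glm`, `enable_tensorflow`, `enable_rulefit`) and the `auto`/`on`/`off` values are my assumptions, not something stated in this answer, and the helper `build_config_overrides` is hypothetical. Check the Expert Settings section of the docs for the exact names in your DAI version.

```python
# Assumed setting names and values; verify them against the Expert Settings
# documentation for your Driverless AI version before relying on them.
ALGORITHM_TOGGLES = {
    "enable_xgboost": "on",
    "enable_glm": "on",
    "enable_tensorflow": "off",
    "enable_rulefit": "off",
}

# Assumed set of allowed states for each toggle.
VALID_STATES = {"auto", "on", "off"}


def build_config_overrides(toggles):
    """Render algorithm toggles as TOML-style `key = "value"` lines.

    Hypothetical helper: it only builds a string and does not talk to
    Driverless AI in any way.
    """
    lines = []
    for key, state in sorted(toggles.items()):
        if state not in VALID_STATES:
            raise ValueError(
                f"{key}: expected one of {sorted(VALID_STATES)}, got {state!r}"
            )
        lines.append(f'{key} = "{state}"')
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_config_overrides(ALGORITHM_TOGGLES))
```

Validating the values up front means a typo such as `"of"` fails in the script rather than being passed along as an override.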