I'm using xgboost within H2O for a binary classification task. The dataset has several categorical features, to which the model applies a one-hot encoding during training.

Now I want to use SHAP (https://github.com/slundberg/shap) to locally interpret the predictions. For this, it would be nice to have the dataframe with the one-hot encoded columns and values. However, I can't seem to find a way to get this from the H2O model.

I could probably recreate the one-hot encoding manually, but maybe someone knows a quicker solution?

1 Answer

We have had a ticket open for this for a while, and we will be revisiting it soon due to increased demand for this feature. For now, you will have to convert your H2OFrame to a Pandas DataFrame using the as_data_frame() method and then apply one of the following solutions.
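
As a rough illustration of the manual route (not necessarily one of the solutions meant above), here is a minimal sketch that one-hot encodes a Pandas DataFrame with pandas.get_dummies. The toy data, the column names, the "." separator, and the one_hot_encode helper are all made up for the example. In practice the DataFrame would come from as_data_frame(), and the resulting column names and order are not guaranteed to match the encoding H2O applies internally.

```python
import pandas as pd


def one_hot_encode(df, categorical_cols):
    """Return a copy of df with the given columns expanded into 0/1 indicator columns.

    Hypothetical helper for illustration; it does not reproduce H2O's internal encoding.
    """
    return pd.get_dummies(df, columns=categorical_cols, prefix_sep=".", dtype=int)


# Stand-in for the converted frame; in practice: df = h2o_frame.as_data_frame()
df = pd.DataFrame({
    "age": [25, 40, 31],
    "color": ["red", "blue", "red"],
})

encoded = one_hot_encode(df, ["color"])
print(encoded)
```

Numeric columns pass through unchanged, and each listed categorical column becomes one indicator column per observed level. If the encoded frame is meant to line up with the model's features, the column layout would still need to be checked against the model.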