
I'm trying to run H2O's K-means algorithm on my data set using H2O 3.16.0.1 with R 3.3.3 as follows:

h2o.kmeans(x=covars, training_frame=mydata.h2o, k=3, fold_column='Fold')

But it results in the error:

ERRR on field: _fold_column: Fold column 'Fold' not found in the training frame

The "Fold" column is indeed in the data. After some experimenting, I found that the error goes away only when "Fold" is added to the covariate list passed as "x=covars", even though it's technically not a covariate. Could this be a bug in the software?


1 Answer


Yes, this is definitely not the expected behavior, so it's a bug. I filed a ticket for this here. As a workaround for now, please include the fold column name in the x vector. Thanks for the report!
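For reference, the workaround might look roughly like the sketch below. It reuses the `covars` and `mydata.h2o` objects and the "Fold" column name from the question; the `x_with_fold` name is only illustrative, and it assumes an H2O cluster is already running with the data loaded.

```r
library(h2o)

# Assumption: covars (character vector of column names) and mydata.h2o
# (an H2OFrame containing a "Fold" column) already exist, as in the question.

# Add the fold column to the x vector; union() avoids listing "Fold" twice
# if it is already present in covars.
x_with_fold <- union(covars, "Fold")

# Same call as in the question, but with the fold column included in x.
model <- h2o.kmeans(x = x_with_fold,
                    training_frame = mydata.h2o,
                    k = 3,
                    fold_column = "Fold")
```

Once the bug is fixed, passing the original `covars` as `x` should work again.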