I want to apply a wrapper method like Recursive Feature Elimination (RFE) to my regression problem with scikit-learn. The example "Recursive feature elimination with cross-validation" gives a good overview of how to tune the number of features automatically.
I tried this:
from sklearn.linear_model import LogisticRegression
from sklearn.feature_selection import RFECV
import matplotlib.pyplot as plt

# df_normdf (normalized features) and y_train (targets) are defined earlier
modelX = LogisticRegression()
rfecv = RFECV(estimator=modelX, step=1, scoring='mean_absolute_error')
rfecv.fit(df_normdf, y_train)
print("Optimal number of features : %d" % rfecv.n_features_)

# Plot number of features VS. cross-validation scores
plt.figure()
plt.xlabel("Number of features selected")
plt.ylabel("Cross validation score (nb of correct classifications)")
plt.plot(range(1, len(rfecv.grid_scores_) + 1), rfecv.grid_scores_)
plt.show()
but I receive a warning like this:
The least populated class in y has only 1 members, which is too few.
The minimum number of labels for any class cannot be less than n_folds=3.
The warning sounds like I have a classification problem, but my task is a regression problem. What's wrong, and what can I do to get a result?
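For reference, here is a minimal sketch of what I assume the regression variant would look like; LinearRegression, KFold, and the 'neg_mean_absolute_error' scorer are guesses on my part, not something I have verified for my data. Is this the right direction?

from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import RFECV
from sklearn.model_selection import KFold

# Assumption: a regressor instead of LogisticRegression (a classifier)
model = LinearRegression()
# Assumption: plain KFold avoids the stratified splitting that needs
# enough samples per class, and the negated MAE scorer replaces the
# deprecated 'mean_absolute_error' string
rfecv = RFECV(estimator=model, step=1, cv=KFold(n_splits=3),
              scoring='neg_mean_absolute_error')
rfecv.fit(df_normdf, y_train)
print("Optimal number of features : %d" % rfecv.n_features_)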
What does y_train contain? – MMF