4 votes

I am using sklearn's randomized regression models, such as RandomizedLogisticRegression. Because randomized logistic regression uses an L1 penalty, it requires setting the regularization parameter C (or alpha in the Lasso case).

To find a good value for C, I usually use a simple GridSearchCV like the one below.

But RandomizedLogisticRegression() does not support GridSearchCV, because it performs bootstrapping internally. Instead, I tried using a plain LogisticRegression with GridSearchCV:

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

params = {'C': [0.1, 1, 10]}
logi = LogisticRegression(penalty='l1')   # L1 penalty, as in the randomized version
clf = GridSearchCV(logi, params, cv=10)
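
To be concrete, this is roughly how I read off the chosen C; X and y here stand for my feature matrix and labels, which I have not shown:

clf.fit(X, y)                       # X, y: training data (assumed, not shown)
best_C = clf.best_params_['C']      # the C value GridSearchCV selected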

I could get a C value this way; however, no features were selected when I applied that C to RandomizedLogisticRegression. Perhaps the C chosen by GridSearchCV was too low.

So I would like to know whether there is another good way to determine a fair value of C (or alpha) when using randomized regression.

There was a similar question before, but I think that answer was about ordinary (non-randomized) regression.

Can anyone give me an idea please?

What about cross-validation? – Riyaz
Unfortunately, using LogisticRegressionCV() produced a similar result to GridSearchCV(): the best C value was too small, and the coefficients of all features were 0. – ToBeSpecific

1 Answer

4 votes

Because RandomizedLogisticRegression is used for feature selection, it needs to be cross-validated as part of a pipeline. You can apply GridSearchCV to a Pipeline that contains it as a feature-selection step, followed by the classifier of your choice. An example might look like:

from sklearn.linear_model import LogisticRegression, RandomizedLogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
  ('fs', RandomizedLogisticRegression()),   # feature-selection step
  ('clf', LogisticRegression())             # final classifier
])

params = {'fs__C': [0.1, 1, 10]}

grid_search = GridSearchCV(pipeline, params)
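
As a rough usage sketch (X and y stand for your training data, which are not in the question), you can then fit the grid search and read the chosen C from best_params_; assuming RandomizedLogisticRegression exposes the usual selector interface, get_support() on the fitted step shows which features were kept:

grid_search.fit(X, y)                       # X, y: training data (assumed)

print(grid_search.best_params_)             # best C for the feature-selection step

# mask of features retained by the tuned feature-selection step
fs = grid_search.best_estimator_.named_steps['fs']
print(fs.get_support())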