I want to perform bagging with scikit-learn in Python and combine it with RFE, the recursive feature elimination algorithm. The steps are as follows.
- Make 30 subsets by sampling with replacement (bagging)
- Perform RFE on each subset
- Get the output of each classification
- Find the top 5 features from each output
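To make the four steps concrete, here is a rough sketch of them written out as a manual loop. It is only an illustration under assumptions: the data comes from `make_classification` as a stand-in for my real `trainx`/`trainy`, `LinearSVC(dual=False)` is chosen just to keep the solver quick on small data, and the names `top_features` and `counts` are made up for this example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC
from sklearn.utils import resample

# Synthetic stand-in data (assumption): 200 samples, 20 features.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

n_subsets = 30
top_k = 5
top_features = []  # one array of selected feature indices per subset

for seed in range(n_subsets):
    # Step 1: draw a bootstrap sample (sampling with replacement).
    X_boot, y_boot = resample(X, y, replace=True, random_state=seed)
    # Steps 2-3: fit RFE on this subset, keeping top_k features.
    rfe = RFE(estimator=LinearSVC(dual=False), n_features_to_select=top_k)
    rfe.fit(X_boot, y_boot)
    # Step 4: features with ranking_ == 1 are the ones RFE kept.
    top_features.append(np.flatnonzero(rfe.ranking_ == 1))

# How often each feature ended up in the top 5 across all subsets.
counts = np.bincount(np.concatenate(top_features), minlength=X.shape[1])
```

Here `counts` gives a simple stability score per feature; whether that is the right way to aggregate the 30 rankings is a separate question.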
I tried the BaggingClassifier approach shown below, but it takes a long time and does not seem to work. Using RFE on its own (rfe.fit()) works without problems.
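from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC
cf1 = LinearSVC()
rfe = RFE(estimator=cf1)
bagging = BaggingClassifier(rfe, n_estimators=30)
bagging.fit(trainx, trainy)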
Also, step 4 may be difficult, because BaggingClassifier does not offer an attribute like RFE's ranking_ for finding the top features. Is there a better way to achieve these 4 steps?
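One possibility I have not verified on my real data: a fitted BaggingClassifier keeps its fitted base estimators in `estimators_`, so each fitted RFE and its `ranking_` should be reachable from there. The sketch below assumes synthetic data from `make_classification`, `n_features_to_select=5`, and the default `max_features=1.0` so that ranking positions line up with the original columns; `rankings` and `top5_per_estimator` are names invented for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.feature_selection import RFE
from sklearn.svm import LinearSVC

# Synthetic stand-in data (assumption): 200 samples, 20 features.
X, y = make_classification(n_samples=200, n_features=20,
                           n_informative=5, random_state=0)

# RFE keeps 5 features; dual=False is an assumption for faster fitting.
rfe = RFE(estimator=LinearSVC(dual=False), n_features_to_select=5)
bagging = BaggingClassifier(rfe, n_estimators=30, random_state=0)
bagging.fit(X, y)

# Each entry of estimators_ is a fitted RFE with its own ranking_.
# With the default max_features=1.0 every estimator sees all columns,
# so positions in ranking_ correspond to the original feature indices.
rankings = np.array([est.ranking_ for est in bagging.estimators_])
top5_per_estimator = [np.flatnonzero(r == 1) for r in rankings]
```

If `max_features` or `bootstrap_features` were changed, the indices would presumably need to be mapped back through `bagging.estimators_features_`.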