24 votes

I have a logistic regression and a random forest, and I'd like to combine them (ensemble) for the final classification by averaging their predicted probabilities.

Is there a built-in way to do this in scikit-learn? Some way where I can use the ensemble of the two as a classifier itself? Or would I need to roll my own classifier?

You need to roll your own; there's no way to combine two arbitrary classifiers. – Matti Lyra
There are several ongoing PRs and open issues on the scikit-learn GitHub which are working towards ensemble meta-estimators. Unfortunately none of them have been merged. – Daniel
@user1507844 Could you take a stab at a similar question here? stackoverflow.com/questions/23645837/… – ekta

4 Answers

34 votes

NOTE: The scikit-learn VotingClassifier is probably the best way to do this now.


OLD ANSWER:

For what it's worth, I ended up doing it as follows:

import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class EnsembleClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, classifiers=None):
        self.classifiers = classifiers

    def fit(self, X, y):
        # Fit each underlying classifier on the same training data
        for classifier in self.classifiers:
            classifier.fit(X, y)
        return self

    def predict_proba(self, X):
        # Average the class-probability estimates across all classifiers
        self.predictions_ = list()
        for classifier in self.classifiers:
            self.predictions_.append(classifier.predict_proba(X))
        return np.mean(self.predictions_, axis=0)
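
A hypothetical usage sketch (the choice of base models and the X_train/X_test variables are illustrative assumptions, not part of the original answer):

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Average the predicted probabilities of a logistic regression and a random forest
ensemble = EnsembleClassifier(
    classifiers=[LogisticRegression(), RandomForestClassifier()])
ensemble.fit(X_train, y_train)
averaged = ensemble.predict_proba(X_test)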

4 votes

Facing the same problem, I used a majority voting method. Combining probabilities/scores arbitrarily is very problematic, because the performance of your different classifiers can differ (for example, an SVM with two different kernels, plus a random forest, plus another classifier trained on a different training set).

One possible method to "weigh" the different classifiers is to use their Jaccard score as a "weight". (But be warned: as I understand it, the different scores are not all made equal. I know that a gradient boosting classifier in my ensemble gives all its scores as 0.97, 0.98, 1.00 or 0.41/0, i.e. it's very overconfident.)
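
A rough sketch of that weighting idea; everything here (a held-out validation set X_val/y_val, a list of already-fitted classifiers, test data X_test) is an assumption for illustration, and the weighting scheme is the heuristic described above, not a scikit-learn feature:

import numpy as np
from sklearn.metrics import jaccard_score

# Assumed: `classifiers` are fitted models; X_val/y_val are held-out validation data
# Weight each classifier by its (macro-averaged) Jaccard score on the validation set
weights = [jaccard_score(y_val, clf.predict(X_val), average='macro')
           for clf in classifiers]

# Weighted average of the class-probability estimates on the test set
all_probas = np.array([clf.predict_proba(X_test) for clf in classifiers])
weighted_probas = np.average(all_probas, axis=0, weights=weights)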

4 votes

What about the sklearn.ensemble.VotingClassifier?

http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.VotingClassifier.html#sklearn.ensemble.VotingClassifier

Per the description:

The idea behind the voting classifier implementation is to combine conceptually different machine learning classifiers and use a majority vote or the average predicted probabilities (soft vote) to predict the class labels. Such a classifier can be useful for a set of equally well performing models in order to balance out their individual weaknesses.
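
For the original question (averaging a logistic regression and a random forest), soft voting does exactly that. A minimal sketch, assuming training/test splits X_train, y_train, X_test already exist:

from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression

# voting='soft' predicts from the average of the predicted class probabilities;
# an optional weights= argument turns this into a weighted average
eclf = VotingClassifier(
    estimators=[('lr', LogisticRegression()),
                ('rf', RandomForestClassifier())],
    voting='soft')
eclf.fit(X_train, y_train)
probabilities = eclf.predict_proba(X_test)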

2 votes

Now scikit-learn has a StackingClassifier, which can be used to stack multiple estimators.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Base estimators whose predictions are fed to the final estimator
estimators = [
    ('rf', RandomForestClassifier(n_estimators=10, random_state=42)),
    ('lg', LogisticRegression())
]
clf = StackingClassifier(
    estimators=estimators, final_estimator=LogisticRegression()
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42
)
clf.fit(X_train, y_train)
clf.predict_proba(X_test)
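
Note that by default StackingClassifier fits the final_estimator on out-of-fold predictions of the base estimators (5-fold cross-validation, controlled by the cv parameter), so the meta-learner never sees predictions the base models made on their own training data.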