Does ensemble learning mean many instances of one particular classifier, for example decision trees, or does it mean a mixture of several different classifiers such as neural networks, decision trees, SVMs, and so forth?

I have looked at Wikipedia's description of bagging, an ensemble method. It says:

Bagging leads to "improvements for unstable procedures" (Breiman, 1996), which include, for example, neural nets, classification and regression trees, and subset selection in linear regression (Breiman, 1994).

I am a little confused by this description. I have also looked at MATLAB's implementation of ensemble algorithms. For example:

load fisheriris
ens = fitensemble(meas,species,'AdaBoostM2',100,'Tree')

meas and species are the inputs to the fitensemble function. In this example it uses the AdaBoostM2 method with 100 weak learners of type Tree. How does this simple call show that ensemble learning can be used to combine different classifiers such as neural nets, kNN, and naive Bayes?

Can anybody explain what ensemble learning actually is, and what MATLAB is doing in its implementation of the fitensemble function?

1 Answer

The basic idea of ensemble learning is to combine multiple models to improve prediction performance. Ensemble methods are meta-algorithms designed to work on top of existing learning algorithms, and depending on the approach the combined models can be many instances of one learner type or a mix of different types. There are various approaches:

  • Bagging (short for Bootstrap Aggregation) generates a set of models, each trained on a random sample of the data (bootstrap resampling: sample N instances with replacement). The predictions from those models are combined/aggregated to produce the final prediction, typically by averaging (regression) or voting (classification).

  • Random Subspace: the idea is to randomize the learning algorithm, such as picking a subset of features at random before applying the training algorithm (think Random Forest for example). Each model is trained on data projected onto a randomly chosen subspace. The outputs of the models are then combined, usually by a simple majority vote.

  • Boosting: also built on the concept of voting/averaging multiple models, but it weights the models according to their performance. It constructs models iteratively, where new models are encouraged to become "experts" for instances misclassified by earlier models. Boosting works best if the base learning algorithm is not too complex (a weak learner). There are several variants of this algorithm (AdaBoost, LogitBoost, GentleBoost, LPBoost, etc.).

  • Stacking: combines the predictions of multiple base learners (usually of different types: kNN, ANN, SVM, etc.), not by voting as before, but by using a meta-learner (a model trained on the output of the base models). The predictions of the base learners are fed as input data to the meta-learner in the next layer, which produces the final prediction.
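
To make the bagging idea above concrete, here is a minimal hand-rolled sketch, not how fitensemble is implemented internally. It assumes a release that provides fitctree (older releases used ClassificationTree.fit instead), and the number of models T = 25 is an arbitrary choice for illustration.

```matlab
% Hand-rolled bagging sketch: bootstrap resampling + majority vote.
load fisheriris                          % meas (150x4), species (150x1 cellstr)
rng(0);                                  % make the resampling reproducible
N = size(meas, 1);
T = 25;                                  % number of bootstrap models (arbitrary)
[G, classNames] = grp2idx(species);      % numeric class index for each row
K = numel(classNames);
votes = zeros(N, K);                     % votes(i,k): models voting class k for row i
for t = 1:T
    idx  = randi(N, N, 1);               % sample N instances with replacement
    tree = fitctree(meas(idx,:), G(idx));% one model per bootstrap sample
    pred = predict(tree, meas);          % numeric class index per row
    votes = votes + bsxfun(@eq, pred, 1:K);
end
[~, winner]  = max(votes, [], 2);        % majority vote across the T trees
baggedLabels = classNames(winner);
trainAcc     = mean(strcmp(baggedLabels, species))
```

Note that every model here is the same kind of learner (a tree); only the training sample differs. The accuracy is measured on the training data, so it is optimistic.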


fitensemble is a MATLAB function used to build an ensemble learner for both classification and regression. It supports three methods: bagging, boosting, and subspace. You can choose between three kinds of available weak learners: decision tree (decision stump really), discriminant analysis (both linear and quadratic), or k-nearest neighbor classifier.

Note: Except for the Subspace method, all boosting and bagging algorithms are based on tree learners. Subspace can use either discriminant analysis or k-nearest neighbor learners.
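
As a rough illustration of those method/learner pairings, the calls below build a bagged tree ensemble and a random-subspace kNN ensemble on the iris data. The ensemble sizes (100 and 50) are arbitrary, and as far as I know the 'Bag' method needs the 'Type' argument to say whether you want classification or regression.

```matlab
load fisheriris

% Bagging with tree learners; 'Type' tells fitensemble this is classification.
bagEns = fitensemble(meas, species, 'Bag', 100, 'Tree', ...
                     'Type', 'classification');

% Random subspace with k-nearest neighbor learners.
subEns = fitensemble(meas, species, 'Subspace', 50, 'KNN');

% Resubstitution (training-set) error of each ensemble.
bagErr = resubLoss(bagEns)
subErr = resubLoss(subEns)
```

In both cases the ensemble consists of many learners of a single type; fitensemble does not mix, say, trees and kNN in one ensemble.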

For example, the following code trains a decision tree ensemble classifier (consisting of 100 trees) using the AdaBoost method fitted on the training dataset X with corresponding classes Y.

ens = fitensemble(X, Y, 'AdaBoostM1', 100, 'Tree')

(The M1 part indicates a binary classifier; there is an extended M2 version for multiclass problems.)
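
Here is a small end-to-end sketch of that call. Since AdaBoostM1 is for two classes, it assumes X and Y are the versicolor and virginica rows of the iris data (rows 51 to 150); that choice, and the 5-fold cross-validation, are just for illustration.

```matlab
load fisheriris
X = meas(51:150, :);                 % two-class subset: versicolor vs. virginica
Y = species(51:150);

ens = fitensemble(X, Y, 'AdaBoostM1', 100, 'Tree');

labels   = predict(ens, X(1:5, :))   % predicted class labels for a few rows
trainErr = resubLoss(ens)            % error on the training data (optimistic)

cvens = crossval(ens, 'KFold', 5);   % 5-fold cross-validated ensemble
cvErr = kfoldLoss(cvens)             % more honest estimate of the error
```

For the full three-class iris problem you would swap in 'AdaBoostM2', as in the question.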