I have a number of huge distributed datasets, each used to train a classifier. All the datasets have identical attributes, and training is done with a single algorithm, J48. The problem I am facing is how to combine these classifiers into a single classifier that can be used for testing and predicting on new data. I am using the Weka tool; I have converted the Weka JAR to a DLL and am working in C#. Any help in C# or Java would be of great help. If any additional information is needed, feel free to ask. Thanks.
2 Answers
It is perfectly possible to do what you are asking. You can build N different classifiers from N different but compatible datasets and combine their outputs to form a new dataset of higher order. This is a hierarchical way of combining classifiers, usually called 'ensembling' or building a 'classifier ensemble', and there is a great variety of ways to do it; a large number of technical articles detail how.
One approach would be:

1. Train/get N different classifiers.
2. Build a new dataset from a known set of instances: one row per instance, one set of columns per classifier holding its output probabilities, plus the known (correct) class.
3. Throw away the old attributes and keep only the calculated output probabilities and the known class.
4. Train a new model/classifier on this higher-order dataset (you don't need the whole data; a moderate subsample is enough).
5. For every new instance, get the lower-level probabilities from the N classifiers, as before, and apply the higher-level classifier to the newly constructed instance.
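The dataset-building step (2–3) can be sketched in plain Java. This is a minimal sketch, not Weka code: each base classifier is modeled as a function from a feature vector to a class-probability distribution, standing in for a trained J48 queried via `distributionForInstance()`. The classifiers and data here are toy stand-ins, invented for illustration. (Note that Weka also ships a ready-made `weka.classifiers.meta.Stacking` meta-classifier that automates exactly this scheme.)

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class StackingSketch {

    // Build the higher-order dataset: for each known instance, concatenate the
    // probability outputs of all base classifiers into one meta-feature row.
    // The original attributes are discarded; only the probabilities remain.
    static double[][] buildMetaFeatures(List<Function<double[], double[]>> baseClassifiers,
                                        double[][] instances) {
        double[][] meta = new double[instances.length][];
        for (int i = 0; i < instances.length; i++) {
            double[] row = new double[0];
            for (Function<double[], double[]> clf : baseClassifiers) {
                double[] probs = clf.apply(instances[i]);
                double[] merged = Arrays.copyOf(row, row.length + probs.length);
                System.arraycopy(probs, 0, merged, row.length, probs.length);
                row = merged;
            }
            meta[i] = row;
        }
        return meta;
    }

    public static void main(String[] args) {
        // Two toy base classifiers for a 2-class problem (stand-ins for J48 models).
        Function<double[], double[]> clfA = x -> x[0] > 0.5 ? new double[]{0.9, 0.1}
                                                            : new double[]{0.2, 0.8};
        Function<double[], double[]> clfB = x -> x[1] > 0.5 ? new double[]{0.7, 0.3}
                                                            : new double[]{0.4, 0.6};

        double[][] data = { {0.8, 0.2}, {0.1, 0.9} };
        double[][] meta = buildMetaFeatures(Arrays.asList(clfA, clfB), data);

        // Each meta-row has 2 classifiers x 2 class probabilities = 4 columns;
        // appending the known class gives the training set for the higher-level model.
        System.out.println(Arrays.deepToString(meta));
        // → [[0.9, 0.1, 0.4, 0.6], [0.2, 0.8, 0.7, 0.3]]
    }
}
```

In your setting, step 4 would then train one more J48 (or any other Weka classifier) on these meta-rows, and step 5 would push each unseen instance through the N base models first and then through that higher-level model.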
Hope this helps.
I don't think it is possible to create N classifiers on N training sets and then merge the N classifiers into a single one: the data are different, so the models will be different. Instead, if I were happy with the N results, I would combine all N datasets and train a single model on the combined data to test and predict unseen data.
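The pooling alternative is simple to sketch. This is plain Java with datasets as row lists, purely illustrative: with Weka you would instead append the compatible `Instances` objects (identical attribute headers) and build one J48 on the merged set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergeAndTrain {

    // Concatenate N compatible datasets row-wise. All datasets are assumed to
    // share identical attributes, as stated in the question, so rows can be
    // pooled directly. One model is then trained once on the merged data.
    static List<double[]> mergeDatasets(List<List<double[]>> datasets) {
        List<double[]> merged = new ArrayList<>();
        for (List<double[]> d : datasets) {
            merged.addAll(d);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Toy rows from two sites (last column playing the role of the class).
        List<double[]> site1 = Arrays.asList(new double[]{1.0, 0}, new double[]{2.0, 1});
        List<double[]> site2 = Arrays.asList(new double[]{3.0, 0});

        List<double[]> all = mergeDatasets(Arrays.asList(site1, site2));
        System.out.println(all.size()); // → 3 rows feed a single training run
    }
}
```

The trade-off versus the ensemble answer above: pooling needs all the raw data in one place, while stacking only needs each site's model outputs, which matters when the datasets are truly distributed.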