I was wondering if there's a way to train a model using Naive Bayes and then apply it to a single record. I'm new to Weka, so I don't know if this is possible. Also, is there a way to store the classifier output in a file?
3 Answers
The answer is yes. Naive Bayes is a simple probabilistic model based on Bayes' theorem, and it can be used for classification problems.
For classification with Naive Bayes, as with other classifiers, you first train the model on a sample dataset. Once trained, the model can be applied to any record.
There will always be some probability of error with this approach, but how large it is depends mostly on the quality of your sample and the properties of your dataset.
I haven't used Weka directly, only as an extension for RapidMiner, but the same principles should apply. Once the model is trained, you should be able to view or print the model parameters.
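To make the train-then-apply idea concrete, here is a minimal sketch in plain Java (no Weka), assuming nominal attributes and Laplace smoothing. The class name `NaiveBayesSketch`, its methods, and the toy weather data are all made up for illustration; Weka's own Naive Bayes implementation is far more complete. The sketch also stores the trained model in a file using standard Java serialization, which as far as I know is also the mechanism behind Weka's SerializationHelper.

```java
import java.io.*;
import java.util.*;

public class NaiveBayesSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Number of training records per class label.
    private final Map<String, Integer> classCounts = new TreeMap<>();
    // Count of (label, attribute index, value) co-occurrences, keyed as "label|index|value".
    private final Map<String, Integer> featureCounts = new HashMap<>();
    // Distinct values seen per attribute, needed for Laplace smoothing.
    private final List<Set<String>> valuesPerAttribute = new ArrayList<>();
    private int total = 0;

    /** Train on records whose last column is the class label. */
    public void train(String[][] records) {
        for (String[] r : records) {
            String label = r[r.length - 1];
            classCounts.merge(label, 1, Integer::sum);
            total++;
            for (int a = 0; a < r.length - 1; a++) {
                while (valuesPerAttribute.size() <= a) {
                    valuesPerAttribute.add(new HashSet<>());
                }
                valuesPerAttribute.get(a).add(r[a]);
                featureCounts.merge(label + "|" + a + "|" + r[a], 1, Integer::sum);
            }
        }
    }

    /** Return the most probable label for a single unlabeled record. */
    public String classify(String[] attributes) {
        String best = null;
        double bestLogProb = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Integer> e : classCounts.entrySet()) {
            String label = e.getKey();
            int n = e.getValue();
            // log P(label) + sum of log P(value | label), with add-one smoothing
            double logProb = Math.log((double) n / total);
            for (int a = 0; a < attributes.length; a++) {
                int count = featureCounts.getOrDefault(label + "|" + a + "|" + attributes[a], 0);
                int distinct = valuesPerAttribute.get(a).size();
                logProb += Math.log((count + 1.0) / (n + distinct));
            }
            if (logProb > bestLogProb) {
                bestLogProb = logProb;
                best = label;
            }
        }
        return best;
    }

    /** Write the trained model to a file. */
    public void save(File f) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(this);
        }
    }

    /** Read a previously saved model back from a file. */
    public static NaiveBayesSketch load(File f) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            return (NaiveBayesSketch) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical toy data: outlook, temperature, class label.
        String[][] training = {
            {"sunny", "hot", "no"},
            {"sunny", "mild", "no"},
            {"rainy", "mild", "yes"},
            {"rainy", "cool", "yes"},
            {"overcast", "hot", "yes"},
        };
        NaiveBayesSketch model = new NaiveBayesSketch();
        model.train(training);

        File f = File.createTempFile("nb", ".model");
        f.deleteOnExit();
        model.save(f);

        // Load the stored model and apply it to one new record.
        NaiveBayesSketch restored = NaiveBayesSketch.load(f);
        System.out.println(restored.classify(new String[] {"rainy", "cool"}));
    }
}
```

With this toy data, the single record (rainy, cool) is classified as yes. The point is only the workflow: train once, store the model, then load it and classify one record at a time.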
I am currently looking for the same answer, using Java.
I created an ARFF file that contains training data and used the program http://weka.wikispaces.com/file/view/WekaDemo.java as an example to train and evaluate the classifier.
I still need to figure out how to save and load a model in Java and (more importantly) how to test against a single record.
WekaDemo.java
...
    public void execute() throws Exception {
        // run filter
        m_Filter.setInputFormat(m_Training);
        Instances filtered = Filter.useFilter(m_Training, m_Filter);

        // train classifier on complete file for tree
        m_Classifier.buildClassifier(filtered);

        // 10fold CV with seed=1
        m_Evaluation = new Evaluation(filtered);
        m_Evaluation.crossValidateModel(
            m_Classifier, filtered, 10, m_Training.getRandomNumberGenerator(1));

        //TODO Save model
        //TODO Load model
        //TODO Test against a single instance
    }
...
Edit 1:
Saving and loading a model is explained here: How to test existing model with new instance in weka, using java code?
At http://weka.wikispaces.com/Use+WEKA+in+your+Java+code#Classification-Classifying%20instances there is a quick how-to for classifying a single instance.
    import java.io.BufferedReader;
    import java.io.FileReader;
    import weka.classifiers.Classifier;
    import weka.core.Instances;

    // load model (saved from user interface)
    Classifier tree = (Classifier) weka.core.SerializationHelper.read("/some/where/j48.model");

    // load unlabeled data
    Instances unlabeled = new Instances(
        new BufferedReader(new FileReader("/some/where/unlabeled.arff")));

    // set class attribute (here: the last attribute)
    unlabeled.setClassIndex(unlabeled.numAttributes() - 1);

    // create copy that will receive the predicted labels
    Instances labeled = new Instances(unlabeled);

    // label instances
    for (int i = 0; i < unlabeled.numInstances(); i++) {
        // index of the predicted class value
        double clsLabel = tree.classifyInstance(unlabeled.instance(i));
        labeled.instance(i).setClassValue(clsLabel);
        System.out.println(clsLabel + " -> " + unlabeled.classAttribute().value((int) clsLabel));

        // per-class probabilities for this instance
        double[] dist = tree.distributionForInstance(unlabeled.instance(i));
        for (int j = 0; j < dist.length; j++) {
            System.out.println(unlabeled.classAttribute().value(j) + ": " + dist[j]);
        }
    }
Edit: This method doesn't train, evaluate, or save a model. That is something I usually do using the Weka GUI (http://weka.wikispaces.com/Serialization). The example loads a tree model with a nominal class, but since the code only relies on the generic Classifier interface, it should be easy to convert to a Naive Bayes example.