I am new to the WEKA tool. Can I combine classification and clustering, i.e. first cluster the data and then classify the instances cluster-wise? What steps do I need to follow for this?
Thanks in advance.
Yes, you can. It is really easy with the ClassificationViaClustering classifier (class weka.classifiers.meta.ClassificationViaClustering).
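Steps in Java pseudocode (imports needed: java.io.BufferedReader, java.io.FileReader, weka.clusterers.SimpleKMeans, weka.classifiers.meta.ClassificationViaClustering, weka.core.Instances, weka.core.Instance):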
1. Create a SimpleKMeans clusterer
SimpleKMeans skm = new SimpleKMeans();
skm.setNumClusters(5); // 5 clusters in this example; for a useful model this should match the number of class labels in your dataset
2. Read the dataset and set class index
BufferedReader reader = new BufferedReader(new FileReader("[path].arff")); // replace [path] with your path to dataset
Instances data = new Instances(reader);
data.setClassIndex([your class index]); // if the first attribute is your class, then insert 0
3. Create the classifier
ClassificationViaClustering cvc = new ClassificationViaClustering();
cvc.setClusterer(skm); // let your classifier use the SimpleKMeans clusterer
cvc.buildClassifier(data);
Then, when you want to classify a new instance:
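Instance instanceToClassify = new Instance(data.firstInstance()); // copies the first instance as an example; in WEKA 3.7+ Instance is an interface, so use new DenseInstance(data.firstInstance()) there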
instanceToClassify.setDataset(data); // the instance to be classified has to have access to the dataset
double predictedClass = cvc.classifyInstance(instanceToClassify); // "class" is a reserved word in Java; the result is the index of the class value predicted from the cluster the instance belongs to
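To give an idea of what "classify by cluster" means, here is a minimal plain-Java sketch (no WEKA needed) that maps each cluster to the class seen most often among its training instances. This is only an illustration under simplifying assumptions, not WEKA's actual code: the class name ClusterToClassSketch, the method majorityClassPerCluster and the toy arrays are all made up for this example, and WEKA's own cluster-to-class mapping may work differently.

```java
import java.util.Arrays;

public class ClusterToClassSketch {

    // For each cluster, return the class index that occurs most often among
    // the training instances assigned to it (ties go to the lowest class index).
    // A cluster without any training instance is mapped to -1.
    public static int[] majorityClassPerCluster(int[] clusterOf, int[] classOf,
                                                int numClusters, int numClasses) {
        // counts[c][k] = number of instances in cluster c that have class k
        int[][] counts = new int[numClusters][numClasses];
        for (int i = 0; i < clusterOf.length; i++) {
            counts[clusterOf[i]][classOf[i]]++;
        }
        int[] mapping = new int[numClusters];
        for (int c = 0; c < numClusters; c++) {
            mapping[c] = -1;
            int best = 0;
            for (int k = 0; k < numClasses; k++) {
                if (counts[c][k] > best) {
                    best = counts[c][k];
                    mapping[c] = k;
                }
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Hypothetical toy data: cluster assignment and class index per training instance
        int[] clusterOf = {0, 0, 0, 1, 1, 2};
        int[] classOf   = {1, 1, 0, 0, 0, 2};
        int[] mapping = majorityClassPerCluster(clusterOf, classOf, 3, 3);
        System.out.println(Arrays.toString(mapping)); // prints [1, 0, 2]
    }
}
```

A new instance would then be assigned to its nearest cluster by the clusterer, and mapping[cluster] would be returned as its class. This also shows why the number of clusters matters: with fewer clusters than classes, some classes can never be predicted.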