I am new to SVMs. I am using jlibsvm for a multi-class sentence classification problem with 3 classes. As I understand it, I am doing one-against-all classification. My training set is comparatively small: 75 sentences in total, 25 per class.

I am building 3 SVMs (so 3 different models). When training SVM_A, sentences that belong to CLASS A get the positive label, i.e., 1, and all other sentences get the label -1. SVM_B and SVM_C are trained correspondingly.

While testing, to get the label of a sentence, I pass the sentence to all 3 models and take the prediction probability returned by each. The model that returns the highest probability determines the class the sentence belongs to.
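To make the scheme concrete, here is a minimal, library-independent sketch of the relabeling and the highest-score decision. It does not call jlibsvm; the class name OneVsRestSketch, the helper methods binarize and argmax, and the score values are hypothetical placeholders for illustration only.

```java
import java.util.Arrays;

public class OneVsRestSketch {

    // Relabel a multi-class label array for one binary model:
    // +1 for the target class, -1 for every other class.
    public static int[] binarize(String[] labels, String positiveClass) {
        int[] out = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            out[i] = labels[i].equals(positiveClass) ? 1 : -1;
        }
        return out;
    }

    // Return the class whose binary model produced the highest score.
    // On a tie, the earliest class in the array wins.
    public static String argmax(String[] classes, double[] scores) {
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return classes[best];
    }

    public static void main(String[] args) {
        String[] classes = {"A", "B", "C"};
        String[] labels = {"A", "B", "C", "A"};

        // Labels that SVM_A would be trained on.
        System.out.println(Arrays.toString(binarize(labels, "A"))); // prints [1, -1, -1, 1]

        // Placeholder scores, one per binary model (assumed values, not real output).
        double[] scores = {0.2, 0.7, 0.1};
        System.out.println(argmax(classes, scores)); // prints B
    }
}
```

Note that when all three scores are equal, this decision rule cannot separate the classes and simply falls back to the first one.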

This is my approach. However, I am getting the same prediction probability from all three models for every sentence in the test set:

A predicted:0.012820514
B predicted:0.012820514
C predicted:0.012820514

These values repeat for all sentences in the test set.

The following is how I set parameters for training:

C_SVC svm = new C_SVC();
MutableBinaryClassificationProblemImpl problem;

ImmutableSvmParameterGrid.Builder builder = ImmutableSvmParameterGrid.builder();

// create training parameters ------------
HashSet<Float> cSet;
HashSet<LinearKernel> kernelSet;

cSet = new HashSet<Float>();
cSet.add(1.0f);

kernelSet = new HashSet<LinearKernel>();
kernelSet.add(new LinearKernel());

// configure finetuning parameters

builder.eps = 0.001f;        // epsilon (stopping tolerance)
builder.Cset = cSet;         // C values used
builder.kernelSet = kernelSet; // kernels used
builder.probability = true;  // needed to get the prediction probability
ImmutableSvmParameter params = builder.build();

What am I doing wrong?

Is there a better way to do multi-class classification than this?

1 Answer

You are getting the same output because you generate the same model three times.

The reason is that jlibsvm can perform multiclass classification out of the box based on the provided data (LIBSVM itself supports this too). If it detects that more than two class labels are present in the given data, it automatically performs multiclass classification. So there is no need for a manual 1vsN approach. Just supply the data with a class label for each category.
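As a rough illustration of the data layout this implies, the sketch below builds a single label array with one label per sentence, instead of three separate +1/-1 arrays. It is library-independent and does not use the jlibsvm API; the class name MultiClassLabels, its helpers, and the integer encoding (0, 1, 2 for classes A, B, C) are assumptions made for this example.

```java
import java.util.Set;
import java.util.TreeSet;

public class MultiClassLabels {

    // Build one label per sentence, assuming the sentences are grouped by class:
    // the first perClass entries get label 0, the next perClass get label 1, etc.
    public static int[] buildLabels(int perClass, int numClasses) {
        int[] labels = new int[perClass * numClasses];
        for (int i = 0; i < labels.length; i++) {
            labels[i] = i / perClass;
        }
        return labels;
    }

    // Collect the distinct label values, which is what a library would
    // inspect to decide between binary and multiclass training.
    public static Set<Integer> distinctLabels(int[] labels) {
        Set<Integer> distinct = new TreeSet<>();
        for (int label : labels) {
            distinct.add(label);
        }
        return distinct;
    }

    public static void main(String[] args) {
        // 75 sentences, 25 per class, as in the question.
        int[] labels = buildLabels(25, 3);
        Set<Integer> distinct = distinctLabels(labels);

        System.out.println(labels.length);       // prints 75
        System.out.println(distinct);            // prints [0, 1, 2]
        System.out.println(distinct.size() > 2); // prints true, i.e. a multiclass problem
    }
}
```

With a label array like this, a single training call on one problem object should be enough; how the labels are attached to the examples depends on the library's own problem classes.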

However, jlibsvm is still in beta and relies on a rather old version of LIBSVM (2.88). A lot has changed since then. For a more intuitive Java binding (in comparison to the default LIBSVM version), you can take a look at zlibsvm, which is available via Maven Central and based on the latest LIBSVM version.