I have a data set of 1960 samples with 12 features and I am trying to solve a binary classification problem, using 980 samples for training and 980 samples for testing. For training I am using "svmtrain" and to classify I am using "svmclassify". Is it possible to plot the training error vs. the number of training samples? Any advice please? I'm new to MATLAB and SVM.
1 Answer
It is important to consider the difference between methods like the SVM and neural networks. When training a neural network, you start with initial weights and then go through your training samples step by step. You basically have a for-loop over the training samples, applying the training algorithm to each one. It is therefore easy to evaluate the misclassification error during training, e.g. after every 100 samples.
The SVM, however, works differently: when training an SVM, you construct one optimization problem which describes the entire training set at once. It is usually of the form
Minimize ||w||, subject to y_i (w * x_i - b) >= 1 for all training samples i.
Here w is the normal vector of the separating hyperplane, b is the offset of the hyperplane from the origin, the x_i are the training samples, and the y_i are their labels. This is an optimization problem: you search for a w that minimizes ||w|| while satisfying the constraints, and once such a w is found, the algorithm is finished. All training samples are used at once, so there is no sample-by-sample procedure as in neural networks, and the misclassification error can only be evaluated after the training on the full set has finished.
To plot the misclassification error vs. the number of training samples for a support vector machine, you will therefore have to run svmtrain (or fitcsvm if you have a newer MATLAB version) multiple times with different numbers of training samples and evaluate the error after each run.
One final note: if you only look at the training error, you leave the door open for overfitting. This means that your algorithm learns the peculiarities of your training data exactly, but cannot generalize that knowledge to new data. You will then see an impressively low error rate in training, but fail on new data. To detect overfitting, you can set aside a small verification data set (maybe 5-10% of your training data) which you do not use for training. After training the SVM, evaluate the misclassification error on this verification set and check that it is close to the error on the training set; only then can you be reasonably sure that new data will also be classified correctly.
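A minimal sketch of such a hold-out check, reusing the Xtrain/Ytrain variables and numeric labels assumed in the sketch above:

```matlab
% Sketch: hold out ~10% of the training data for verification.
nTrain = size(Xtrain, 1);
nVal   = round(0.1 * nTrain);
idx    = randperm(nTrain);                % random split of the training rows
valIdx = idx(1:nVal);
trnIdx = idx(nVal+1:end);

model  = svmtrain(Xtrain(trnIdx, :), Ytrain(trnIdx));
errTrn = mean(svmclassify(model, Xtrain(trnIdx, :)) ~= Ytrain(trnIdx));
errVal = mean(svmclassify(model, Xtrain(valIdx, :)) ~= Ytrain(valIdx));

fprintf('training error: %.3f, verification error: %.3f\n', errTrn, errVal);
% A verification error much higher than the training error suggests overfitting.
```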