LibSVM one class classification nu parameter is not a fraction of outliers?

Question

Please correct me if I'm wrong, but one-class SVM theory states that the nu parameter is an upper bound (UB) on the fraction of outliers in the training dataset and a lower bound (LB) on the fraction of support vectors (SVs). Say I'm using the RBF (Gaussian) kernel. By this property of nu, it should not matter what value of gamma I choose: the model should produce results such that nu is the UB on the fraction of outliers in the training dataset. However, that's not what I observed when trying a simple example with LibSVM in Matlab:

% Load the data and keep only the positive class for one-class training
[heart_scale_label, heart_scale_inst] = libsvmread('../heart_scale');
ind_good = (heart_scale_label==1);
heart_scale_label = heart_scale_label(ind_good);
heart_scale_inst = heart_scale_inst(ind_good, :);  % select whole rows, not single elements
train_data = heart_scale_inst;
train_label = heart_scale_label;
gamma = 0.01;
nu = 0.01;
% -s 2: one-class SVM, -t 2: RBF kernel, -n: nu, -g: gamma, -h 0: no shrinking
model = svmtrain(train_label, train_data, ['-s 2 -t 2 -n ' num2str(nu) ' -g ' num2str(gamma) ' -h 0']);
[predict_label_Tr, accuracy_Tr, dec_values_Tr] = svmpredict(train_label, train_data, model);
accuracy_Tr

Using gamma = 0.01, I get a training accuracy of 97.50; using gamma = 100, I get a training accuracy of 42.50. Shouldn't the model overfit the data when a larger gamma is selected, so that it still yields the same fraction of outliers in the training dataset?

lennon310 · Accepted Answer · 2014-02-20T03:09:20

Actually, I ran into the same problem. The performance of an SVM usually also depends on the interaction between γ and nu. If you fix one parameter while tuning the other, the learning curve does not even appear to be monotone.
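One way to look at this directly is to fix nu, sweep γ, and compare the fraction of rejected training points and the fraction of support vectors against nu. The following is only a minimal sketch: it assumes the LibSVM MATLAB interface (libsvmread, svmtrain, svmpredict) is on the path and that heart_scale sits at the path used in the question. The γ grid and the variable names frac_out and frac_sv are my own choices for illustration.

```matlab
% Sketch: check the nu bounds on the training set for several gamma values.
% Assumes LibSVM's MATLAB interface is installed; the gamma grid is illustrative.
[labels, inst] = libsvmread('../heart_scale');
train_data  = inst(labels == 1, :);            % positive-class rows only
train_label = ones(size(train_data, 1), 1);    % all training points are "inliers"
n      = size(train_data, 1);
nu     = 0.01;
gammas = [0.01 0.1 1 10 100];
frac_out = zeros(size(gammas));
frac_sv  = zeros(size(gammas));

for k = 1:numel(gammas)
    opts  = sprintf('-s 2 -t 2 -n %g -g %g -h 0 -q', nu, gammas(k));
    model = svmtrain(train_label, train_data, opts);
    [pred, ~, ~] = svmpredict(train_label, train_data, model, '-q');
    frac_out(k) = mean(pred == -1);     % fraction of training points rejected
    frac_sv(k)  = model.totalSV / n;    % fraction of training points that are SVs
    fprintf('gamma=%g  outliers=%.3f  SVs=%.3f  nu=%g\n', ...
            gammas(k), frac_out(k), frac_sv(k), nu);
end
```

If the solver had converged to the exact optimum, one would expect frac_out <= nu <= frac_sv in every row; rows where that fails point at the numerical solution rather than at the theory.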

I plotted three images showing the training accuracy, the testing accuracy (5-fold cross-validation on the heart_scale data), and their difference. γ ranges from 10^(-4) to 10^(1), and nu ranges from 10^(-3) to 10^(-1):

[Figure: training accuracy, testing accuracy, and their difference as functions of γ and nu]

To show the small parameter values more clearly, I used logarithmic scales on the γ and nu axes; see the figure below:

[Figure: the same three plots with logarithmic γ and nu axes]

Basically, underfitting is much more evident than overfitting with the given 120 data points.

EDIT

Set the epsilon value (the tolerance of the termination criterion, LibSVM's -e option) to 1e-8 to fill the gap shown in the figure above:

[Figure: the same plots after setting epsilon to 1e-8]

No obvious overfitting or underfitting at all! The dependence of the generalization error on the parameters seems a little counter-intuitive, probably due to the optimization algorithm used in libsvm rather than the 'true' solution...
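To see how much the stopping tolerance matters for a single parameter pair, the sketch below trains the same one-class model twice, once with LibSVM's default -e 0.001 and once with -e 1e-8, and prints the fraction of rejected training points for each. It assumes the LibSVM MATLAB interface is on the path and the data path from the question; the choice nu = 0.01, gamma = 100 simply reuses the question's problematic setting, and the variable names are illustrative.

```matlab
% Sketch: effect of the termination tolerance (-e) on the training outlier fraction.
% Assumes LibSVM's MATLAB interface is installed; parameters reuse the question's values.
[labels, inst] = libsvmread('../heart_scale');
train_data  = inst(labels == 1, :);            % positive-class rows only
train_label = ones(size(train_data, 1), 1);
nu    = 0.01;
gamma = 100;
tols  = [1e-3 1e-8];                           % 1e-3 is LibSVM's default for -e
frac_out = zeros(size(tols));

for k = 1:numel(tols)
    opts  = sprintf('-s 2 -t 2 -n %g -g %g -e %g -h 0 -q', nu, gamma, tols(k));
    model = svmtrain(train_label, train_data, opts);
    [pred, ~, ~] = svmpredict(train_label, train_data, model, '-q');
    frac_out(k) = mean(pred == -1);            % fraction of training points rejected
    fprintf('e=%g  outliers=%.3f  nu=%g\n', tols(k), frac_out(k), nu);
end
```

A tighter tolerance can make training slower, so it may be worth trying on a single setting like this before rerunning a whole grid.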
