
I am working on my artificial intelligence problem and I am following the instructions from this example:

Matlab Deep Learning Example

There, they use a support vector machine to classify:

classifier = fitcecoc(trainingFeatures, trainingLabels, ...
    'Learners', 'Linear', 'Coding', 'onevsall', 'ObservationsIn', 'columns');

I tried this example with my own data set and it reaches an accuracy of 89.5%, which works pretty well. Now I would like to try my own SVM with my own settings instead of the default settings.

I read in the documentation that fitcecoc uses an SVM with a linear kernel by default. Now I would like to try different kernels, for instance Gaussian and polynomial.

I know from the Machine Learning course on Coursera that an SVM has a parameter (Andrew Ng refers to it as C) and that each kernel has its own parameter. I also found information about the kernel parameters at this MathWorks URL:

Kernel parameters...

According to that link:

  • The Gaussian kernel has the parameter SIGMA
  • The polynomial kernel has the parameter P, which is the order of the polynomial function

So I wrote this code:

Oursvm = templateSVM('KernelFunction','polynomial');
classifier = fitcecoc(trainingFeatures, trainingLabels,'Learners',...
    Oursvm,'Coding', 'onevsall', 'ObservationsIn', 'columns');

Now I would like to change the P parameter. In the templateSVM documentation I found that I can set it like this:

Oursvm = templateSVM('KernelFunction','polynomial','PolynomialOrder',9);

Template SVM

The default value is 3, but no matter which number I use for PolynomialOrder, the accuracy is always the same, 3.2258, for p = 1, p = 2, or even p = 9.

Isn't it weird?

  • What am I missing?

  • Also, how can I set the SIGMA parameter for the Gaussian kernel? Training with the default configuration gives very low accuracy, and the templateSVM documentation does not clearly specify how to set this parameter.

  • How can I set the C parameter of my SVM?

  • Finally, I have read that you need at least 10 times as many training samples as dimensions of the input data. How is it possible that the deep learning example uses only 201 samples (67 for each class, three classes total) if the input data has 4096 dimensions?


1 Answer


Andrew Ng describes your problem in the week 7 "Kernels 2" video:

Large C - gives lower bias, higher variance (prone to overfitting)

Small C - gives higher bias, lower variance (prone to underfitting)

For the Gaussian kernel, sigma works the opposite way:

Large sigma - gives higher bias, lower variance (prone to underfitting)

Small sigma - gives lower bias, higher variance (prone to overfitting)
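To connect this to the templateSVM call in the question: as far as I can tell from the documentation, C corresponds to the 'BoxConstraint' name-value pair, and sigma corresponds to 'KernelScale' (the software divides the predictors by this value before applying the kernel). Below is a minimal sketch, not a tuned setup. The data is a synthetic stand-in (hypothetical, not the question's features), observations are in rows, and the values 1 and 10 are placeholders only.

```matlab
% Synthetic stand-in for the question's data (hypothetical):
% 3 classes, 20 observations each, 5 features, observations in rows.
rng(0);                                   % make the sketch reproducible
X = [randn(20,5); randn(20,5) + 3; randn(20,5) - 3];
Y = [ones(20,1); 2*ones(20,1); 3*ones(20,1)];

% Gaussian-kernel SVM template.
% 'BoxConstraint' plays the role of C, 'KernelScale' the role of sigma.
% The numbers are placeholders, not recommended values.
gaussSvm = templateSVM('KernelFunction','gaussian', ...
    'BoxConstraint',1, 'KernelScale',10, 'Standardize',true);

% One-vs-all ECOC model built from that template.
% Observations are in rows here, so 'ObservationsIn' is not set.
classifier = fitcecoc(X, Y, 'Learners',gaussSvm, 'Coding','onevsall');

% Resubstitution error: a quick sanity check only,
% not an estimate of how well the model generalizes.
trainErr = resubLoss(classifier);
```

'KernelScale' also accepts 'auto', which lets the software pick a scale with a heuristic. 'Standardize' is set here on the assumption that the features have very different ranges; drop it if that does not apply to your data.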

So you could try tuning one parameter at a time. Like Andrew, I don't see a reason to use polynomial kernels. Linear and Gaussian kernels are usually enough; which of the two to use depends on the number of examples and features. Good luck.
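Tuning one parameter at a time could look roughly like the sketch below: sigma is left to the 'auto' heuristic while a few candidate values of C are compared by 5-fold cross-validation. The data is synthetic and the candidate grid is an arbitrary assumption, so treat this as an outline and not a recipe.

```matlab
% Synthetic stand-in data (hypothetical), observations in rows.
rng(0);
X = [randn(20,5); randn(20,5) + 3; randn(20,5) - 3];
Y = [ones(20,1); 2*ones(20,1); 3*ones(20,1)];

% Candidate values for C (BoxConstraint); the grid is an arbitrary choice.
cValues = [0.01 0.1 1 10 100];
cvErr   = zeros(size(cValues));

for k = 1:numel(cValues)
    % Only C changes between iterations; the kernel scale stays on 'auto'.
    t = templateSVM('KernelFunction','gaussian', ...
        'BoxConstraint',cValues(k), 'KernelScale','auto', ...
        'Standardize',true);

    % 'KFold',5 makes fitcecoc return a cross-validated model.
    cvModel = fitcecoc(X, Y, 'Learners',t, 'Coding','onevsall', 'KFold',5);

    % Average misclassification rate over the held-out folds.
    cvErr(k) = kfoldLoss(cvModel);
end

% Keep the C with the lowest cross-validated error.
[~, best] = min(cvErr);
bestC = cValues(best);
```

Once C is fixed, the same loop can be repeated over 'KernelScale' with 'BoxConstraint' held at bestC.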

For the last question: with a low number of training examples and that many features, you should try a linear kernel.
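As a rough illustration of that case, here is a sketch with the same shape as the question's data (4096 features, 201 observations, 3 classes), using random numbers in place of the real features. It assumes templateLinear as the linear learner; its 'Lambda' regularization strength acts roughly like the inverse of C, and the value 1e-3 is only a placeholder.

```matlab
% Synthetic stand-in (hypothetical): 4096 features, 201 observations,
% observations in columns as in the question's fitcecoc call.
rng(0);
Y  = repelem((1:3)', 67);                 % 67 observations per class
mu = randn(4096, 3);                      % one random mean vector per class
X  = randn(4096, 201) + 0.5 * mu(:, Y);   % features x observations

% Linear SVM learner with ridge (L2) regularization.
% A larger Lambda means stronger regularization (like a smaller C).
linSvm = templateLinear('Learner','svm', ...
    'Regularization','ridge', 'Lambda',1e-3);

% Cross-validated one-vs-all model, 5 folds.
cvModel = fitcecoc(X, Y, 'Learners',linSvm, 'Coding','onevsall', ...
    'ObservationsIn','columns', 'KFold',5);

% Average misclassification rate over the held-out folds.
cvErr = kfoldLoss(cvModel);
```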