  Linearly Non-Separable Binary Classification Problem

First of all, this program isn't working correctly for the RBF kernel (gaussianKernel()) and I want to fix it.

It is a non-linear SVM demo illustrating two-class classification with a hard margin.

  • The problem is about 2-dimensional, radially distributed random data.

  • I used a quadratic programming solver to compute the Lagrange multipliers (alphas):

xn    = input .* (output*[1 1]);    % xi * yi
phi   = gaussianKernel(xn, sigma2); % Radial Basis Function features

k     = phi * phi';                 % symmetric kernel matrix for the QP solver
gamma = 1;                          % upper bound on the alphas
f     = -ones(2 * len, 1);          % linear coefficients: minimize -sum(alphas)
Aeq   = output';                    % yi
beq   = 0;                          % sum(ai * yi) = 0
A     = zeros(1, 2 * len);          % no inequality constraint A * alpha <= b in this problem
b     = 0;                          % no inequality constraint in this problem
lb    = zeros(2 * len, 1);          % lower bound on the alphas
ub    = gamma * ones(2 * len, 1);   % upper bound on the alphas

alphas = quadprog(k, f, A, b, Aeq, beq, lb, ub);
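
For completeness, here is a hedged sketch (my addition, not part of the original post) of how the solved alphas are typically turned into a classifier. It assumes K is the Gram matrix on the raw inputs, i.e. K(i, j) = K(x_i, x_j), and uses output as the label vector:

tol  = 1e-5;                               % cutoff for numerically nonzero alphas
sv   = alphas > tol;                       % logical index of the support vectors
wy   = alphas .* output;                   % a_i * y_i, used in every kernel sum
bias = mean(output(sv) - K(sv, :) * wy);   % bias averaged over the support vectors
% For a test Gram matrix Ktest (rows: test points, columns: training points):
% predicted = sign(Ktest * wy + bias);

With a hard margin, every support vector lies exactly on the margin, so averaging the bias over all of them is safe.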
  • To solve this non-linear classification problem, I wrote some kernel functions, such as the Gaussian (RBF) and the homogeneous and non-homogeneous polynomial kernels.

For RBF, I implemented the function below:

K(x, x') = exp( -||x - x'||^2 / (2 * sigma^2) )

Using the Taylor series expansion of the exponential, with gamma = 1 / (2 * sigma^2), it yields (per dimension):

K(x, x') = exp(-gamma * x^2) * exp(-gamma * x'^2) * sum_{k=0..inf} ( (2 * gamma)^k / k! ) * x^k * x'^k

And I separated the Gaussian kernel like this:

K(x, x') = phi(x)' * phi(x')

where the k-th component of phi is phi_k(x) = exp(-gamma * x^2) * sqrt( (2 * gamma)^k / k! ) * x^k. The implementation of this idea is:

function phi = gaussianKernel(x, Sigma2)

gamma   = 1 / (2 * Sigma2);
featDim = 10; % length of the Taylor series; the terms converge to 0, so it doesn't have to be infinite-dimensional
phi     = []; % kernel output; the dimension will be (#samples) x (featDim * 2)

    for k = 0 : (featDim - 1)

        % Gaussian kernel trick using the Taylor series expansion
        phi = [phi, exp( -gamma .* (x(:, 1)).^2) * sqrt(gamma^2 * 2^k / factorial(k)) .* x(:, 1).^k, ...
            exp( -gamma .* (x(:, 2)).^2) * sqrt(gamma^2 * 2^k / factorial(k)) .* x(:, 2).^k];
    end

end
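
One quick diagnostic (my addition, not part of the original code) is to compare phi * phi' against the exact kernel values; the two should agree closely if the feature map were right:

x       = randn(5, 2);                  % a few random 2-D samples
sigma2  = 2;
P       = gaussianKernel(x, sigma2);
Kapprox = P * P';                       % kernel implied by the feature map
D2      = bsxfun(@plus, sum(x.^2, 2), sum(x.^2, 2)') - 2 * (x * x');
Kexact  = exp(-D2 / (2 * sigma2));      % exact Gaussian kernel values
disp(max(abs(Kapprox(:) - Kexact(:))))  % a large gap exposes the bug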

*** I think my RBF implementation is wrong, but I don't know how to fix it. Please help me here.

Here is what I got as output:

[Figure 1: Samples of the classes]
[Figure 2: Marking the support vectors of the classes]
[Figure 3: Adding random test data]
[Figure 4: Classification]

Also, I implemented the homogeneous polynomial kernel K(x, x') = (dot(x, x'))^2; the code is:

function phi = quadraticKernel(x)

    % Second-order homogeneous polynomial kernel
    phi = [x(:, 1).^2, sqrt(2).*(x(:, 1).*x(:, 2)), x(:, 2).^2];

end
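
As a quick check (my addition), this explicit map reproduces the kernel exactly:

X = randn(4, 2);
P = quadraticKernel(X);
disp(max(max(abs(P * P' - (X * X').^2))))  % ~0 up to round-off error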

And I got surprisingly nice output:

[Figures: classification results with the quadratic kernel]

To sum up, the program works correctly with the homogeneous polynomial kernel, but when I use RBF it doesn't; there is something wrong with my RBF implementation.

If you know about the RBF (Gaussian) kernel, please let me know how I can make it right.

Edit: If you have the same issue, use the RBF kernel directly as defined above and don't separate it into phi.
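
For reference, a minimal sketch of what "using RBF directly" can look like (my summary, not the original code), where X1 and X2 hold one sample per row:

function K = rbfKernel(X1, X2, sigma2)
    % Exact Gaussian kernel matrix: K(i, j) = exp(-||x_i - x_j||^2 / (2 * sigma2))
    sq1 = sum(X1.^2, 2);                             % ||x_i||^2 per row
    sq2 = sum(X2.^2, 2);                             % ||x_j||^2 per row
    D2  = bsxfun(@plus, sq1, sq2') - 2 * (X1 * X2'); % pairwise squared distances
    K   = exp(-D2 / (2 * sigma2));
end

One common way to build the QP matrix from this would be k = (output * output') .* rbfKernel(input, input, sigma2), with no phi involved.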

Why do you use a hard margin? As far as I know, a hard margin often makes it easy to go wrong on a single class. By the way, have you tuned the parameters? — Jake0x32
I haven't had the experience you mention, but because of how the problem is set up, I generated sample data that is separable without error, so the support vector machine should be able to classify the classes without allowing for any error; that is why I used a hard margin. And the variance of the RBF kernel ("sigma2 = 2" in the program) is big for this application, I know, but I can't adjust this parameter. I think the problem here is due to my gaussianKernel() function. I must have implemented it wrongly, and I don't know how to correct it. — mehmet
As far as I know, when using the RBF kernel we can always try a grid search with exponentially increasing parameters over both C and gamma, i.e. a 2-D grid used to search for the best model. I am always confident in the capacity of the Gaussian kernel as long as it has good parameters. — Jake0x32

2 Answers

Answer 1:

Since the Gaussian kernel is often described as mapping to an infinite-dimensional space, I always have faith in its capacity. The problem here may be due to bad parameters; keep in mind that a grid search is always needed for SVM training. So I propose you take a look here, where you can find some tricks for parameter tuning. An exponentially increasing sequence is usually used for the candidate values.
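
As an illustration of that grid search (my sketch, assuming fitcsvm from MATLAB's Statistics and Machine Learning Toolbox rather than the quadprog setup above; X is the N x 2 data and Y the N x 1 labels):

bestLoss = inf;
for C = 2.^(-5:2:15)                      % exponentially spaced box constraints
    for g = 2.^(-15:2:3)                  % exponentially spaced gamma values
        mdl  = fitcsvm(X, Y, 'KernelFunction', 'rbf', ...
                       'BoxConstraint', C, 'KernelScale', 1 / sqrt(g));
        loss = kfoldLoss(crossval(mdl, 'KFold', 5));   % 5-fold CV error
        if loss < bestLoss
            bestLoss = loss; bestC = C; bestG = g;     % remember the best pair
        end
    end
end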

Answer 2:

Why do you want to compute phi for the Gaussian kernel? Phi is an infinite-dimensional vector, and you are truncating the terms in your Taylor series to 10 when we don't even know whether 10 terms are enough to approximate the kernel values! Usually, the kernel is computed directly instead of computing phi (and then K). See, for example, [1].

Does this mean we should never compute phi for the Gaussian kernel? Not really, but we have to be slightly smarter about it. There are recent works [2, 3] which show how to compute phi for the Gaussian kernel so that you can compute approximate kernel matrices while having only finite-dimensional phi's. In [4] I give very simple code to generate the approximate kernel using the trick from the paper. However, in my experiments I needed to generate anywhere from 100- to 10000-dimensional phi's to get a good approximation of the kernel (depending on the number of features of the original input, as well as on the rate at which the eigenvalues of the original matrix taper off).
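
To give a flavor of the trick in [2] (my own paraphrase of random Fourier features, not the exact code in [4]): draw random frequencies from the Fourier transform of the kernel, and the resulting finite-dimensional phi approximates K(x, x') = exp(-gamma * ||x - x'||^2) in expectation:

function Z = randomFourierFeatures(X, gamma, D)
    % X is n x d; returns an n x D feature map with Z * Z' ~= the exact Gram matrix
    d = size(X, 2);
    W = sqrt(2 * gamma) * randn(d, D);   % frequencies ~ N(0, 2 * gamma * I)
    b = 2 * pi * rand(1, D);             % random phases ~ Uniform[0, 2 * pi]
    Z = sqrt(2 / D) * cos(bsxfun(@plus, X * W, b));
end

The larger D is, the better Z * Z' matches the exact kernel matrix, which is consistent with the 100-to-10000 range mentioned above.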

For the moment, just use code similar to [1] to generate the Gaussian kernel and then observe the result of the SVM. Also, play around with the gamma parameter; a bad gamma can result in really bad classification.

[1] https://github.com/ssamot/causality/blob/master/matlab-code/Code/mfunc/indep/HSIC/rbf_dot.m

[2] http://www.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf

[3] http://www.eecs.berkeley.edu/~brecht/papers/08.rah.rec.nips.pdf

[4] https://github.com/aruniyer/misc/blob/master/rks.m