
I'm trying to implement a simple linear SVM binary classification in MATLAB, but I'm getting strange results.
I have two classes g = {-1; 1} defined by two predictors, varX and varY. In fact, varY alone is enough to split the dataset into two distinct classes (at about varY = 0.38), but I will keep varX as a random variable since I will need it for other work.

Using the code below (adapted from the MATLAB examples), I get a wrong classifier. The linear decision boundary should be close to a horizontal line at about varY = 0.38, as we can see by plotting the 2D points.
The line that should separate the two classes is not displayed at all. What am I doing wrong?

g(1:14,1)=1;
g(15:26,1)=-1;
m3(:,1)=rand(26,1); %varX
m3(:,2)=[0.4008; 0.3984; 0.4054; 0.4048; 0.4052; 0.4071; 0.4088; 0.4113; 0.4189;
    0.4220; 0.4265; 0.4353; 0.4361; 0.4288; 0.3458; 0.3415; 0.3528; 
    0.3481; 0.3564; 0.3374; 0.3610; 0.3241; 0.3593; 0.3434; 0.3361; 0.3201]; %varY

SVMmodel_testm =  fitcsvm(m3,g,'KernelFunction','Linear');

d = 0.005; % Step size of the grid
[x1Grid,x2Grid] = meshgrid(min(m3(:,1)):d:max(m3(:,1)),...
    min(m3(:,2)):d:max(m3(:,2)));
xGrid = [x1Grid(:),x2Grid(:)];        % The grid
[~,scores2] = predict(SVMmodel_testm,xGrid); % The scores

figure();
h(1:2)=gscatter(m3(:,1), m3(:,2), g,'br','ox');
hold on
    % Support vectors
h(3) = plot(m3(SVMmodel_testm.IsSupportVector,1),m3(SVMmodel_testm.IsSupportVector,2),'ko','MarkerSize',10);
    % Decision boundary
contour(x1Grid,x2Grid,reshape(scores2(:,1),size(x1Grid)),[0 0],'k');
xlabel('varX'); ylabel('varY'); 
set(gca,'Color',[0.5 0.5 0.5]);
hold off

1 Answer


A common problem with SVMs, or any classification method for that matter, is unnormalized data. You have one dimension that spans from 0 to 1 and another that spans only from about 0.32 to 0.44. This causes an imbalance between the features. Common practice is to normalize the features somehow, for example by dividing each one by its standard deviation. Try this code:

g(1:14,1)=1;
g(15:26,1)=-1;
m3(:,1)=rand(26,1); %varX
m3(:,2)=[0.4008; 0.3984; 0.4054; 0.4048; 0.4052; 0.4071; 0.4088; 0.4113; 0.4189;
    0.4220; 0.4265; 0.4353; 0.4361; 0.4288; 0.3458; 0.3415; 0.3528; 
    0.3481; 0.3564; 0.3374; 0.3610; 0.3241; 0.3593; 0.3434; 0.3361; 0.3201]; %varY
m3(:,2) = m3(:,2)./std(m3(:,2)); % rescale varY by its std so both features have comparable spread
SVMmodel_testm =  fitcsvm(m3,g,'KernelFunction','Linear');

Notice the second-to-last line, which rescales varY before training.
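
As an aside (not part of the original answer, just a sketch of the same idea): fitcsvm also has a built-in 'Standardize' name-value option that rescales every predictor internally, so you don't have to modify m3 by hand. Assuming m3 and g are defined as in the question and m3 still holds the original, unscaled values, something like the following should give a similarly sensible boundary, and the plotting code from the question can be reused because predict accepts data in the original units:

% Sketch: let fitcsvm standardize both predictors internally instead of
% rescaling m3 manually ('Standardize' is a documented fitcsvm option).
% Assumes m3 and g are defined as in the question, with m3 unscaled.
SVMmodel_std = fitcsvm(m3,g,'KernelFunction','linear','Standardize',true);

d = 0.005;                                   % grid step, as in the question
[x1Grid,x2Grid] = meshgrid(min(m3(:,1)):d:max(m3(:,1)),...
    min(m3(:,2)):d:max(m3(:,2)));
xGrid = [x1Grid(:),x2Grid(:)];               % grid in original units
[~,scores] = predict(SVMmodel_std,xGrid);    % model handles the rescaling itself

figure();
gscatter(m3(:,1), m3(:,2), g,'br','ox');
hold on
contour(x1Grid,x2Grid,reshape(scores(:,1),size(x1Grid)),[0 0],'k'); % decision boundary
xlabel('varX'); ylabel('varY');
hold off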