I am trying to perform a two-class classification using SVM in MATLAB. The two classes are 'Normal' and 'Infected', and the goal is to classify each cell image as one or the other.
I use a training set of 1000 Normal cell images and 300 Infected cell images, and I extract 72 features from each cell. So my training feature matrix is 72x1300, where each row represents a feature and each column holds the feature values measured from the corresponding image.
data: 72x1300 double
My class label vector is initialized as:
cellLabel(1:1000) = {'normal'};
cellLabel(1001:1300) = {'infected'};
As suggested in these links: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf and svm scaling input values, I set about scaling the feature values like this:
for i=1:1:size(data,1)                          % i runs over the 72 rows of data
    mu(i) = mean(data(:,i));                    % mean of column i of data
    sd(i) = std(data(:,i));                     % standard deviation of column i
    scaledData(:,i) = (data(:,i) - mu(i))./sd(i);
end
For testing, I read a test image and compute a 72x1 feature vector. Before classifying, I scale the test vector using the corresponding mean and standard deviation values from `data`. If I do this, I get a 0% training accuracy. However, if I scale the data from each class separately and then concatenate, I get a 98% training accuracy. Can someone explain whether my method is correct? When measuring training accuracy, I knew which class each image belonged to and so could look up the matching mean and SD values. How should I do this when the image's label is unknown?
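To make the unlabeled case concrete, here is a minimal sketch of what I understand "scale with the training statistics" to mean. It assumes features are the rows of `data`, as described above, so the statistics are taken along dimension 2 and pooled over both classes. The random matrices are only stand-ins for my real features, and `testVec` is a hypothetical 72x1 test vector.

```matlab
% Synthetic stand-ins for the real features (assumption: rows = features).
data    = randn(72, 1300) * 3 + 5;   % training features, 72x1300
testVec = randn(72, 1) * 3 + 5;      % one unlabeled test image, 72x1

% Per-feature statistics over all training images, both classes pooled.
mu = mean(data, 2);                  % 72x1, one mean per feature
sd = std(data, 0, 2);                % 72x1, one SD per feature
sd(sd == 0) = 1;                     % guard against constant features

% Standardize every training column with the per-feature statistics.
scaledData = bsxfun(@rdivide, bsxfun(@minus, data, mu), sd);

% Scale the test vector with the same training statistics; no label needed.
scaledTest = (testVec - mu) ./ sd;
```

Because `mu` and `sd` are computed once from the whole training set, nothing in this version depends on knowing the class of the test image.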
This is how I train:
[idx,z] = rankfeatures(data,cellLabel,'Criterion','wilcoxon','NumberOfIndices',7);
rnkData = data(idx,:);
rnkData = rnkData';
cellLabel = cellLabel';
SVMModel = fitcsvm(rnkData,cellLabel,'Standardize',true,'KernelFunction','RBF','KernelScale','auto');
As you can see, I tried using the built-in standardization option, but the classifier tends to predict the 'normal' class irrespective of the input.
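For completeness, this is a sketch of how I believe a test image should be classified with this model. As far as I understand the `fitcsvm` documentation, with 'Standardize',true the model stores the training means and standard deviations (`Mu` and `Sigma`) and applies them inside `predict`, so the test vector is passed unscaled. The data below is synthetic, and `testRow` is a hypothetical stand-in for a test image after selecting the same 7 ranked features and transposing to a row.

```matlab
% Synthetic stand-ins (assumption): 7 selected features, two shifted classes.
nNormal = 1000;
nInfected = 300;
rnkData = [randn(nNormal, 7); randn(nInfected, 7) + 2];   % 1300x7, rows = images
cellLabel = [repmat({'normal'}, nNormal, 1); ...
             repmat({'infected'}, nInfected, 1)];          % 1300x1 labels

% Same training call as above.
SVMModel = fitcsvm(rnkData, cellLabel, 'Standardize', true, ...
    'KernelFunction', 'RBF', 'KernelScale', 'auto');

% The model keeps the training statistics used for standardization.
nStoredMeans = numel(SVMModel.Mu);      % one mean per predictor
nStoredSDs   = numel(SVMModel.Sigma);   % one SD per predictor

% Hypothetical raw test image, already reduced to the same 7 features
% (data(idx,:) in my real code) and laid out as a 1x7 row. Not scaled by hand.
testRow = randn(1, 7) + 2;
predictedLabel = predict(SVMModel, testRow);   % 1x1 cell with the class name
```

In my real code the row passed to `predict` would be `testVec(idx)'`, with `idx` taken from the `rankfeatures` call above.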