
I am very new to MATLAB and SVMs, and I am reproducing this experiment from scratch: http://bioinformatics.oxfordjournals.org/content/19/13/1656.full.pdf

* There they say, "In the training of SVMs, we use the method of one versus the others, or one versus the rest." OK, there are 12 classes, so they produce 12 SVMs and train each one with the positives of one class versus all the rest.

* But then they say, "The prediction performance was examined by the 5-fold cross-validation test."

My beginner question is: how can they run a k-fold cross-validation test *after* they have trained the SVMs? What I think (likely faulty) is that when you do k-fold cross-validation, you construct a new SVM from the beginning in every loop; the models may be similar, but the SVM model is different in each fold, so you end up with k different SVM models. But if they train the SVM models beforehand, how can they run a cross-validation test? What am I missing? Please help, and thank you very much.


1 Answer


First they split the data into cross-validation folds. Then they train a fresh model for each fold and test it on that fold's held-out data, so five models are trained and tested in total. You can do this as follows:

% I assume LIBSVM's svmtrain/svmpredict for SVM training and testing in this
% snippet, and crossvalind from the Bioinformatics Toolbox for the fold split.
% Create some random data: 1000 samples, 10 features, 12 class labels
data = 1 + 2*randn(1000,10);
labels = randi(12,[1000,1]);

% assign each sample to one of 5 cross-validation folds
ind = crossvalind('Kfold',labels,5);

for i = 1:5
    % (4/5)th of the data (every fold except fold i) for training
    trainingData = data(ind~=i,:);    % notice ~=
    trainingLabels = labels(ind~=i);

    % the remaining (1/5)th (fold i) for testing
    testingData = data(ind==i,:);     % notice ==
    testingLabels = labels(ind==i);

    % train a new SVM from scratch on this fold's training data
    model{i,1} = svmtrain(trainingLabels,trainingData);

    % test it on the held-out fold; svmpredict's second output is the vector
    % [accuracy; mean squared error; squared correlation], so keep its first entry
    [predLabels{i,1},acc,~] = svmpredict(testingLabels,testingData,model{i,1});
    accuracy(i,1) = acc(1);
end

% I think this is what they mean when they say the performance is analysed
% using 5-fold cross-validation.

% The following two quantities are what you would report:
plot(accuracy);                % how accuracy varies across the 5 folds
avgAccuracy = mean(accuracy);  % average accuracy over the 5 folds
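
One more note: the snippet above lets LIBSVM handle the 12 classes with its own built-in multi-class strategy (which is one versus one), while the paper trains one binary SVM per class, i.e. one versus the rest. If you want to mirror the paper more closely, here is a rough sketch of what that could look like inside a single fold; the variable names (ovrModel, decVals, ...) and the default kernel/parameters are my own assumptions, not something taken from the paper:

% one-versus-rest sketch for a single fold (uses trainingData/trainingLabels
% and testingData/testingLabels from the loop above)
nClasses = 12;

% train one binary SVM per class: current class = 1, everything else = 0
for c = 1:nClasses
    binTrainLabels = double(trainingLabels == c);
    ovrModel{c} = svmtrain(binTrainLabels, trainingData, '-q');
end

% collect the decision value of every binary SVM on the test fold and
% assign each test sample to the class with the largest decision value
decVals = zeros(size(testingData,1), nClasses);
for c = 1:nClasses
    binTestLabels = double(testingLabels == c);
    [~,~,dv] = svmpredict(binTestLabels, testingData, ovrModel{c});
    % LIBSVM orients decision values towards the first label it saw during
    % training, so flip the sign if that first label was the negative class (0)
    if ovrModel{c}.Label(1) == 0
        dv = -dv;
    end
    decVals(:,c) = dv;
end
[~,predicted] = max(decVals,[],2);
foldAccuracy = mean(predicted == testingLabels);

Using probability outputs ('-b 1') instead of raw decision values would also work; the important point for your question is that all 12 one-vs-rest SVMs are retrained from scratch in every fold.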