I was trying to reproduce the example on page 10 of the libsvm guide "A Practical Guide to Support Vector Classification". The data file I used, "train.2", can be downloaded from http://www.csie.ntu.edu.tw/~cjlin/papers/guide/data/.
In order to parse the data and test the classification accuracy, I wrote the following code:
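library(e1071)
rm(list = ls(all = TRUE))
root <- "C:/Users/administrator/Documents/RProjects/libsvm"
bioDataFile <- sprintf("%s/data/train.2", root)
bioData <- read.delim(bioDataFile, header = FALSE, sep = " ", stringsAsFactors = FALSE)
# Drop columns 2, 3 and the last one, keeping the label and the feature columns
bioData <- bioData[, c(-2, -3, -ncol(bioData))]
# Turn each row of "index:value" strings into a label plus numeric features
bioData <- lapply(1:nrow(bioData), function(n) {
  reformData <- bioData[n, -1, drop = FALSE]
  reformData <- sapply(1:ncol(reformData), function(m) {
    # Keep only the value after the colon in "index:value"
    as.numeric(unlist(strsplit(reformData[, m], ":"))[2])
  })
  data.frame(Type = factor(bioData[n, 1]), t(reformData))
})
# Stack the per-row data frames into a single data frame
bioData <- do.call("rbind", bioData)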
Then I performed the test:
bioData.model <- svm(Type~., data=bioData, cross=5)
However, I ran into two problems: 1. I could not reproduce the results shown in the guide; 2. the k-fold cross-validation accuracy (either mean(bioData.model$accuracies) or bioData.model$tot.accuracy) is different each time I run the command.
I ran the same test with the svm-train.exe provided in the libsvm package. It produced the same results as shown in the guide, and no matter how many times I ran it, it always gave the same k-fold cross-validation accuracy.
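The run-to-run variation in point 2 can be checked with a minimal sketch like the one below. It assumes that e1071 draws its cross-validation folds from R's random number generator, so that fixing the seed with set.seed() before each call makes the folds repeat. The built-in iris data set is only a stand-in for train.2, and run_cv is a hypothetical helper, not part of e1071.

```r
library(e1071)

# Hypothetical helper: fix the RNG state, then run 5-fold cross-validation.
# iris is a stand-in for the train.2 data (assumption for illustration only).
run_cv <- function(seed) {
  set.seed(seed)  # fix R's RNG state before the folds are drawn
  model <- svm(Species ~ ., data = iris, cross = 5)
  model$tot.accuracy  # overall cross-validation accuracy in percent
}

acc_a <- run_cv(1)
acc_b <- run_cv(1)

# TRUE here would suggest the variation comes from random fold assignment
print(identical(acc_a, acc_b))
```

This only addresses the changing accuracy between runs; it says nothing about why the numbers differ from those in the guide.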
Can anyone tell me why? Any help would be much appreciated.