2
votes

I am coding sentiment analysis in R for movie reviews (the Cornell data set). My training data consists of 1800 rows and 1096 columns (unigrams, bigrams, and trigrams). The test data consists of 200 rows and 1096 columns (its features match those of the training data). I am trying to train a model using svmRadial from the caret package (which in turn uses the kernlab package). Training works fine, and I am also able to extract important features from the model. But when I try to predict on a new data set, I get the following warning:

In method$predict(modelFit = modelFit, newdata = newdata, submodels = param)   :
kernlab class prediction calculations failed; returning NAs

I am unable to figure out why predict fails with this test data. (I tried modeling this with naive Bayes from caret, and both training and testing work perfectly.) My code is below for reference. I would really appreciate any help. Thanks a lot!

sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

# Train the SVM model (train() and trainControl() come from caret)
library(caret)
svm.model <- train(as.factor(label) ~ .,
                   data = dtm,
                   method = "svmRadial",
                   preProc = c("center", "scale"),
                   trControl = trainControl(method = "cv", number = 5),
                   tuneLength = 8)

# Predict on the test data: gives the warning above and returns NAs
svm.pred <- predict(svm.model, testData)
1
Check that you do not have any missing data in your data frame. - phiver
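
The check suggested in this comment could be sketched roughly as follows, using base R only. The data frames `train_df` and `test_df` are toy stand-ins invented for illustration (they are not the asker's `dtm` and `testData`), and the column-mismatch check is an extra sanity check not mentioned in the thread.

```r
# Toy stand-ins (hypothetical) for the question's `dtm` and `testData`.
train_df <- data.frame(label = c("pos", "neg", "pos"),
                       good  = c(1, 0, 2),
                       bad   = c(0, 3, 0))
test_df  <- data.frame(good = c(1, NA),
                       bad  = c(0, 1))

# 1. Are there any missing values in the test data at all?
has_na <- anyNA(test_df)

# 2. Which columns contain missing values?
na_per_col <- colSums(is.na(test_df))
na_cols <- names(na_per_col[na_per_col > 0])

# 3. Predictors present in the training data but absent from the test data.
predictors <- setdiff(names(train_df), "label")
missing_cols <- setdiff(predictors, names(test_df))

print(has_na)
print(na_cols)
print(missing_cols)
```

If either `na_cols` or `missing_cols` is non-empty for the real data, that is worth fixing before looking further at the model itself.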

1 Answer

0
votes

Change the line

trControl = trainControl(method = "cv", number = 5)

to

trControl = trainControl(method = "cv", number = 5, classProbs = TRUE)
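
One thing to watch when setting classProbs = TRUE: caret expects the class levels to be valid R variable names, so labels such as "0" and "1" are rejected. The sketch below, in base R only, shows one way to check and repair the levels before training. The `labels` vector is a hypothetical example; the actual level names in the question's `label` column are not shown in the thread.

```r
# Hypothetical label vector; the real `label` column may already be fine.
labels <- factor(c("1", "0", "1"))

# A level is a valid R name if make.names() leaves it unchanged.
levels_ok <- make.names(levels(labels)) == levels(labels)

# Repair any invalid levels ("0" becomes "X0", "1" becomes "X1").
if (!all(levels_ok)) {
  levels(labels) <- make.names(levels(labels))
}

print(levels(labels))
```

If the levels are already names like "pos" and "neg", this step changes nothing.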