3
votes

I'm using an SVM for classification. I have divided my data set into two CSV files: a training set (70% of the data) and a testing set (30% of the data). When I call predict on the training set I get an answer, but on the testing set it throws an error. I'm using the e1071 package.

The program is as follows:

library(e1071)

Train <- read.csv("Train.csv")
Test <- read.csv("Test.csv")

# Predictors and labels for the training set
x_Train <- subset(Train, select = -Class)
y_Train <- Train$Class

model <- svm(Class ~ ., data = Train)

pred <- predict(model, x_Train)  # works well
table(pred, y_Train)

# Predictors and labels for the test set
x_Test <- subset(Test, select = -Class)
y_Test <- Test$Class

pred <- predict(model, x_Test)  # raises the error below

Error in scale.default(newdata[, object$scaled, drop = FALSE], center = object$x.scale$"scaled:center",  : 
  length of 'center' must equal the number of columns of 'x'

Could you please help me figure out what the problem could be?

4
In your example, you created an x_Test object, but you predicted on an x_test object. Capitalization makes a difference. - Jot eN
Even with that change I'm still getting an error: Error in scale.default(newdata[, object$scaled, drop = FALSE], center = object$x.scale$"scaled:center", : length of 'center' must equal the number of columns of 'x' - Sumit Waghmare
Without the data it is hard to find out what's wrong. - Jot eN

4 Answers

2
votes

OK, for those of you who, like me, had this error and found that none of these solutions worked: what I did was increase the size of the test set marginally, and it worked like a charm. The first time I got the error I had split the two sets 80-20; I then tried 75-25 and it worked just fine. I can't be sure why, but it worked.

1
votes

Remove the missing data from your test data, or add na.action = na.omit when fitting the model. Alternatively, you can use na.action = na.exclude:

model <- svm(Class ~ ., data = Train, na.action = na.exclude)

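
The first suggestion (removing missing data from the test set) can be sketched in base R as follows. The small `Test` data frame here is made up for illustration and stands in for the asker's `Test.csv`; whether missing values are actually the cause in a given case is an assumption worth checking first.

```r
# Hypothetical test set with one missing predictor value (illustration only)
Test <- data.frame(
  a = c(1, NA, 3),
  b = c(4, 5, 6),
  Class = factor(c("x", "y", "x"))
)

# Count missing values per column to see whether NAs are present at all
na_per_column <- colSums(is.na(Test))

# Keep only the rows that have no missing values
Test_complete <- Test[complete.cases(Test), ]

# Predictors and labels are then taken from the cleaned test set
x_Test <- subset(Test_complete, select = -Class)
y_Test <- Test_complete$Class
```

After this step, `x_Test` and `y_Test` have the same number of rows, so the predictions can still be tabulated against the true labels.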
1
votes

If the class of a predictor in the train set is not the same as the class of that same variable in the test set, then you will run into this issue.

For example, if you trained a model with predictor variable x with class(x) = numeric and in the test set class(x) = character then you should convert x to numeric before predicting:

data$x <- as.numeric(data$x)

That said, the mismatch is not limited to character versus numeric; it could involve any class, for example a factor variable.
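
One way to look for such a mismatch is to compare the column classes of the two data frames side by side. This is a minimal sketch in base R; the `Train` and `Test` data frames and their columns `x` and `g` are invented for illustration.

```r
# Hypothetical data: the same columns, but read in with different classes
Train <- data.frame(x = c(1.5, 2.5, 3.5), g = factor(c("a", "b", "a")))
Test <- data.frame(x = c("1.5", "2.5", "3.5"), g = c("a", "b", "a"),
                   stringsAsFactors = FALSE)

# First class of every column, taken in the training set's column order
train_cls <- sapply(Train, function(col) class(col)[1])
test_cls <- sapply(Test[names(Train)], function(col) class(col)[1])

# Names of the columns whose class differs between the two sets
mismatched <- names(train_cls)[train_cls != test_cls]

# Convert the test columns so they match the training set
Test$x <- as.numeric(Test$x)
Test$g <- factor(Test$g, levels = levels(Train$g))  # reuse the training levels
```

Note that calling `as.numeric()` directly on a factor returns its internal integer codes, so a factor holding numbers is usually converted with `as.numeric(as.character(f))` instead.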

1
votes

This happens because the fitted model stores scaling parameters for each variable it was trained on, and those scaled variables don't match the variables in "newdata".

Assume that you trained the SVM model on 5 variables, PC2 through PC6:

svm_model$x.scale
$`scaled:center`
          PC2           PC3           PC4           PC5           PC6           
 5.445380e-16  2.507442e-16 -7.655441e-16 -5.730488e-16 -3.283584e-16 

$`scaled:scale`
      PC2       PC3       PC4       PC5       PC6       
17.774403 13.571134  7.911114  6.541206  3.608903  

If the number of variables to scale in your newdata differs from 5 (for example, is greater than 5), you'll get this error. In your case, x_Test <- subset(Test,select=-Class) most likely leaves a different number of variables to scale.
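
The comparison described above can be sketched by checking the names stored in the scaling parameters against the column names of the new data. To keep the example self-contained, `x_scale` below is a hand-built stand-in for `svm_model$x.scale` (with rounded values), and `newdata` is invented; the sketch also assumes all predictors are numeric, since factor predictors are expanded into differently named columns.

```r
# Stand-in for svm_model$x.scale (hypothetical, rounded values)
x_scale <- list(
  "scaled:center" = c(PC2 = 0, PC3 = 0, PC4 = 0, PC5 = 0, PC6 = 0),
  "scaled:scale" = c(PC2 = 17.77, PC3 = 13.57, PC4 = 7.91, PC5 = 6.54, PC6 = 3.61)
)

# Hypothetical new data carrying one extra column, PC1
newdata <- data.frame(PC1 = 1:3, PC2 = 1:3, PC3 = 1:3,
                      PC4 = 1:3, PC5 = 1:3, PC6 = 1:3)

# Variables the model expects to scale
expected <- names(x_scale[["scaled:center"]])

# Columns present in newdata but unknown to the model, and the reverse
extra_cols <- setdiff(names(newdata), expected)
missing_cols <- setdiff(expected, names(newdata))

# Keep only the expected columns, in the order used during training
newdata_fixed <- newdata[, expected, drop = FALSE]
```

If `extra_cols` or `missing_cols` is non-empty for a real model and test set, that difference is a likely source of the "length of 'center'" error.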