2
votes

I am trying to use the naiveBayes function from the e1071 package on a training set and a test set. I am following this helpful tutorial: https://rpubs.com/riazakhan94/naive_bayes_classifier_e1071

However, it is not working, and this is the error I am getting: "Error in table(train$Class, trainPred) : all arguments must have the same length"

Here is the code that I am using; I am guessing it's a super simple fix. The x and y columns of the data set are the predictors for the Class column:

https://github.com/samuelc12359/NaiveBayes.git


library(e1071)  # provides naiveBayes()

test <- read.csv(file="TestX.csv",header=FALSE)
train <- read.csv(file="TrainX.csv",header=FALSE)

Names <- c("x","y","Class")
colnames(test)<- Names
colnames(train)<- Names

NBclassfier=naiveBayes(Class~x+y, data=train)
print(NBclassfier)


trainPred=predict(NBclassfier,train, type="class")
trainTable=table(train$Class, trainPred)
testPred=predict(NBclassfier, newdata=test, type="class")
testTable=table(test$Class, testPred)
print(trainTable)
print(testTable)


1 Answer

2
votes

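You need to convert the Class column to a factor. The length mismatch reported by table() suggests that, with a numeric Class, predict(..., type="class") is not returning one label per row. For example: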

train$Class = factor(train$Class)
test$Class = factor(test$Class)

Then naiveBayes() will treat Class as categorical when training, and predict() will later return one class label per row, as you expect.
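To make the factor fix concrete, here is a minimal sketch that can be run without the CSV files. The synthetic data frame is an assumption standing in for TrainX.csv (two numeric predictors and a 0/1 label), and the name `fit` is just illustrative; only `naiveBayes()` and `predict()` come from e1071.

```r
library(e1071)  # provides naiveBayes()

set.seed(1)
n <- 100
# Synthetic stand-in for TrainX.csv (assumption): two numeric
# predictors and a numeric 0/1 label, as in the question
train <- data.frame(
  x = c(rnorm(n, mean = 0), rnorm(n, mean = 3)),
  y = c(rnorm(n, mean = 0), rnorm(n, mean = 3)),
  Class = rep(c(0, 1), each = n)
)

# Convert the numeric label to a factor before fitting
train$Class <- factor(train$Class)

fit <- naiveBayes(Class ~ x + y, data = train)
trainPred <- predict(fit, train, type = "class")

# Both arguments now have one entry per row, so table() can cross-tabulate them
trainTable <- table(Actual = train$Class, Predicted = trainPred)
print(trainTable)
```

If the lengths of `train$Class` and `trainPred` still differ after the conversion, the problem is probably elsewhere, for example in how the CSV columns were read.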

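Alternatively, you can change the prediction type to "raw", which returns class probabilities, and threshold them yourself. This assumes two classes coded 0 and 1, with the second column holding the probability of class 1. For example: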

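# Class probabilities for the training set, one row per observation
train_predictions = predict(NBclassfier,train, type="raw")
# Predict class 1 when its probability (second column) is at least 0.5
trainPred = 1 * (train_predictions[, 2] >= 0.5 )
trainTable=table(train$Class, trainPred)
# Same procedure for the test set
test_predictions = predict(NBclassfier, newdata=test, type="raw")
testPred = 1 * (test_predictions[, 2] >= 0.5 )
testTable=table(test$Class, testPred)
print(trainTable)
print(testTable)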
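The thresholding above relies on there being exactly two classes in a known column order. A hedged variation, assuming Class has already been converted to a factor so that the columns of the "raw" output are named after the class levels, is to pick the most probable column by name. The synthetic data and the name `fit` are illustrative assumptions, not part of the original code.

```r
library(e1071)  # provides naiveBayes()

set.seed(42)
n <- 50
# Synthetic stand-in for the CSV data (assumption)
train <- data.frame(
  x = c(rnorm(n, mean = 0), rnorm(n, mean = 3)),
  y = c(rnorm(n, mean = 0), rnorm(n, mean = 3)),
  Class = factor(rep(c(0, 1), each = n))
)

fit <- naiveBayes(Class ~ x + y, data = train)

# One row per observation, one column per class
probs <- predict(fit, train, type = "raw")

# Choose the column with the highest probability in each row and
# map it back to the class label via the column names
best <- max.col(probs, ties.method = "first")
trainPred <- factor(colnames(probs)[best], levels = levels(train$Class))

print(table(Actual = train$Class, Predicted = trainPred))
```

This does the same job as `type="class"` but leaves the probabilities available, which is useful if a cutoff other than the most probable class is wanted.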