0
votes

I have implemented a KNN classifier in Java and I am getting a strange result. If I do sentiment analysis on a dataset, for example Amazon book reviews, I get 55% precision: out of 100 test documents, 55 are correctly classified as a negative or positive review and 45 are classified incorrectly. But if I use the same KNN for category classification, for example camera vs. books, I get 95% precision.

Is there an explanation for this, or is my code wrong? Any ideas?

1
Apples to oranges? Are you comparing KNN's performance at sentiment analysis with KNN's performance at categorization? You'd be using radically different features in those cases... it's not the algorithm's fault if those aren't working well. - Chris Pfohl
@Christopher Pfohl yes, I am comparing KNN's performance in categorization and in sentiment analysis. What do you mean by radically different features? I have used stemming and stopword removal. - flatronka
Thanks @gary, but I just need some theory. My code is huge, more than 15 classes and interfaces; I only need to know whether, in theory, this result is possible or not. - flatronka
Any machine learning task is highly dependent on the data and the features used. Categorization and sentiment analysis are different tasks, so different features will be needed. - Chris Pfohl

1 Answer

3
votes

@Christopher Pfohl is right. These are different tasks, and the key difference for you is that sentiment analysis (based on a simple Bag of Words) is, in general, much harder than category classification in your case.
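To illustrate why a simple Bag of Words can separate categories well but struggle with sentiment, here is a minimal sketch. The stopword list, the class name `BagOfWordsDemo`, and the three sample sentences are all made up for illustration and are not taken from your code. It assumes the stopword list contains the negation "not", which is the case for some commonly used lists.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class BagOfWordsDemo {
    // Hypothetical stopword list; assumed to contain the negation "not".
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("this", "is", "a", "the", "not"));

    // Builds a word -> count map after lowercasing and stopword removal.
    static Map<String, Integer> bagOfWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty() || STOPWORDS.contains(token)) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Opposite sentiment, same category.
        System.out.println(bagOfWords("This book is good"));
        System.out.println(bagOfWords("This book is not good"));
        // Same sentiment, different category.
        System.out.println(bagOfWords("This camera is good"));
    }
}
```

Under these assumptions the positive and the negative book review end up with identical feature vectors, so no distance-based classifier can tell them apart, while the category word ("book" vs. "camera") survives and makes categorization easy.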

Btw, just one clarification: 55% is not precision, it is accuracy, i.e. the share of all test documents that were classified correctly. (More info: http://en.wikipedia.org/wiki/Accuracy_and_precision#In_binary_classification)
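A small sketch of the difference, treating "positive review" as the positive class. The confusion-matrix counts below are hypothetical: the question only says that 55 of 100 documents were correct, so the split into true/false positives and negatives is invented for illustration, as is the class name `ClassificationMetrics`.

```java
public class ClassificationMetrics {
    // Accuracy: correctly classified documents divided by all documents.
    static double accuracy(int tp, int tn, int fp, int fn) {
        return (double) (tp + tn) / (tp + tn + fp + fn);
    }

    // Precision: of everything predicted positive, the share that really is positive.
    static double precision(int tp, int fp) {
        return (double) tp / (tp + fp);
    }

    public static void main(String[] args) {
        // Hypothetical split of 100 test documents with 55 correct (tp + tn = 55).
        int tp = 30, tn = 25, fp = 20, fn = 25;
        System.out.println("accuracy  = " + accuracy(tp, tn, fp, fn));
        System.out.println("precision = " + precision(tp, fp));
    }
}
```

With this made-up split, accuracy is 55/100 while precision is 30/50, so the two numbers only coincide by chance.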