I am trying to build a doc2vec model, using gensim and sklearn, to perform sentiment analysis on short texts such as comments, tweets, and reviews.
I downloaded an Amazon product review dataset, a Twitter sentiment analysis dataset, and an IMDb movie review dataset.
Then I combined them into three categories: positive, negative, and neutral.
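To make the combining step concrete, here is a minimal sketch of how differently labeled sources could be mapped onto three shared categories. Everything in it is an assumption for illustration: the toy records, the star thresholds in `stars_to_label`, and the `IMDB_MAP` label names are hypothetical and not taken from the actual datasets.

```python
from collections import Counter


def stars_to_label(stars: int) -> str:
    """Map a 1-5 star rating to a coarse label (assumed thresholds)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Hypothetical toy records standing in for the three downloaded datasets.
amazon = [("great product", 5), ("it is ok", 3), ("broke in a day", 1)]
twitter = [("love this", "positive"), ("nothing special", "neutral")]
imdb = [("a masterpiece", "pos"), ("dull and slow", "neg")]

# Assumed raw label names for the movie reviews.
IMDB_MAP = {"pos": "positive", "neg": "negative"}

# One combined list of (source, text, label) tuples with a shared label set.
combined = (
    [("amazon", text, stars_to_label(stars)) for text, stars in amazon]
    + [("twitter", text, label) for text, label in twitter]
    + [("imdb", text, IMDB_MAP[label]) for text, label in imdb]
)

# Count labels per source to see whether every source covers every class.
per_source = Counter((source, label) for source, _, label in combined)
print(per_source)
```

Counting labels per source is a cheap sanity check: if one source contributes no examples of a class, a classifier may end up learning the style of the source rather than the sentiment.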
Next, I trained a gensim doc2vec model on the combined data so I could obtain the input vectors for the classifying neural net.
I then used sklearn's LinearRegression model to predict on my test data, which is about 10% of each of the three datasets above.
Unfortunately, the results were not as good as I expected. Most of the tutorials out there seem to focus on only one specific task, such as 'classify Amazon reviews only' or 'Twitter sentiment only'; I couldn't find anything more general-purpose.
Can someone share their thoughts on this?