I have built a Naive Bayes classifier that predicts a politician's party from the text of their tweets, using sklearn's MultinomialNB.
Here is my code:
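from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# senator_tweets is a pandas DataFrame with 'text' and 'party' columns
Senators_Vectorizer = CountVectorizer(decode_error='replace')
senator_counts = Senators_Vectorizer.fit_transform(senator_tweets['text'].values)
senator_targets = senator_tweets['party'].values
senator_counts_train, senator_counts_test, senator_targets_train, senator_targets_test = train_test_split(senator_counts, senator_targets, test_size=.1)
senator_party_clf = MultinomialNB()
senator_party_clf.fit(senator_counts_train, senator_targets_train)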
How do I find the words that the Naive Bayes classifier is using to make its predictions? Is there a way to find which words have the highest probability of appearing in Democrats'/Republicans' tweets?
I want the per-party probability of each word in the Senators_Vectorizer vocabulary, not the probability of a specific tweet being from a specific party.
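One possible approach, sketched under assumptions rather than offered as a definitive answer: a fitted MultinomialNB stores the smoothed log P(word | class) in its feature_log_prob_ attribute, with one row per entry of classes_ and one column per vocabulary index of the vectorizer. The toy texts and labels and the helper name top_words below are hypothetical stand-ins for the real tweet data.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for senator_tweets['text'] and ['party'].
texts = [
    "healthcare climate healthcare",
    "climate justice healthcare",
    "taxes border taxes",
    "border freedom taxes",
]
parties = ["D", "D", "R", "R"]

vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(texts)
clf = MultinomialNB().fit(counts, parties)

# Column j of the count matrix is the word whose vocabulary_ index is j.
words = np.array(sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get))

def top_words(clf, words, n=3):
    """Return {class_label: [(word, P(word | class)), ...]} for the n most likely words."""
    result = {}
    for row, label in enumerate(clf.classes_):
        # feature_log_prob_ holds log P(word | class); exponentiate to get probabilities.
        probs = np.exp(clf.feature_log_prob_[row])
        order = np.argsort(probs)[::-1][:n]
        result[label] = [(words[i], float(probs[i])) for i in order]
    return result

top = top_words(clf, words)
for label, pairs in top.items():
    print(label, pairs)
```

With the real model, the same helper would presumably be called as top_words(senator_party_clf, words) with words built from Senators_Vectorizer.vocabulary_. Note that raw P(word | class) tends to be dominated by words that are common in every class, so the difference between the two rows of feature_log_prob_ (a log ratio) may be a better guide to which words separate the parties.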