import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

word_vectorizer = CountVectorizer(ngram_range=(2,2), analyzer='word')
for each in (train_incidents_word_issue["Summary"].index):
    text_issue_list = [data_word_issue["Summary"][each]]
    sparse_matrix = word_vectorizer.fit_transform(text_issue_list)
    frequencies = sum(sparse_matrix).toarray()[0]
    bi_grams_issue_df = pd.DataFrame(frequencies, index=word_vectorizer.get_feature_names(), columns=['frequency'])
    data_word_issue["data_issue_count"][each] = bi_grams_issue_df[bi_grams_issue_df.index.str.contains("^data issue$")]["frequency"].sum()
I am getting the below error:
ValueError in
      5 for each in (train_incidents_word_issue["Summary"].index):
      6     text_issue_list = [data_word_issue["Summary"][each]]
----> 7     sparse_matrix = word_vectorizer.fit_transform(text_issue_list)
      8     frequencies = sum(sparse_matrix).toarray()[0]
      9     bi_grams_issue_df = pd.DataFrame(frequencies, index=word_vectorizer.get_feature_names(), columns=['frequency'])

ValueError: Empty vocabulary; perhaps the documents only contain stop words
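A plain-Python sketch of what may trigger this message: the vectorizer is fitted on one summary at a time, so any summary with fewer than two words has no bigrams at all. This does not call scikit-learn. It only mimics what CountVectorizer does by default (lowercasing, tokens of two or more word characters), and the sample summaries and the helper name word_bigrams are made up for illustration.

```python
import re

# Mimics CountVectorizer's defaults: lowercase, tokens of 2+ word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")


def word_bigrams(text):
    """Return the word bigrams in `text` as space-joined strings."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]


# Hypothetical sample summaries, not taken from the real data.
summaries = ["Data issue in report, data issue again", "Outage", ""]

for summary in summaries:
    bigrams = word_bigrams(summary)
    # An empty list here corresponds to the "Empty vocabulary" case:
    # a (2, 2) n-gram vectorizer has nothing to learn from this row alone.
    print(repr(summary), len(bigrams), bigrams.count("data issue"))
```

If this is the cause, rows whose summary is empty or a single word would fail inside the loop, while longer summaries would work.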
Please help me understand the error and a recommended solution. I have just started with Python.