I am hoping to assign each document to one topic using LDA. Now I realise that what you get is a distribution over topics from LDA. However as you see from the last line below I assign it to the most probable topic.
My question is this. I have to run lda[corpus]
for somewhat the second time in order to get these topics. Is there some other builtin gensim function that will give me this topic assignment vectors directly? Especially since the LDA algorithm has passed through the documents it might have saved these topic assignments?
# Get the Dictionary and BoW of the corpus after some stemming/ cleansing
texts = [[stem(word) for word in document.split() if word not in STOPWORDS] for document in cleanDF.text.values]
dictionary = corpora.Dictionary(texts)
dictionary.filter_extremes(no_below=5, no_above=0.9)
corpus = [dictionary.doc2bow(text) for text in texts]
# The actual LDA component
lda = models.LdaMulticore(corpus=corpus, id2word=dictionary, num_topics=30, chunksize=10000, passes=10,workers=4)
# Assign each document to most prevalent topic
lda_topic_assignment = [max(p,key=lambda item: item[1]) for p in lda[corpus]]