I don't understand how word vectors are involved at all in the training process with gensim's doc2vec in DBOW mode (`dm=0`). I know that word-vector training is disabled by default with `dbow_words=0`. But what happens when we set `dbow_words` to 1?
In my understanding of DBOW, the context words are predicted directly from the paragraph vectors. So the only parameters of the model are the `N` `p`-dimensional paragraph vectors plus the parameters of the classifier.
But multiple sources hint that it is possible in DBOW mode to co-train word and doc vectors. For instance:
- section 5 of An Empirical Evaluation of doc2vec with Practical Insights into Document Embedding Generation
- this SO answer: How to use Gensim doc2vec with pre-trained word vectors?
So, how is this done? Any clarification would be much appreciated!
Note: for DM, the paragraph vectors are averaged/concatenated with the word vectors to predict the target words. In that case, it's clear that word vectors are trained simultaneously with document vectors. And there are `N*p + M*q + classifier` parameters (where `M` is the vocab size and `q` is the word vector space dimension).
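
To make the counting concrete, here is a rough Python sketch of the tally as I understand it. The function names are made up for illustration. I'm assuming the classifier is a plain softmax output layer with one weight vector per vocabulary word and no bias, which is a simplification (gensim actually uses negative sampling or hierarchical softmax). For DM I'm assuming the averaging variant, so `p` has to equal `q`.

```python
def dbow_param_count(n_docs, doc_dim, vocab_size):
    """Parameter count for plain DBOW (hypothetical helper).

    Assumes a bias-free softmax classifier with one doc_dim-sized
    weight vector per vocabulary word.
    """
    doc_vectors = n_docs * doc_dim        # N * p
    classifier = vocab_size * doc_dim     # assumed output layer: M * p
    return doc_vectors + classifier


def dm_mean_param_count(n_docs, vocab_size, dim):
    """Parameter count for DM with averaging (hypothetical helper).

    Averaging requires doc and word vectors to share one size,
    so p == q == dim. Same classifier assumption as above.
    """
    doc_vectors = n_docs * dim            # N * p
    word_vectors = vocab_size * dim       # M * q
    classifier = vocab_size * dim         # assumed output layer: M * p
    return doc_vectors + word_vectors + classifier


if __name__ == "__main__":
    N, M, p = 1000, 5000, 100
    print(dbow_param_count(N, p, M))      # 600000
    print(dm_mean_param_count(N, M, p))   # 1100000
```

Under these assumptions, the gap between the two counts is exactly the `M*q` word-vector table, which is the part I don't see a place for in DBOW.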