I'm not familiar with any reasoning or research precedent which implies that either unit-normalized or non-normalized document-vectors are better for clustering.
So, I'd try both to see which seems to work better for your purposes.
Other thoughts:
In Word2Vec, my general impression is that larger-magnitude word-vectors are associated with words that, in the training data, have more unambiguous meaning. (That is, they reliably tend to imply the same smaller set of neighboring words.) Meanwhile, words with multiple meanings (polysemy) and usage amongst many other diverse words tend to have lower-magnitude vectors.
Still, the common way of comparing such vectors, cosine-similarity, is oblivious to magnitudes. That's likely because most comparisons just need the best sense of a word, without any more subtle indicator of "unity of meaning".
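To make the "oblivious to magnitudes" point concrete, here's a minimal pure-Python sketch. The vectors are made-up toy values, not from any trained model, and the helper names `norm` and `cosine_similarity` are my own for illustration:

```python
import math

def norm(v):
    """Euclidean (L2) length of a vector given as a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; depends on direction only."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm(a) * norm(b))

# Made-up toy vectors, not taken from any trained model.
a = [0.2, -0.5, 0.1]
b = [0.3, -0.4, 0.4]
a_scaled = [10 * x for x in a]  # same direction as a, 10x the magnitude

print(cosine_similarity(a, b))
print(cosine_similarity(a_scaled, b))  # same value, up to float rounding
```

Scaling a vector changes its norm but not its cosine-similarity to anything else, so whatever information the magnitude carries is discarded by that comparison.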
A similar effect might be present in Doc2Vec vectors: lower-magnitude doc-vectors could be a hint that the document has more broad word-usage/subject-matter, while higher-magnitude doc-vectors suggest more focused documents. (I'd similarly have the hunch that longer documents may tend to have lower-magnitude doc-vectors, because they use a greater diversity of words, whereas small documents with a narrow set of words/topics may have higher-magnitude doc-vectors. But I have not specifically observed/tested this hunch, and any effect here could be heavily influenced by other training choices, like the number of training iterations.)
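If you wanted to check that length-vs-magnitude hunch on your own model, one rough approach is to correlate each document's token count with its doc-vector norm. The sketch below assumes you've already collected `(token_count, doc_vector)` pairs yourself (with gensim 4.x, the trained vectors should be reachable via `model.dv`). The function names are hypothetical, and the sample data is a hand-built placeholder that only shows the call shape; it is not a measurement:

```python
import math

def norm(v):
    """Euclidean (L2) length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def length_vs_magnitude(docs):
    """docs: iterable of (token_count, doc_vector) pairs.

    Returns the correlation between document length and vector norm.
    """
    docs = list(docs)
    lengths = [count for count, _ in docs]
    norms = [norm(vec) for _, vec in docs]
    return pearson(lengths, norms)

# Placeholder pairs, deliberately constructed so longer docs have smaller
# norms. Replace with pairs gathered from your own corpus and model.
placeholder_docs = [
    (20, [3.0, 4.0]),
    (200, [1.5, 2.0]),
    (2000, [0.6, 0.8]),
]
print(length_vs_magnitude(placeholder_docs))
```

A clearly negative value on real data would be consistent with the hunch, though it wouldn't separate the effect of length from other training choices.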
Thus, it's possible that the non-normalized vectors would be interesting for some clustering goals, like separating focused documents from more general documents. So again, after this longer analysis: I'd suggest trying it both ways to see if one or the other seems to work better for your specific needs.
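As a small illustration of why the choice can matter for distance-based clustering, here's a sketch using Euclidean distance on made-up 2-d vectors (the helper names `unit_normalize` and `nearest` are mine). Vectors `a` and `b` point the same way but differ in magnitude, while `c` points elsewhere but is close to `a` in raw space:

```python
import math

def unit_normalize(v):
    """Scale v to length 1, keeping its direction."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def nearest(name, vectors):
    """Name of the vector closest to vectors[name] by Euclidean distance."""
    target = vectors[name]
    others = (k for k in vectors if k != name)
    return min(others, key=lambda k: math.dist(target, vectors[k]))

# Made-up toy vectors, not from any trained model.
raw = {
    "a": [1.0, 0.0],
    "b": [10.0, 0.0],  # same direction as a, much larger magnitude
    "c": [1.0, 1.0],   # different direction, similar magnitude to a
}
normalized = {k: unit_normalize(v) for k, v in raw.items()}

print(nearest("a", raw))         # c
print(nearest("a", normalized))  # b
```

With raw vectors, magnitude differences dominate and `a` groups with `c`; after unit-normalization only direction matters and `a` groups with `b`. An algorithm like k-means, which works on Euclidean distances, could therefore produce different clusters from the two versions of the same doc-vectors.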