I trained two doc2vec models on two different datasets.
The first dataset contains 2,400 documents; the second contains 3,000 documents, including all of the documents from the first dataset.
For example:
dataset 1 = doc1, doc2, ... doc2400
dataset 2 = doc1, doc2, ... doc2400, doc2401, ... doc3000
I expected both doc2vec models to return the same similarity score between doc1 and doc2, but they returned different scores.
Does a doc2vec model's result depend on the training dataset, even when both datasets include the same documents?