Mahout: The output of the rowsimilarity job is different on each run of the steps below, even though the input is identical for every run.
Step 1: seq2sparse (create tfidf vectors from text)
Step 2: rowid (convert the tfidf vectors into a matrix with integer row IDs)
Step 3: rowsimilarity (calculate similarity between vectors)
Step 4: seqdumper (dump the binary vectors as text)
UPDATE:
Thanks, Pferrel, for the reply.
Could you suggest how we can specify the "seed value"?
The commands I am using are:
${MAHOUT_HOME}/bin/mahout seq2sparse -i ${DATA}/seq-data -o ${DATA}/vectors -n 2 -wt tfidf -ng 3 -nv -ow -md 100 -s 10
${MAHOUT_HOME}/bin/mahout rowid -i ${DATA}/vectors/tfidf-vectors/part-r-00000 -o ${DATA}/matrix
${MAHOUT_HOME}/bin/mahout rowsimilarity -i ${DATA}/matrix/matrix -o ${DATA}/similarity --similarityClassname SIMILARITY_COSINE -m 100 -ess -ow
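To tell whether two runs really produce different similarities, or only list the same records in a different order, the seqdumper text output of each run could be compared after sorting. This is a minimal sketch, not part of Mahout: the function name compare_dumps and the file names run1.txt and run2.txt are hypothetical, and it assumes that seqdumper prints each record on a line starting with "Key: ". The comparison is an exact text match, so tiny floating-point differences would still count as "different".

```shell
#!/bin/sh
# Compare two seqdumper text dumps while ignoring record order and header lines.
# Assumption: each record is printed on a line that starts with "Key: ".
compare_dumps() {
  sorted_a=$(mktemp)
  sorted_b=$(mktemp)
  # Keep only record lines, then sort so that ordering differences disappear.
  grep '^Key: ' "$1" | sort > "$sorted_a"
  grep '^Key: ' "$2" | sort > "$sorted_b"
  if cmp -s "$sorted_a" "$sorted_b"; then
    echo "identical"
    status=0
  else
    echo "different"
    status=1
  fi
  rm -f "$sorted_a" "$sorted_b"
  return "$status"
}

# Hypothetical usage, after dumping the similarity output of each run to text:
#   compare_dumps run1.txt run2.txt
```

If this reports "identical", the runs differ only in record order; if it reports "different", the similarity values or the set of retained rows changed between runs.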