It's fairly easy to create document vectors and cluster them using Apache Mahout. Running clusterdump lets the user view the terms associated with each cluster. However, how can I identify which documents belong to each cluster?
Thanks