I tried to follow many online tutorials to run kmeans example present in Mahout. But did not succeed yet to get meaningful output. The main problem I am facing is, the conversion from text file to sequencefile and back.
When I followed the steps of "Mahout Wiki" for "Clustering of synthetic control data" (https://cwiki.apache.org/MAHOUT/clustering-of-synthetic-control-data.html) I could run the clustering process (using $MAHOUT_HOME/bin/mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job) and that created some readable console output. But I wish to get output files (as the size is large) from the clustering process. The output files which were generated by Mahout clustering are all sequence file and I cant convert them to readable files. When I tried to do "clusterdump" ($MAHOUT_HOME/bin/mahout clusterdump --seqFileDir output/clusters-10...) I got errors. First it complains that "seqFileDir" option is unexpected and I guess either there is no "seqFileDir" for clusterdump or I am missing something.
Trying to use Mahout in the way of "mahout in action" seems tricky. I am not sure what are the required classes ("import ??") to compile that code.
Can you please suggest me the steps to successfully RUN kmeans on Mahout ? Specially how to get readable output from sequence files ?