Hi, I was trying to run the example from Mahout in Action, Chapter 7 (k-means clustering). Can somebody guide me on how to run that example on a Hadoop cluster (single-node CDH 4.2.1) with Mahout 0.7?
These are the steps I followed:
Copied the code (from GitHub) into my Eclipse IDE on my local machine.
Included these jars in my Eclipse project:
hadoop-common-2.0.0-cdh4.2.1.jar
hadoop-hdfs-2.0.0-cdh4.2.1.jar
hadoop-mapreduce-client-core-2.0.0-cdh4.2.1.jar
mahout-core-0.7-cdh4.3.0.jar
mahout-core-0.7-cdh4.3.0-job.jar
mahout-math-0.7-cdh4.3.0.jar
Made a jar of this project and copied it onto my Hadoop cluster.
Executed this command:
user@INFPH01463U:~$ hadoop jar /home/user/apurv/Kmean.jar tryout.SimpleKMeansClustering
which gave me the following error:
Exception in thread "main" java.lang.NoClassDefFoundError: FileSystem
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Class.java:2427)
at java.lang.Class.getMethod0(Class.java:2670)
at java.lang.Class.getMethod(Class.java:1603)
at org.apache.hadoop.util.RunJar.main(RunJar.java:202)
Caused by: java.lang.ClassNotFoundException: FileSystem
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 5 more
Can anyone help me figure out what I'm missing, or whether my way of executing the job is wrong?
Secondly, I would like to know how I can run k-means clustering on a CSV file.
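For the CSV part, this is roughly what I have in mind so far: a minimal sketch in plain Java that reads each row into an array of doubles. The class name CsvToPoints and its methods are my own placeholders (they are not from the book or from Mahout), and I am assuming every column is numeric, there is no header row, and no field contains a quoted comma. It does not yet turn the arrays into Mahout vectors or write them out for the clustering job, which is the part I need help with.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mahout: turns numeric CSV rows into points.
class CsvToPoints {

    // Parse one comma-separated row of numbers into a double[].
    // Assumes every field is numeric; a non-numeric field throws NumberFormatException.
    static double[] parseLine(String line) {
        String[] fields = line.split(",");
        double[] point = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            point[i] = Double.parseDouble(fields[i].trim());
        }
        return point;
    }

    // Read every non-blank row from the reader, one point per row.
    static List<double[]> readPoints(BufferedReader reader) throws IOException {
        List<double[]> points = new ArrayList<double[]>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            points.add(parseLine(line));
        }
        return points;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample standing in for a real CSV file.
        String sample = "1.0,1.0\n2.0,1.0\n\n8.0,8.0\n";
        List<double[]> points =
            readPoints(new BufferedReader(new StringReader(sample)));
        System.out.println(points.size() + " points, "
            + points.get(0).length + " dimensions");
    }
}
```

For a real file I would swap the StringReader for a FileReader on the CSV path.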
Thanks in advance :)