I've been trying to figure out how to execute my Map/Reduce job for almost 2 days now. I keep getting a ClassNotFound exception.

I've installed a Hadoop cluster on Ubuntu using Cloudera CDH4.3.0. The .java file (DemoJob.java, which is not inside any package) is in a folder called inputs, and all the required jar files are in inputs/lib.

I followed http://www.cloudera.com/content/cloudera-content/cloudera-docs/HadoopTutorial/CDH4/Hadoop-Tutorial/ht_topic_5_2.html for reference.

  1. I compile the .java file using:

    javac -cp "inputs/lib/hadoop-common.jar:inputs/lib/hadoop-map-reduce-core.jar" -d Demo inputs/DemoJob.java 
    

    (The link says -cp should be "/usr/lib/hadoop/:/usr/lib/hadoop/client-0.20/", but I don't have those folders on my system at all.)

  2. Create jar file using:

    jar cvf Demo.jar Demo
    
  3. Move the 2 input files to HDFS, roughly as sketched after this list. (Now this is where I'm confused: do I need to move the jar file to HDFS as well? The link doesn't say so. But if the jar is not in HDFS, then how does the hadoop jar ... command work? I mean, how does it combine the jar file, which is on the Linux filesystem, with the input files, which are in HDFS?)

  4. I run my code using:

    hadoop jar Demo.jar DemoJob /Inputs/Text1.txt /Inputs/Text2.txt /Outputs
    
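For step 3, the upload itself looked roughly like the following (the local file names here are just illustrative; the HDFS paths are the ones passed to the job in step 4):

    # Create the HDFS input directory and copy the two text files into it.
    # The local file names below are assumptions, not the exact ones I used.
    hadoop fs -mkdir /Inputs
    hadoop fs -put Text1.txt /Inputs/Text1.txt
    hadoop fs -put Text2.txt /Inputs/Text2.txt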

I keep getting ClassNotFoundException: DemoJob.

Somebody please help.

2 Answers

The class not found exception only means that some class wasn't found while the class DemoJob was being loaded. The missing class could be one that DemoJob references (imports, for example). I think the problem is that you don't have the /usr/lib/hadoop/:/usr/lib/hadoop/client-0.20/ folders (classes) on your classpath; the classes that should be there but aren't are probably what triggers the class not found exception.
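If those /usr/lib/hadoop folders don't exist on your install, one sketch of a way to pick up the right compile classpath (assuming the hadoop launcher script itself is on your PATH) is to let Hadoop print the classpath it was installed with and compile against that:

    # Ask the hadoop command for its own classpath instead of hard-coding
    # /usr/lib/hadoop/... paths, then compile against it.
    javac -cp "$(hadoop classpath)" -d Demo inputs/DemoJob.java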


Finally figured out what the problem was. Instead of creating the jar file from a folder, I created it directly from the .class files using:

    jar -cvf Demo.jar *.class

This resolved the ClassNotFound error. But I don't understand why it wasn't working earlier. Even when I created the jar file from a folder, I did mention the folder name when executing the class file, as:

    hadoop jar Demo.jar Demo.DemoJob /Inputs/Text1.txt /Inputs/Text2.txt /Outputs
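For comparison, here is roughly what listing the two jars with jar tf would show (illustrative; the exact entries depend on how each jar was built). Since DemoJob is not declared in any package, the class loader expects DemoJob.class at the jar root, which only the second layout has:

    # Jar built from the Demo folder: the class file sits under a Demo/ entry
    jar tf Demo.jar
    # META-INF/MANIFEST.MF
    # Demo/DemoJob.class

    # Jar built directly from the .class files: the class file sits at the root
    jar tf Demo.jar
    # META-INF/MANIFEST.MF
    # DemoJob.class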