I am in the process of migrating some of my MapReduce code from Hadoop 1.0 to Hadoop 2.0. I started with the simple WordCount job: I added the following jars to my build path and removed the corresponding Hadoop 1.0 jars.

  • hadoop-mapreduce-client-app-2.7.1.jar
  • hadoop-mapreduce-client-common-2.7.1.jar
  • hadoop-mapreduce-client-core-2.7.1.jar
  • hadoop-mapreduce-client-hs-2.7.1.jar
  • hadoop-mapreduce-client-hs-plugins-2.7.1.jar
  • hadoop-mapreduce-client-jobclient-2.7.1-tests.jar
  • hadoop-mapreduce-client-jobclient-2.7.1.jar
  • hadoop-mapreduce-client-shuffle-2.7.1.jar

I retained the simple WordCount class with the following client definition.

JobConf conf = new JobConf(new Configuration(), WordCount.class);
conf.setJobName("WordCount");

conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);

conf.setMapperClass(WordCountMap.class);
//conf.setCombinerClass(Reduce.class);
conf.setReducerClass(WordCountReduce.class);

conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);

FileInputFormat.setInputPaths(conf, new Path(params.getInput()));
FileOutputFormat.setOutputPath(conf, new Path(params.getOutput()));

JobClient.runJob(conf);

I am getting the following error when instantiating JobConf.

java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:2957)
at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:1210)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1690)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
at bigdata.hadoop.WordCount.execute(WordCount.java:55)
at bigdata.hadoop.WordCount.execute(WordCount.java:1)
at bigdata.hadoop.BigDataJobDriver.executeJobDriver(BigDataJobDriver.java:15)
at bigdata.jobs.WordCountJob.executeJob(WordCountJob.java:50)
at bigdata.quartz.BigDataJob.execute(BigDataJob.java:30)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
... 14 more
2016-01-23 03:03:35,408 ERROR [QuartzScheduler_Worker-1] core.ErrorLogger (QuartzScheduler.java:schedulerError(2361)) - Job (Hadoop.Hadoop 7 threw an exception.
org.quartz.SchedulerException: Job threw an unhandled exception. [See nested exception: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration]
at org.quartz.core.JobRunShell.run(JobRunShell.java:224)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:2957)
at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:1210)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1690)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
at bigdata.hadoop.WordCount.execute(WordCount.java:55)
at bigdata.hadoop.WordCount.execute(WordCount.java:1)
at bigdata.hadoop.BigDataJobDriver.executeJobDriver(BigDataJobDriver.java:15)
at bigdata.jobs.WordCountJob.executeJob(WordCountJob.java:50)
at bigdata.quartz.BigDataJob.execute(BigDataJob.java:30)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
... 1 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
... 14 more

I ran the following command, but it did not find Configuration.class. The same command on Hadoop 1.0 listed Configuration.class.

tar tvf hadoop-mapreduce-client-core-2.7.1.jar | grep Configuration.class
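The same check can be run over every jar in a directory at once, which avoids guessing which jar to list. The following is only a rough sketch using the JDK's java.util.jar API; the class name JarClassFinder and the method findJarsContaining are made up for illustration, and the directory to scan is whatever folder holds the jars on the build path.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper (not part of Hadoop): lists the jars in a directory
// that contain a given class entry, e.g. org/apache/hadoop/conf/Configuration.class
class JarClassFinder {

    /** Returns the names of the .jar files in dir that contain classEntry. */
    static List<String> findJarsContaining(File dir, String classEntry) throws IOException {
        List<String> matches = new ArrayList<String>();
        File[] files = dir.listFiles();
        if (files == null) {
            return matches; // not a directory, or not readable
        }
        for (File file : files) {
            if (!file.isFile() || !file.getName().endsWith(".jar")) {
                continue;
            }
            // JarFile reads the zip index, so the lookup does not unpack anything
            try (JarFile jar = new JarFile(file)) {
                if (jar.getEntry(classEntry) != null) {
                    matches.add(file.getName());
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarClassFinder <jar-dir> <class-entry>");
            return;
        }
        for (String name : findJarsContaining(new File(args[0]), args[1])) {
            System.out.println(name);
        }
    }
}
```

It would be run as `java JarClassFinder <jar-dir> org/apache/hadoop/conf/Configuration.class`; note that the entry name uses slashes, as in the NoClassDefFoundError message, not dots.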

Has Configuration.class moved to a different jar file for Hadoop 2.7.1?

2 Answers

0
votes

Try creating a Configuration object and passing it to the Job object, like this:

Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "WordCount"); // new Job(conf, "WordCount") also works but is deprecated in Hadoop 2.x

0
votes

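Try replacing the driver code with its new-API (org.apache.hadoop.mapreduce) equivalent:

// With the new API, the mapper and reducer must extend org.apache.hadoop.mapreduce.Mapper/Reducer,
// and the input/output format classes come from org.apache.hadoop.mapreduce.lib.input and .lib.output.
Configuration conf = new Configuration();
Job job = Job.getInstance(conf, "wordcount"); // replaces the deprecated new Job(conf, "wordcount")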
job.setJarByClass(WordCount.class);

job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);

job.setMapperClass(WordCountMap.class);
job.setReducerClass(WordCountReduce.class);

job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);

FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));

job.waitForCompletion(true);
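
If the NoClassDefFoundError persists after switching APIs, the class is probably just not visible to the web application's class loader (the stack trace shows Tomcat's WebappClassLoader doing the lookup). As far as I know, in Hadoop 2.x Configuration ships in the hadoop-common jar (hadoop-common-2.7.1.jar for this release) rather than in any of the hadoop-mapreduce-client jars, so that jar and its dependencies would need to be deployed with the webapp. Below is a minimal sketch for checking this at runtime; ClassProbe and locate are hypothetical names, not Hadoop or Tomcat APIs.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper: reports whether a class is visible to a
// given class loader and, if so, which jar or directory it was loaded from.
class ClassProbe {

    /**
     * Returns the location the class was loaded from, "(no code source)" when
     * the JVM does not report one (e.g. core JDK classes), or null when the
     * class cannot be found or linked by the given loader.
     */
    static String locate(String className, ClassLoader loader) {
        try {
            // initialize=false: look the class up without running static initializers
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "(no code source)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return null; // not visible from this class loader
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.conf.Configuration";
        // Inside a webapp the context class loader is normally the webapp's loader
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        String where = locate(name, loader);
        System.out.println(where == null
                ? name + " is NOT visible to this class loader"
                : name + " loaded from " + where);
    }
}
```

Calling ClassProbe.locate("org.apache.hadoop.conf.Configuration", Thread.currentThread().getContextClassLoader()) from inside the Quartz job, just before the Job is created, should show whether the jar actually made it into the deployed application.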