I am in the process of migrating some of my MapReduce code from Hadoop 1.0 to Hadoop 2.0, starting with a simple WordCount job. I removed the Hadoop 1.0 jars from my build path and added the following Hadoop 2.7.1 jars in their place.
- hadoop-mapreduce-client-app-2.7.1.jar
- hadoop-mapreduce-client-common-2.7.1.jar
- hadoop-mapreduce-client-core-2.7.1.jar
- hadoop-mapreduce-client-hs-2.7.1.jar
- hadoop-mapreduce-client-hs-plugins-2.7.1.jar
- hadoop-mapreduce-client-jobclient-2.7.1-tests.jar
- hadoop-mapreduce-client-jobclient-2.7.1.jar
- hadoop-mapreduce-client-shuffle-2.7.1.jar
I retained the simple WordCount class with the following client definition.
JobConf conf = new JobConf(new Configuration(), WordCount.class);
conf.setJobName("WordCount");
conf.setOutputKeyClass(Text.class);
conf.setOutputValueClass(IntWritable.class);
conf.setMapperClass(WordCountMap.class);
//conf.setCombinerClass(Reduce.class);
conf.setReducerClass(WordCountReduce.class);
conf.setInputFormat(TextInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);
FileInputFormat.setInputPaths(conf, new Path(params.getInput()));
FileOutputFormat.setOutputPath(conf, new Path(params.getOutput()));
JobClient.runJob(conf);
I am getting the following error when instantiating JobConf.
java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:2957)
at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:1210)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1690)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
at bigdata.hadoop.WordCount.execute(WordCount.java:55)
at bigdata.hadoop.WordCount.execute(WordCount.java:1)
at bigdata.hadoop.BigDataJobDriver.executeJobDriver(BigDataJobDriver.java:15)
at bigdata.jobs.WordCountJob.executeJob(WordCountJob.java:50)
at bigdata.quartz.BigDataJob.execute(BigDataJob.java:30)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
... 14 more
2016-01-23 03:03:35,408 ERROR [QuartzScheduler_Worker-1] core.ErrorLogger (QuartzScheduler.java:schedulerError(2361)) - Job (Hadoop.Hadoop 7 threw an exception.
org.quartz.SchedulerException: Job threw an unhandled exception. [See nested exception: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration]
at org.quartz.core.JobRunShell.run(JobRunShell.java:224)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at org.apache.catalina.loader.WebappClassLoader.findClassInternal(WebappClassLoader.java:2957)
at org.apache.catalina.loader.WebappClassLoader.findClass(WebappClassLoader.java:1210)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1690)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
at bigdata.hadoop.WordCount.execute(WordCount.java:55)
at bigdata.hadoop.WordCount.execute(WordCount.java:1)
at bigdata.hadoop.BigDataJobDriver.executeJobDriver(BigDataJobDriver.java:15)
at bigdata.jobs.WordCountJob.executeJob(WordCountJob.java:50)
at bigdata.quartz.BigDataJob.execute(BigDataJob.java:30)
at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
... 1 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1720)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1571)
... 14 more
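Since the trace passes through defineClass, it looks like the failure happens while a class that depends on Configuration (presumably JobConf) is being defined, rather than in my own code. A minimal sketch of a check that could be run inside the webapp to see what the class loader can actually resolve is below. ClassLocator and locate are hypothetical names made up for this illustration; only standard JDK calls are used.

```java
import java.security.CodeSource;

public class ClassLocator {

    /** Describes where the given class was loaded from, or reports that it could not be loaded. */
    public static String locate(String className, ClassLoader loader) {
        try {
            // initialize=false: load and link the class without running static initializers.
            Class<?> c = Class.forName(className, false, loader);
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes from the bootstrap loader have no code source.
            return src == null ? "bootstrap class path" : src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError, so a missing superclass lands here too.
            return "not found: " + e;
        }
    }

    public static void main(String[] args) {
        // Inside Tomcat the context class loader is normally the webapp class loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        System.out.println(locate("org.apache.hadoop.mapred.JobConf", loader));
        System.out.println(locate("org.apache.hadoop.conf.Configuration", loader));
    }
}
```

If the first line prints a jar location and the second prints "not found", the jar that provides Configuration is simply not visible to the webapp.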
I ran the following command against the Hadoop 2.7.1 jar, but it did not list Configuration.class. The same check on Hadoop 1.0 did show Configuration.class.
tar tvf hadoop-mapreduce-client-core-2.7.1.jar | grep Configuration.class
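The command above only inspects a single jar. To widen the search, a rough sketch that scans every jar in a directory for a given class entry could look like the following. JarClassFinder and findJarsContaining are hypothetical names for this illustration, and the directory to scan is an assumption (for example the webapp's WEB-INF/lib or the Hadoop distribution's library folders).

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

public class JarClassFinder {

    /** Returns the names of the jars directly under libDir that contain the given entry. */
    public static List<String> findJarsContaining(File libDir, String classEntry) throws IOException {
        List<String> matches = new ArrayList<String>();
        File[] files = libDir.listFiles();
        if (files == null) {
            return matches; // libDir is not a directory or cannot be read
        }
        for (File file : files) {
            if (!file.isFile() || !file.getName().endsWith(".jar")) {
                continue;
            }
            try (JarFile jar = new JarFile(file)) {
                // Entry names use '/' separators and end in ".class".
                if (jar.getEntry(classEntry) != null) {
                    matches.add(file.getName());
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current directory.
        File libDir = new File(args.length > 0 ? args[0] : ".");
        String entry = "org/apache/hadoop/conf/Configuration.class";
        for (String name : findJarsContaining(libDir, entry)) {
            System.out.println(name);
        }
    }
}
```

This does not recurse into subdirectories, so each library folder has to be passed separately.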
Has Configuration.class moved to a different jar file for Hadoop 2.7.1?