I'm trying to run a custom JAR job on Amazon EMR, and my JAR depends on a Lucene JAR. I have the Lucene JAR in a lib directory on S3, and my JAR arguments look like this:
MyMainClass -libjars s3n://mybucket/lib/lucene-core-3.6.1.jar s3n://mybucket/myinput s3n://mybucket/myoutput
The job fails and I keep getting these errors:
java.lang.NoClassDefFoundError: org/apache/lucene/analysis/Analyzer
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:861)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:906)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:932)
    at org.apache.hadoop.mapred.JobConf.getMapperClass(JobConf.java:959)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: org.apache.lucene.analysis.Analyzer
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 21 more
It doesn't seem to find the lucene jar file... What am I missing?
Does MyMainClass implement the Tool interface? Also make sure to get the Configuration instance via getConf() in your run method. - Lorand Bendig
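To illustrate why this matters: -libjars is a generic Hadoop option, and it is only honored when the arguments pass through Hadoop's generic option parsing, which is what happens when the driver is started with ToolRunner.run(...) and the job is built from getConf(). That parsing removes -libjars and its comma-separated value before run() is called, so run() sees only the input and output paths. The sketch below is a toy stand-in for that splitting step, written against the JDK only. It is not Hadoop's actual code, and the names LibJarsSplit, Result and split are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model (NOT Hadoop code): separates "-libjars <a.jar,b.jar>" from the
// application's own arguments, roughly the way a Tool started through
// ToolRunner only receives the non-generic arguments in run().
final class LibJarsSplit {

    /** Holds the jars named by -libjars and every other argument, in order. */
    static final class Result {
        final List<String> libJars = new ArrayList<>();
        final List<String> remaining = new ArrayList<>();
    }

    static Result split(String[] args) {
        Result result = new Result();
        for (int i = 0; i < args.length; i++) {
            if ("-libjars".equals(args[i]) && i + 1 < args.length) {
                // The option value is a comma-separated list of jar locations.
                result.libJars.addAll(Arrays.asList(args[++i].split(",")));
            } else {
                result.remaining.add(args[i]);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments from the question, without the leading main class name.
        Result r = split(new String[] {
            "-libjars", "s3n://mybucket/lib/lucene-core-3.6.1.jar",
            "s3n://mybucket/myinput", "s3n://mybucket/myoutput"
        });
        System.out.println("libjars:   " + r.libJars);
        System.out.println("remaining: " + r.remaining);
    }
}
```

If main() instead reads args[0] and args[1] directly and creates a fresh JobConf, nothing ever processes -libjars, so the Lucene JAR is presumably never shipped to the task nodes, which would match the ClassNotFoundException above.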