0
votes

I am trying to run MapReduce on Pentaho 5. For Pentaho 5, the Pentaho applications come pre-configured for Apache Hadoop 0.20.2, and the documentation says no further configuration is required for this version. I installed Hadoop 0.20.2 on Windows using Cygwin and everything works fine. I ran a simple Pentaho job that copies a file into HDFS, which finished successfully, and the file was copied into HDFS. But as soon as I run a MapReduce job, Pentaho reports the job as finished, yet the MapReduce task failed, the result is missing from the output directory on HDFS, and the log file says:

Error: java.lang.ClassNotFoundException: org.pentaho.di.trans.step.RowListener
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:762)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:807)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:833)
    at org.apache.hadoop.mapred.JobConf.getMapRunnerClass(JobConf.java:790)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Please help me out.

1
Has anybody solved this issue? – Sudo

1 Answer

0
votes

Maybe a little bit old, but I thought it might help someone.

This can be caused by:

  • For Hadoop versions newer than 0.20: check that you set up your environment correctly; see Pentaho Support: Creating a New Hadoop Configuration.
  • Check on HDFS that /opt/pentaho/mapreduce/ exists, and check the folder permissions on HDFS for /opt/pentaho. (Are the kettle-*.jar files present in its lib folder?)
  • Check the classpath separator ("," on Windows, ":" on Linux). To change it, edit spoon.sh (or spoon.bat) and modify the OPT variable like this: OPT="$OPT -Dhadoop.cluster.path.separator=,"
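The second and third checks can be sketched as shell commands like the following (a sketch only; the exact directory layout under /opt/pentaho/mapreduce/ depends on your Pentaho version, and these assume the hadoop CLI is on your PATH):

```shell
# Verify the Pentaho Kettle runtime was deployed to HDFS -- the MapReduce
# task needs these jars to resolve classes such as
# org.pentaho.di.trans.step.RowListener:
hadoop fs -ls /opt/pentaho/mapreduce/

# Look for the kettle-*.jar files inside the lib folder of the deployed
# runtime (directory name varies by Pentaho/Hadoop version):
hadoop fs -ls /opt/pentaho/mapreduce/*/lib | grep kettle

# Make sure the user running the TaskTracker can read the folder:
hadoop fs -chmod -R 755 /opt/pentaho

# On Windows, the classpath-separator fix from the last bullet: add the
# property to the OPT variable in spoon.sh (or the equivalent in spoon.bat):
OPT="$OPT -Dhadoop.cluster.path.separator=,"
```

If the /opt/pentaho/mapreduce/ folder is missing entirely, Pentaho normally deploys it on the first MapReduce run, so a failed or permission-denied first deployment can also leave it in a broken state worth deleting and re-creating.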