
I have a simple Spark application that reads CSV data and writes it out as Avro. It works fine when submitted via spark-submit on the command line, but it fails with the error below when executed from an Oozie Spark action. Error message:


Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.SparkMain], main() threw exception, net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
    at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
    at org.apache.spark.sql.execution.SparkPlan.org$apache$spark$sql$execution$SparkPlan$$decodeUnsafeRows(SparkPlan.scala:274)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeTake$1.apply(SparkPlan.scala:366)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeTake$1.apply(SparkPlan.scala:366)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
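
For context, the application itself is just a minimal CSV-to-Avro job, roughly along these lines (a simplified sketch of com.sc.eni.main.tssStart; the input/output paths are placeholders and the Avro write goes through the Databricks spark-avro package bundled in the assembly jar):

package com.sc.eni.main

import org.apache.spark.sql.SparkSession

object tssStart {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("tssETL")
      .getOrCreate()

    // read the CSV input (header row assumed)
    val df = spark.read
      .option("header", "true")
      .csv("hdfs:///user/oozie/spark/input/")    // placeholder path

    // write the same data back out as Avro
    df.write
      .format("com.databricks.spark.avro")       // spark-avro package from the assembly jar
      .save("hdfs:///user/oozie/spark/output/")  // placeholder path

    spark.stop()
  }
}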

Oozie details:

job.properties 
nameNode=NAMEMODE:8020
jobTracker=JT:8032
queueName=default
oozie.use.system.libpath=true
oozie.wf.application.path=${nameNode}/user/oozie/spark/

workflow.xml

<workflow-app name="sample-wf" xmlns="uri:oozie:workflow:0.1">
    <start to="sparkAction" />
    <action name="sparkAction">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>oozie.launcher.mapreduce.map.memory.mb</name>
                    <value>1024</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.map.java.opts</name>
                    <value>-Xmx777m</value>
                </property>
                <property>
                    <name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
                    <value>2048</value>
                </property>
                <property>
                    <name>oozie.launcher.mapreduce.map.java.opts</name>
                    <value>-Xmx1111m</value>
                </property>
            </configuration>
            <master>yarn</master>
            <mode>client</mode>
            <name>tssETL</name>
            <class>com.sc.eni.main.tssStart</class>
            <jar>${nameNode}/user/oozie/spark/tss-assembly-1.0.jar</jar>
            <spark-opts>--driver-memory 512m --executor-memory 512m --num-executors 1</spark-opts>
        </spark>
        <ok to="end"/>
        <error to="fail"/>
    </action>
        <kill name="fail">
          <message>Workflow failed, error
            message[${wf:errorMessage(wf:lastErrorNode())}] </message>
        </kill>
        <end name="end" />
</workflow-app>

In the job tracker the MapReduce launcher job shows as Succeeded, since it only launches the Spark action (which then fails), but overall the Oozie workflow fails.

Versions used

EMR Cluster: emr-5.13.0
Spark: 2.3
Scala: 2.11

I also checked the Oozie share lib in HDFS at /user/oozie/share/lib/lib_20180517102659/spark, and it contains lz4-1.3.0.jar, which does include the class net.jpountz.lz4.LZ4BlockInputStream mentioned in the error.

Any help would be really appreciated, as I have been struggling with this for quite a long time.

Many Thanks


1 Answer


Oozie gives

java.lang.NoSuchMethodError

when the same library is available on the classpath through more than one path, creating a version conflict. Since you have specified

oozie.use.system.libpath=true

all of the Oozie Spark shared libraries are available to the action, and all of the jars declared in your build.sbt (bundled into the assembly) are available as well.

To resolve this, check which of the dependencies declared in your build.sbt are also present in the Oozie Spark shared library folder, and mark those dependencies as "provided". That excludes them from the assembly jar, so only the cluster-side versions remain on the classpath and there is no jar conflict.
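
For example, a minimal build.sbt sketch under those assumptions (versions are illustrative for the Spark 2.3 / Scala 2.11 you listed, and the spark-avro coordinate assumes you write Avro through the Databricks spark-avro package):

scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // Spark is supplied at runtime by the cluster and the Oozie sharelib,
  // so mark it "provided" to keep it (and its lz4 dependency) out of the assembly jar
  "org.apache.spark" %% "spark-core" % "2.3.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.3.0" % "provided",
  // spark-avro is not in the sharelib by default, so it stays in the assembly
  "com.databricks"   %% "spark-avro" % "4.0.0"
)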