I can't add a custom dependency to the Spark classpath from Zeppelin.

Environment: AWS EMR: Zeppelin 0.8.0, Spark 2.4.0

Extra configs for the Spark interpreter:

spark.jars.ivySettings  /tmp/ivy-settings.xml
spark.jars.packages my-group-name:artifact_2.11:version

The files from my-group-name appeared in

spark.yarn.dist.jars
spark.yarn.secondary.jars

But they are not accessible from the Zeppelin notebook (checked with import my.lab._).

However, when I run spark-shell with the same configs, it works both on my local machine and over SSH on the EMR cluster, and the imports are available from spark-shell.

sun.java.command for Zeppelin:

org.apache.spark.deploy.SparkSubmit --master yarn-client ... --conf spark.jars.packages=my-group-name:artifact_2.11:version ... --conf spark.jars.ivySettings=/tmp/ivy-settings.xml ... --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer /usr/lib/zeppelin/interpreter/spark/spark-interpreter-0.8.0.jar <IP ADDRESS> 34717 :

spark-shell command on EMR:

spark-shell --master yarn-client --conf spark.jars.ivySettings="/tmp/ivy-settings.xml" --conf spark.jars.packages="my-group-name:artifact_2.11:version"
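One way to narrow this down is to compare the --conf pairs that each launch actually receives. The following is a minimal sketch, not a definitive diagnostic: list_confs is a hypothetical helper name, and it assumes that no conf value contains spaces.

```shell
# Hypothetical helper: print each "--conf key=value" pair found in a
# command-line string, one per line, sorted, with double quotes stripped.
# Assumption: conf values do not contain spaces.
list_confs() {
  printf '%s\n' "$1" \
    | tr -s ' ' '\n' \
    | awk 'prev == "--conf" { gsub(/"/, ""); print } { prev = $0 }' \
    | sort
}

# Example call with the spark-shell command from the question:
list_confs 'spark-shell --master yarn-client --conf spark.jars.ivySettings="/tmp/ivy-settings.xml" --conf spark.jars.packages="my-group-name:artifact_2.11:version"'
```

Running the same helper on the Zeppelin sun.java.command string and diffing the two outputs should show any conf that is present in one launch but missing in the other.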

Any advice on where to look for errors?

1 Answer

You can try adding your jar directly to Zeppelin in the interpreter settings: http://zeppelin.apache.org/docs/0.8.0/usage/interpreter/dependency_management.html

Or add the jar to the Spark libs directory (in my case, /usr/hdp/current/spark2/jars/).
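A rough sketch of the second option is below. copy_jars is a hypothetical helper, and the source directory and file-name pattern in the commented example are assumptions. Check where your Ivy settings actually place the resolved jars and which directory your Spark install reads.

```shell
# Hypothetical helper: copy every jar whose file name matches a pattern
# from a source directory (searched recursively) into a Spark jars directory.
copy_jars() {
  src_dir=$1    # directory holding the resolved jars
  pattern=$2    # file-name glob, e.g. 'my-group-name_*.jar'
  dest_dir=$3   # Spark jars directory
  find "$src_dir" -type f -name "$pattern" -exec cp {} "$dest_dir"/ \;
}

# Example call (both paths and the pattern are assumptions; adjust them):
# copy_jars "$HOME/.ivy2/jars" 'my-group-name_*.jar' /usr/hdp/current/spark2/jars
```

After copying, you will probably need to restart the Spark interpreter in Zeppelin so that the new jars are picked up.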