2 votes

I can run spark-sql perfectly with Spark in standalone mode, but in YARN mode Spark tells me it can't find the Hive classes (even basic ones like org/apache/hadoop/hive/ql/plan/TableDesc).

So I added the Hive libs to compute-classpath.sh. That failed. Then I reasoned that since standalone works and YARN doesn't, maybe I should change the YARN classpath to include the Hive lib.

That failed again.

I just don't understand it: the Hive libs appear in my YARN startup log and in the Spark output, so why does my Hive SQL complain that the basic Hive classes are not found?

Thanks to everyone for helping me.

Is Hive: (a) on your executors? (do you use --jars with spark-submit?) (b) on your SparkContext? (how do you build your SparkConf?) - Francois G
Hive is on all of my machines, and I added the Hive lib dir to compute-classpath.sh, so I assumed spark-submit would pick up those jars. I'm confused about building the SparkConf; I just copied the original Spark conf dir into CDH 5.3's Spark dir. BTW, the CDH 5.3 Spark wouldn't work when I first unpacked it; I had to change a few lines in the shell scripts under the bin dir. Thank you for your advice, and I will try the spark-submit option. - amow
I just built Spark from source and it worked. Why did the CDH 5.3 tgz fail? Thanks. - amow

2 Answers

1 vote

Try this: add spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/* to your Spark configuration.
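As a sketch of where this setting can live, assuming the CDH parcel layout from the answer (adjust the path for your install), it can go in conf/spark-defaults.conf or be passed per job on the spark-submit command line:

```
# conf/spark-defaults.conf — append the Hive jars to the executor classpath
spark.executor.extraClassPath  /opt/cloudera/parcels/CDH/lib/hive/lib/*

# equivalent one-off form (illustrative):
#   spark-submit --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/* ...
```

If the driver also fails with the same ClassNotFoundException, spark.driver.extraClassPath may need the same value, since the executor setting only affects executor JVMs.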

0 votes

You most probably have a classpath issue. Please refer to the 'Classpath issue' section of [this troubleshooting guide](http://www.datastax.com/dev/blog/common-spark-troubleshooting). Take care to set the --jars option of your spark-submit call, and to pass the additional jars when creating your SparkConf.
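A hedged sketch of the first approach the answer mentions; the jar path, class name, and application jar below are placeholders, not values from the question:

```shell
# Ship the Hive jars to the executors explicitly (paths are illustrative;
# --jars takes a comma-separated list of jars to distribute with the job)
spark-submit \
  --master yarn \
  --jars /opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec.jar \
  --class com.example.MyApp \
  myapp.jar
```

Programmatically, the equivalent is calling setJars(Seq(...)) on the SparkConf before creating the SparkContext, which distributes the listed jars to the cluster the same way --jars does.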