2 votes

I can run spark-sql perfectly with Spark in standalone mode, but in YARN mode Spark tells me it can't find the Hive classes (even basic ones like org/apache/hadoop/hive/ql/plan/TableDesc).

So I added the Hive libs to compute-classpath.sh. That failed. Then I reasoned that since standalone works and YARN doesn't, maybe I should change the YARN classpath to include the Hive lib.

That failed again.

I just don't understand it: the Hive libs appear in my YARN startup log and in the Spark output, so why does my Hive SQL complain that the basic Hive classes are not found?

Thanks to everyone for helping me.

Is Hive: (a) on your executors? (do you use --jars with spark-submit?) (b) on your SparkContext? (how do you build your SparkConf?) - Francois G
Hive is on all of my machines, and I added the Hive lib dir to compute-classpath.sh, so I assumed spark-submit would pick up those jars. I'm confused about building the SparkConf; I just copied the original Spark conf dir into CDH 5.3's Spark dir. BTW, the CDH 5.3 Spark wouldn't work when I first unpacked it; I had to change a few lines in the shell scripts under the bin dir. Thank you for your advice, and I will try the spark-submit option. - amow
I just built Spark from source and it worked. Why did the CDH 5.3 tgz fail? Thanks. - amow

2 Answers

1 vote

Try this: add spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/* to your Spark configuration.
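As a sketch of where this setting can live, assuming the CDH parcel layout from the answer (adjust the path for your install), it can go in conf/spark-defaults.conf or be passed per job on the spark-submit command line:

```
# conf/spark-defaults.conf — append the Hive jars to the executor classpath
spark.executor.extraClassPath  /opt/cloudera/parcels/CDH/lib/hive/lib/*

# equivalent one-off form (illustrative):
#   spark-submit --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/lib/hive/lib/* ...
```

If the driver also fails with the same ClassNotFoundException, spark.driver.extraClassPath may need the same value, since the executor setting only affects executor JVMs.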

0 votes

You most probably have a classpath issue. Please refer to the 'Classpath issue' section of [this troubleshooting guide](http://www.datastax.com/dev/blog/common-spark-troubleshooting). Take care to set the --jars option of your spark-submit call, and to pass the additional jars when creating your SparkConf.
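A hedged sketch of the first approach the answer mentions; the jar path, class name, and application jar below are placeholders, not values from the question:

```shell
# Ship the Hive jars to the executors explicitly (paths are illustrative;
# --jars takes a comma-separated list of jars to distribute with the job)
spark-submit \
  --master yarn \
  --jars /opt/cloudera/parcels/CDH/lib/hive/lib/hive-exec.jar \
  --class com.example.MyApp \
  myapp.jar
```

Programmatically, the equivalent is calling setJars(Seq(...)) on the SparkConf before creating the SparkContext, which distributes the listed jars to the cluster the same way --jars does.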