I am trying to submit a Spark job from a shell script.
It's a simple script with just a spark-submit command. I am trying to pass an argument to my main function through the spark-submit command, but when I execute the shell script the job fails with this error:
scala.MatchError: rma (of class java.lang.String)
This is because I have used a match expression in my code.
This is the content of my shell script:
#adsName=$1
spark-submit --class TestQuery --master yarn --deploy-mode cluster \
--driver-memory 12G --executor-memory 8G --executor-cores 4 \
--num-executors 100 --files /opt/mapr/spark/spark-2.1.0/conf/hive-site.xml \
--jars /users/myuser/config-1.2.0.jar \
/users/myuser/jars/adsoptimization_2.11-0.1.jar \
xyz
So 'xyz' is the string I am passing in the command. Currently I have hard-coded it and it is still failing; I want to pass it dynamically as an argument to the shell script, as in the sketch below.
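A minimal sketch of the dynamic version, reusing the commented-out adsName=$1 line from the script above (the script name run_job.sh is a placeholder):

# Forward the script's first positional parameter to spark-submit
# as the application argument.
adsName="$1"
spark-submit --class TestQuery --master yarn --deploy-mode cluster \
--driver-memory 12G --executor-memory 8G --executor-cores 4 \
--num-executors 100 --files /opt/mapr/spark/spark-2.1.0/conf/hive-site.xml \
--jars /users/myuser/config-1.2.0.jar \
/users/myuser/jars/adsoptimization_2.11-0.1.jar \
"$adsName"

Invoked as ./run_job.sh xyz, the value xyz then arrives in the driver as args(0).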
My code in the main function:
args(0) match {
  case "str1" => TestQuery(spark).runstr1
  case "xyz"  => TestQuery(spark).runxyz
  case "str2" => TestQuery(spark).runstr2
  case "str3" => TestQuery(spark).runstr3
}
So the 'xyz' string that I am passing should arrive in args(0), and then I call the function defined in my case class, passing the Spark session object as the argument.
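Note that the error message itself shows the value actually received: args(0) was the string "rma", not "xyz". A sketch of the same match with a catch-all case added (the existing cases are unchanged), so an unexpected value fails with a readable message instead of scala.MatchError:

args(0) match {
  case "str1" => TestQuery(spark).runstr1
  case "xyz"  => TestQuery(spark).runxyz
  case "str2" => TestQuery(spark).runstr2
  case "str3" => TestQuery(spark).runstr3
  // Catch-all: report the unexpected argument instead of crashing
  case other  => sys.error(s"Unknown argument: '$other' (args = ${args.mkString(", ")})")
}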
So the ask here is simply how to make the Spark job run correctly via the shell script.
What about adding args.foreach(println) before the pattern matching expression? You'll know what is passed as args. Also, start your shell script with SPARK_PRINT_LAUNCH_COMMAND=1 to see what exactly spark-submit executes. That should give you enough to hunt down the root cause. - Jacek Laskowski
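A minimal sketch of that diagnostic, placed at the top of the main method (the match itself is from the question):

// Print every argument the driver actually received, one per line.
// If this prints "rma" rather than "xyz", the shell script is passing
// a different value than expected.
args.foreach(println)

args(0) match {
  case "xyz" => TestQuery(spark).runxyz
  // ... remaining cases as above
}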