
I have a Scala object that queries a MySQL table, does a join, and writes the result to S3. Tested locally, the code runs perfectly fine, but when I submit it to the cluster it throws the error below:

Exception in thread "main" java.sql.SQLException: No suitable driver
  at java.sql.DriverManager.getDriver(DriverManager.java:315)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:53)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:123)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:117)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:53)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
  at QuaterlyAudit$.main(QuaterlyAudit.scala:51)
  at QuaterlyAudit.main(QuaterlyAudit.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:498)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
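For context, the read at QuaterlyAudit.scala:51 is a plain JDBC load, roughly like this (a trimmed sketch; the URL, table, and credentials are placeholders, not my real values):

  // spark is the SparkSession; connection details below are placeholders
  val mysqlDF = spark.read
    .format("jdbc")
    .option("url", "jdbc:mysql://dbhost:3306/mydb")
    .option("dbtable", "some_table")
    .option("user", "dbuser")
    .option("password", "dbpass")
    .load()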

Below is my spark-submit command:

nohup spark-submit --class QuaterlyAudit --master yarn-client --num-executors 8 
--driver-memory 16g --executor-memory 20g --executor-cores 10 /mypath/campaign.jar &

I am using sbt and included the MySQL connector in the sbt assembly. Below is my build.sbt file:

name := "mobilewalla"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % "2.0.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.0.0" % "provided",
  "org.apache.hadoop" % "hadoop-aws" % "2.6.0" intransitive(),
  "mysql" % "mysql-connector-java" % "5.1.37")

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs@_*) =>
    xs.map(_.toLowerCase) match {
      case ("manifest.mf" :: Nil) |
           ("index.list" :: Nil) |
           ("dependencies" :: Nil) |
           ("license" :: Nil) |
           ("notice" :: Nil) => MergeStrategy.discard
      case _ => MergeStrategy.first // was 'discard' previously
    }
  case "reference.conf" => MergeStrategy.concat
  case _ => MergeStrategy.first
}
assemblyJarName in assembly := "campaign.jar"
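
A quick way to confirm whether the connector class is even visible on the cluster would be a one-liner from spark-shell (assuming the 5.1.x driver class name, com.mysql.jdbc.Driver):

  // throws ClassNotFoundException if the MySQL connector is not on the classpath
  Class.forName("com.mysql.jdbc.Driver")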

I also tried with:

nohup spark-submit --driver-class-path /mypath/mysql-connector-java-5.1.37.jar 
--class QuaterlyAudit --master yarn-client --num-executors 8 --driver-memory 16g 
--executor-memory 20g --executor-cores 10 /mypath/campaign.jar &

But still no luck. What am I missing here?


2 Answers

0
votes

The obvious reason is that Spark is not able to get the JDBC JAR. No doubt many people have faced this issue: it happens because the JAR is not getting shipped to the driver and the executors. There are a few workarounds by which it can be fixed:

  1. You might want to assemble your application with your build tool (Maven, sbt) so you won't need to add the dependencies on the spark-submit CLI.
  2. You can use the following option in your spark-submit CLI: --jars $(echo ./lib/*.jar | tr ' ' ',')
  3. You can also try to configure the two variables spark.driver.extraClassPath and spark.executor.extraClassPath in the SPARK_HOME/conf/spark-defaults.conf file and set their values to the path of the JAR file. Make sure the same path exists on the worker nodes.
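
One more thing worth checking: even with the connector on the classpath, DriverManager can still report "No suitable driver", and naming the driver class explicitly in the read options usually gets around that. A minimal sketch, assuming the 5.1.x class name com.mysql.jdbc.Driver and placeholder connection details:

  // same JDBC read as in the question, but naming the driver class explicitly
  val mysqlDF = spark.read
    .format("jdbc")
    .option("driver", "com.mysql.jdbc.Driver")      // tell Spark which driver to use
    .option("url", "jdbc:mysql://dbhost:3306/mydb") // placeholder URL
    .option("dbtable", "some_table")                // placeholder table
    .option("user", "dbuser")
    .option("password", "dbpass")
    .load()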
0
votes

You have to specify packages like this:

spark-submit --packages org.apache.spark:spark-avro_2.11:2.4.4,mysql:mysql-connector-java:5.1.6 your-jar.jar
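
With --packages, spark-submit resolves the coordinates from your local Maven repository or Maven Central and puts the JARs on both the driver and executor classpaths, so you don't have to manage the JAR yourself. Given the build.sbt in the question, the matching coordinate would be mysql:mysql-connector-java:5.1.37.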