0
votes

I created an sbt project using IntelliJ and copied the required JDBC jar, sqljdbc42.jar, into the project's lib folder. sbt package finished successfully. On Windows, I started the Spark shell with spark-shell --driver-class-path C:\sqljdbc_6.0\enu\jre8\sqljdbc42.jar.

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import java.sql._

object ConnTest extends App {
  val conf = new SparkConf()
  val sc = new SparkContext(conf.setAppName("Test").setMaster("local[*]"))

  // The following four statements work if running interactively in the Spark shell
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  val jdbcSqlConn = "jdbc:sqlserver://...;databaseName=...;user=...;password=...;"
  val jdbcDf = sqlContext.read.format("jdbc").options(Map(
        "url" -> jdbcSqlConn,
        "dbtable" -> "testTable"
      )).load()
  jdbcDf.show(10)

  sc.stop()
}

However, both of the following spark-submit commands fail with the error shown below them.

spark-submit.cmd --class ConnTest --master local[4] .\target\scala-2.11\test_2.11-1.0.jar
spark-submit.cmd --class ConnTest --master local[4] .\target\scala-2.11\test_2.11-1.0.jar --jars \sqljdbc_6.0\enu\jre8\sqljdbc42.jar
Exception in thread "main" java.sql.SQLException: No suitable driver
        at java.sql.DriverManager.getDriver(Unknown Source)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:83)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
        at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:32)
        at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
        at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
        at ConnTest$.delayedEndpoint$ConnTest$1(main.scala:14)
        at ConnTest$delayedInit$body.apply(main.scala:6)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:34)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.App$$anonfun$main$1.apply(App.scala:76)
        at scala.collection.immutable.List.foreach(List.scala:381)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
        at scala.App$class.main(App.scala:76)
        at ConnTest$.main(main.scala:6)
        at ConnTest.main(main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Update: The Spark code works, and I can see the table contents, when I run the statements directly in the Spark shell.

Update 2: spark-submit did print the following message when run:

17/05/15 16:12:30 INFO SparkContext: Added JAR file:/C:/sqljdbc_6.0/enu/jre8/sqljdbc42.jar at spark://10.8.159.130:7587/jars/sqljdbc42.jar with timestamp 1494879150052

2
How about putting --jars ... BEFORE the application jar argument? I don't use Spark on Windows, but on Linux/Mac the application jar argument comes last, after the options. Usage: spark-submit [options] <app jar | python file> [app arguments], where [options] are things like --jars and [app arguments] are passed as args to the actual application (in the "main" function). - Garren S
I just tried moving the application jar to the last position and still got the error. - ca9163d9
What's returned for sc.getConf.get("spark.jars")? Just use println in your application. - Garren S
It returns res3: String = "". I ran it in the Spark shell. - ca9163d9

2 Answers

2
votes

Explicitly setting the "driver" option in the options map passed to the JDBC reader resolved the issue:

"driver" -> "com.microsoft.sqlserver.jdbc.SQLServerDriver",
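For context, the stack trace in the question shows JDBCOptions falling back to DriverManager.getDriver inside an Option.getOrElse, which suggests that the lookup by URL only happens when no "driver" option is given. A minimal sketch of the complete option map follows; the object name JdbcOptions and the helper forSqlServer are hypothetical names used only for illustration, while the keys and the driver class name come from the question and the answer above.

```scala
object JdbcOptions {
  // Driver class name taken from the answer above.
  val SqlServerDriver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"

  // Hypothetical helper: builds the option map for the "jdbc" data source,
  // naming the driver class explicitly instead of relying on a lookup by URL.
  def forSqlServer(url: String, table: String): Map[String, String] = Map(
    "url"     -> url,
    "dbtable" -> table,
    "driver"  -> SqlServerDriver
  )
}
```

In the question's code this would be used as sqlContext.read.format("jdbc").options(JdbcOptions.forSqlServer(jdbcSqlConn, "testTable")).load(). The driver jar still has to be on the driver's classpath for the class to load.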
1
votes

A few options to try:

A. Edit spark-defaults.conf and modify these fields:

spark.driver.extraClassPath /path/to/jar/*

spark.executor.extraClassPath /path/to/jar/*

B. Set path in code:

val conf = new SparkConf()
conf.set("spark.driver.extraClassPath", "/path/to/jar/*")
val sc = new SparkContext(conf)

C. Try with --jars=local: or --jars "C:\sqljdbc_6.0\enu\jre8\sqljdbc42.jar"

Adjust the jar path to match your setup, since you are running Spark on Windows.

spark-submit.cmd --class ConnTest --master local[4] --jars=local:C:\sqljdbc_6.0\enu\jre8\sqljdbc42.jar .\target\scala-2.11\test_2.11-1.0.jar
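To check whether any of these options actually put the driver jar on the driver's classpath, a small diagnostic along these lines may help. This is only a sketch: ClasspathCheck and isLoadable are hypothetical names, and the driver class name is the one used for the SQL Server JDBC driver elsewhere in this thread.

```scala
import scala.util.Try

object ClasspathCheck {
  // Returns true if the named class can be loaded from the current classpath.
  // Class.forName throws ClassNotFoundException when it cannot, which Try captures.
  def isLoadable(className: String): Boolean =
    Try(Class.forName(className)).isSuccess

  def main(args: Array[String]): Unit = {
    val driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
    println(s"$driver loadable: ${isLoadable(driver)}")
  }
}
```

Calling ClasspathCheck.isLoadable(...) at the top of the application, before the JDBC read, would show whether the failure is a missing jar or a driver that is present but not being selected for the URL.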