
I want to connect to a Microsoft Azure SQL Server and an Azure SQL Database from my Azure Databricks notebook and run SELECT and INSERT statements.

Let's assume I have an Azure SQL Server named dev-sql-srv, reachable at the hostname dev-sql-srv.database.windows.net.

On it I have a SQL database named dev-sql-srv-db1, containing a table called CUSTOMER, for example.

I found this example in the official Azure Databricks documentation at https://docs.azuredatabricks.net/spark/latest/data-sources/sql-databases.html#spark-sql-example.

So I run:

%sql
CREATE TABLE CustomerT1
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:sqlserver://dev-sql-srv.database.windows.net:1433",
  table "dev-sql-srv-db1.CUSTOMER",
  user "myAdmin",
  password "myPassword"
)

When I execute this in my notebook, I receive the following error:

Error in SQL statement: IllegalArgumentException: requirement failed: Option 'dbtable' is required.
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: java.lang.IllegalArgumentException: requirement failed: Option 'dbtable' is required.
	at scala.Predef$.require(Predef.scala:224)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:68)
	at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:349)
	at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:80)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:72)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:81)
	at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
	at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:194)
	at org.apache.spark.sql.Dataset$$anonfun$53.apply(Dataset.scala:3320)
	at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:88)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:124)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3319)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:194)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:78)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:639)
	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:707)
	at com.databricks.backend.daemon.driver.SQLDriverLocal$$anonfun$1.apply(SQLDriverLocal.scala:87)

What is wrong with this example? My parameters are definitely all OK.


2 Answers

1 vote

You have to add the database name to the JDBC URL, and use the dbtable option instead of table:

%sql
CREATE TABLE CustomerT1
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:sqlserver://dev-sql-srv.database.windows.net:1433;database=dev-sql-srv-db1",
  dbtable "dbo.CUSTOMER",
  user "myAdmin",
  password "myPassword"
)
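
With the table mapped like this, SELECT and INSERT statements in the notebook should be pushed through JDBC to the Azure SQL database. A minimal sketch of both (the ID and NAME columns are hypothetical; substitute the actual columns of your CUSTOMER table):

%sql
-- Read rows from the underlying Azure SQL table
SELECT ID, NAME FROM CustomerT1

%sql
-- Append a new row to the underlying Azure SQL table
-- (42 and 'New Customer' are placeholder values)
INSERT INTO CustomerT1 VALUES (42, 'New Customer')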

0 votes

Please replace "table" with "dbtable". For example,

CREATE TEMPORARY TABLE jdbcTable
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:sqlserver://xxxxx.database.windows.net;databaseName=yyyydb01;user=someuser;password=secret",
  dbtable "SalesLT.Product"
)
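
As a side note, on more recent Spark runtimes the CREATE TEMPORARY TABLE ... USING syntax is deprecated in favor of CREATE TEMPORARY VIEW. A sketch of the equivalent form, keeping the same placeholder server and database names:

%sql
-- Same JDBC mapping, using the newer temporary-view syntax
CREATE TEMPORARY VIEW jdbcTable
USING org.apache.spark.sql.jdbc
OPTIONS (
  url "jdbc:sqlserver://xxxxx.database.windows.net;databaseName=yyyydb01",
  dbtable "SalesLT.Product",
  user "someuser",
  password "secret"
)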