0 votes

Has anyone used SSMS v18.2 or Azure Data Studio to connect to a Databricks cluster and query Databricks tables and/or the Databricks File System (DBFS)?

I would like to know how to set this up so that a Databricks server shows up in the connections pane, and how to use PolyBase to connect to DBFS.

I can connect to ADLS using PolyBase commands as follows:

-- Scoped Credential
CREATE DATABASE SCOPED CREDENTIAL myScopedCredential
WITH
    IDENTITY = '<MyId>@https://login.microsoftonline.com/<Id2>/oauth2/token',
    SECRET = '<MySecret>';

-- External Data Source
CREATE EXTERNAL DATA SOURCE myDataSource
WITH
(
    TYPE = HADOOP,
    LOCATION = 'adl://mydatalakeserver.azuredatalakestore.net',
    CREDENTIAL = myScopedCredential
);

-- Is there something similar I can set up for DBFS?
-- What IDENTITY would be used for the scoped credential?
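
For reference, this is how I consume that ADLS data source today through an external table (the table, file format, and path names below are just examples):

-- External file format plus external table over the data source above
CREATE EXTERNAL FILE FORMAT myParquetFormat
WITH (FORMAT_TYPE = PARQUET);

CREATE EXTERNAL TABLE dbo.MyExternalTable
(
    Id   INT,
    Name NVARCHAR(100)
)
WITH
(
    LOCATION = '/data/mytable/',       -- folder path inside the data lake
    DATA_SOURCE = myDataSource,
    FILE_FORMAT = myParquetFormat
);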

2 Answers

0 votes

To my knowledge, you cannot connect to Azure Databricks from SQL Server 2019 using SSMS or Azure Data Studio.

Azure Databricks can work with a number of Azure data sources. For the complete list, see "Data sources for Azure Databricks".

The Spark connector for Microsoft SQL Server and Azure SQL Database enables these databases to act as input data sources and output data sinks for Spark jobs. It lets you use real-time transactional data in big data analytics and persist results for ad-hoc queries or reporting.
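
As a rough sketch of the idea (all server, database, table, and credential values below are placeholders, and the cluster needs the SQL Server JDBC driver or the Spark connector installed), a Databricks notebook can map a SQL Server table into Spark SQL over JDBC:

-- Minimal Spark SQL sketch: expose a SQL Server table to Spark over JDBC
CREATE TABLE sales_from_sqlserver
USING org.apache.spark.sql.jdbc
OPTIONS (
  url 'jdbc:sqlserver://myserver.database.windows.net:1433;database=MyDb',
  dbtable 'dbo.Sales',
  user 'myUser',
  password 'myPassword'
);

-- Query it like any other Spark table
SELECT * FROM sales_from_sqlserver LIMIT 10;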

For more details, see "Connecting to Microsoft SQL Server and Azure SQL Database with the Spark connector".

Hope this helps.

0 votes

This doesn't seem possible without third-party tools or a custom application; Databricks simply doesn't expose the protocols that SSMS and Azure Data Studio require.

There are third-party tools (e.g. from CData) that can help here. See this article: https://www.cdata.com/kb/tech/databricks-odbc-linked-server.rst
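
As a minimal sketch, assuming the ODBC driver is installed on the SQL Server machine and a system DSN has been configured (the DSN and linked server names below are hypothetical), the linked server side looks like this:

-- Create a linked server over the ODBC DSN via the generic MSDASQL provider
EXEC master.dbo.sp_addlinkedserver
    @server = N'DATABRICKS',          -- linked server name (hypothetical)
    @srvproduct = N'',
    @provider = N'MSDASQL',           -- Microsoft OLE DB Provider for ODBC
    @datasrc = N'CData Databricks';   -- the ODBC DSN name (hypothetical)

-- Pass a query through to Databricks
SELECT * FROM OPENQUERY(DATABRICKS, 'SELECT * FROM default.my_table LIMIT 10');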