3
votes

I have a running Spark 2.3.1 cluster hosted on https://azuredatabricks.net. I have created a database with a permanent table, and I have been able to run queries through the notebook web interface.
Now I am looking for a way to query the same cluster from a .NET console application, and I am lost.

1. Is there a REST API that can be used to run SQL/Python queries?
2. How do I configure an ODBC connection string to connect to the cluster, and which ODBC drivers are known to work?

Ultimately, I am looking for a way to let users run one of several predefined parameterized queries against the Spark cluster through a web app or REST service written in JavaScript or .NET.

1
Hey @ViktorZ, how did you finally do this? - asds_asds

1 Answer

4
votes

To the best of my knowledge, there is not currently a way to query Databricks tables outside of the Databricks workspace.

Depending on what you are trying to accomplish, you could use the REST API to run a job (notebook or JAR) that executes your parameterized queries. This is described in the Databricks REST API documentation (https://docs.azuredatabricks.net/api/latest/jobs.html#run-now). If you need the query results in your .NET application, your options are limited. Your best bet is probably to write the results to a file in Data Lake Storage or Blob Storage and then read that file from your console application. You could pass the file name in as a job parameter from the console application, so you can easily retrieve the file after execution completes.
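As a rough illustration of the run-now approach, here is a minimal sketch in Python (the same HTTP call can be made from .NET with any HTTP client). It assumes the Jobs API is reachable at `/api/2.0/jobs/run-now` on your workspace URL and that the job is a notebook job that reads a parameter named `output_file`; the workspace URL, job ID, parameter name, and both function names are hypothetical placeholders, not something taken from the question.

```python
import json
import urllib.request


def build_run_now_request(workspace_url, token, job_id, notebook_params):
    """Build (but do not send) a POST request for the Jobs API run-now endpoint."""
    # Assumption: notebook_params are passed through to the notebook as parameters.
    body = json.dumps({"job_id": job_id, "notebook_params": notebook_params})
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.0/jobs/run-now",
        data=body.encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,  # Databricks access token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_job(workspace_url, token, job_id, notebook_params):
    """Send the request and return the parsed JSON response, which includes a run_id."""
    request = build_run_now_request(workspace_url, token, job_id, notebook_params)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_job(workspace_url, token, job_id, {"output_file": "result.csv"})` would start the job and tell the notebook where to write its results; the caller would then read that file from storage once the run has finished.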

To connect to the cluster from .NET, you would need to use a Databricks Access Token and the Authentication REST API (https://docs.azuredatabricks.net/api/latest/authentication.html).
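To show how the token is used in practice, here is a minimal Python sketch that sends it as a bearer token in the `Authorization` header. As an example call, it asks the Jobs API for the state of a run via `/api/2.0/jobs/runs/get`, which is one way to tell when a job has completed. The workspace URL, run ID, and function names are hypothetical, and the set of terminal states is my reading of the Jobs API, so verify it against the documentation.

```python
import json
import urllib.parse
import urllib.request


def build_run_status_request(workspace_url, token, run_id):
    """Build (but do not send) an authenticated GET request for a run's status."""
    query = urllib.parse.urlencode({"run_id": run_id})
    return urllib.request.Request(
        url=workspace_url.rstrip("/") + "/api/2.0/jobs/runs/get?" + query,
        headers={"Authorization": "Bearer " + token},  # token-based authentication
        method="GET",
    )


def is_run_finished(run_info):
    """Return True once the run's life_cycle_state is terminal (assumed state names)."""
    state = run_info.get("state", {}).get("life_cycle_state")
    return state in ("TERMINATED", "SKIPPED", "INTERNAL_ERROR")


def get_run_info(workspace_url, token, run_id):
    """Send the status request and return the parsed JSON response."""
    request = build_run_status_request(workspace_url, token, run_id)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

A console application could call `get_run_info` in a loop with a short sleep until `is_run_finished` returns `True`, and only then read the output file.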