I've set up a cluster with Databricks Runtime version 5.1 (includes Apache Spark 2.4.0, Scala 2.11) and Python 3. I also installed the hadoop-azure library (hadoop-azure-3.2.0) on the cluster.

I'm trying to read a blob stored in my Blob Storage account, which is just a text file containing some numeric data delimited by spaces. I used the template generated by Databricks for reading blob data:

    spark.conf.set(
      "fs.azure.account.key."+storage_account_name+".blob.core.windows.net",
      storage_account_access_key)
    df = spark.read.format(file_type).option("inferSchema", "true").load(file_location)

where file_location is the URL of my blob file (https://xxxxxxxxxx.blob.core.windows.net).

I get the following error:

    No filesystem named https

I tried using sc.textFile(file_location) to read the file into an RDD and got the same error.



3 Answers

2 votes

Your file_location should be in the format:

    "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>"

See: https://docs.databricks.com/spark/latest/data-sources/azure/azure-storage.html
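To make the format concrete, here is a small helper that assembles a wasbs:// URL from its parts. The container, account, and path names below are placeholders, not real resources:

    # Build the wasbs:// path Spark expects for Azure Blob Storage.
    # All names here are illustrative placeholders.
    def wasbs_path(container, storage_account, directory=""):
        """Return "wasbs://<container>@<account>.blob.core.windows.net/<dir>"."""
        base = "wasbs://{}@{}.blob.core.windows.net/".format(container, storage_account)
        return base + directory.lstrip("/")

    file_location = wasbs_path("my-container", "mystorageacct", "data/numbers.txt")
    # "wasbs://my-container@mystorageacct.blob.core.windows.net/data/numbers.txt"

Note the scheme is wasbs (not https): Spark dispatches on the URL scheme to pick a filesystem implementation, which is why an https:// URL produces "No filesystem named https".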

0 votes

These three lines of code worked for me:

    spark.conf.set("fs.azure.account.key.STORAGE_ACCOUNT.blob.core.windows.net", "BIG_KEY")
    df = spark.read.csv("wasbs://CONTAINER@STORAGE_ACCOUNT.blob.core.windows.net/")
    df.select('*').show()

NOTE that the second line ends with .net/ because I do not have a sub-folder.
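One more detail for the original question: the file is space-delimited rather than comma-delimited, so spark.read.csv will also need .option("sep", " ") alongside .option("inferSchema", "true"). Here is a quick local sketch of the equivalent parsing, useful for sanity-checking the data format before pointing Spark at the blob (the sample data is made up):

    # Locally mimic what Spark's csv reader with sep=" " and
    # inferSchema would do to space-delimited numeric lines.
    sample = "1.0 2.5 3.0\n4.0 5.5 6.0"

    rows = [[float(v) for v in line.split()] for line in sample.splitlines()]
    # rows == [[1.0, 2.5, 3.0], [4.0, 5.5, 6.0]]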