3
votes

I need to access Azure Files from Azure Databricks. According to the documentation, Azure Blob storage is supported, but I need this code to work with Azure Files:

dbutils.fs.mount(
  source = "wasbs://<your-container-name>@<your-storage-account-name>.file.core.windows.net",
  mount_point = "/mnt/<mount-name>",
  extra_configs = {"<conf-key>":dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>")})

Or is there another way to mount or access Azure Files from an Azure Databricks cluster? Thanks

2 Answers

4
votes

On Azure, you can generally mount an Azure Files file share on Linux via the SMB protocol. I tried to follow the official tutorial "Use Azure Files with Linux" by running its commands from a Python notebook, but it failed.

It seems that Azure Databricks does not allow this. I also searched the Databricks community for mounting NFS, SMB, Samba, etc., but found no discussion of it.

So the only way I found to access files in Azure Files is to install the azure-storage package and use the Azure Files SDK for Python directly on Azure Databricks.

1
votes

Install the library azure-storage-file-share: https://pypi.org/project/azure-storage-file-share/

# Upload to Azure File Share

from azure.storage.fileshare import ShareFileClient

# The two upper-case strings are placeholders: replace them with your real
# connection string and file share name.
file_client = ShareFileClient.from_connection_string(
    conn_str="AZURE_STORAGE_CONNECTION_STRING",
    share_name="AZURE_STORAGE_FILE_SHARE_NAME",
    file_path="summary_uploaded.csv",
)

# Read the local file from DBFS and upload it to the share.
with open("/dbfs/tmp/summary_to_upload.csv", "rb") as source_file:
    file_client.upload_file(source_file)

# Download from Azure File Share

file_client = ShareFileClient.from_connection_string(
    conn_str="AZURE_STORAGE_CONNECTION_STRING",
    share_name="AZURE_STORAGE_FILE_SHARE_NAME",
    file_path="summary_to_download.csv",
)

# Stream the remote file into a local file on DBFS.
with open("/dbfs/tmp/summary_downloaded.csv", "wb") as file_handle:
    data = file_client.download_file()
    data.readinto(file_handle)

Next steps:

  1. Define a new secret in Azure Key Vault to hold the value for ‘conn_str’ (AZURE_STORAGE_CONNECTION_STRING), for example under the key az-storage-conn-string.
  2. Define a new secret in Azure Key Vault to hold the value for ‘share_name’ (AZURE_STORAGE_FILE_SHARE_NAME), for example under the key az-storage-file-share.
  3. Read both secrets from Key Vault instead of hard-coding them.
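
The steps above could look roughly like the sketch below. It assumes the Key Vault is exposed to Databricks through a secret scope, so that the values can be read with dbutils.secrets.get as in the question. The helper names load_share_settings and make_file_client are hypothetical, and the scope name is a placeholder you would need to replace.

```python
def load_share_settings(secrets, scope):
    """Read the connection string and share name from a secret scope.

    `secrets` is expected to behave like `dbutils.secrets`, i.e. to offer
    get(scope=..., key=...). The key names are the ones suggested in the
    steps above.
    """
    return {
        "conn_str": secrets.get(scope=scope, key="az-storage-conn-string"),
        "share_name": secrets.get(scope=scope, key="az-storage-file-share"),
    }


def make_file_client(settings, file_path):
    """Build a ShareFileClient for one file from the loaded settings."""
    # Imported here so that load_share_settings works without the SDK installed.
    from azure.storage.fileshare import ShareFileClient

    return ShareFileClient.from_connection_string(
        conn_str=settings["conn_str"],
        share_name=settings["share_name"],
        file_path=file_path,
    )
```

In a notebook, this might be used as settings = load_share_settings(dbutils.secrets, "<scope-name>") followed by make_file_client(settings, "summary_uploaded.csv"), after which the upload and download calls stay the same as above.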