
I have a storage account kagsa1 with a container cont1 inside, and I need it to be accessible (mounted) via Databricks.

If I use the storage account key from Key Vault, it works correctly:

configs = {
    "fs.azure.account.key.kagsa1.blob.core.windows.net":dbutils.secrets.get(scope = "kv-db1", key = "storage-account-access-key")
}

dbutils.fs.mount(
  source = "wasbs://[email protected]",
  mount_point = "/mnt/cont1",
  extra_configs = configs)

dbutils.fs.ls("/mnt/cont1")

...but if I try to connect using Azure Active Directory credentials (credential passthrough):

configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class":
        spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
}

dbutils.fs.ls("abfss://[email protected]/")

...it fails:

ExecutionError: An error occurred while calling z:com.databricks.backend.daemon.dbutils.FSUtils.ls.
: GET https://kagsa1.dfs.core.windows.net/cont1?resource=filesystem&maxResults=5000&timeout=90&recursive=false
StatusCode=403
StatusDescription=This request is not authorized to perform this operation using this permission.
ErrorCode=AuthorizationPermissionMismatch
ErrorMessage=This request is not authorized to perform this operation using this permission.

The Databricks workspace tier is Premium,
the cluster has the Azure Data Lake Storage credential passthrough option enabled,
the storage account has the hierarchical namespace option enabled,
the filesystem was initialized with

spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "true")
dbutils.fs.ls("abfss://[email protected]/")
spark.conf.set("fs.azure.createRemoteFileSystemDuringInitialization", "false")

and I have full access to the container in the storage account.
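If it matters, my understanding from the Databricks docs is that these passthrough configs would eventually be passed to dbutils.fs.mount via extra_configs, along these lines (a sketch of the mount I intend to create, not something I have been able to run yet):

configs = {
    "fs.azure.account.auth.type": "CustomAccessToken",
    "fs.azure.account.custom.token.provider.class":
        spark.conf.get("spark.databricks.passthrough.adls.gen2.tokenProviderClassName")
}

# Mount the ADLS Gen2 container using AAD credential passthrough
dbutils.fs.mount(
    source = "abfss://cont1@kagsa1.dfs.core.windows.net/",
    mount_point = "/mnt/cont1",
    extra_configs = configs)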

What am I doing wrong?


1 Answer


Note: When performing the steps in the Assign the application to a role section, make sure to assign the Storage Blob Data Contributor role to the service principal.

As part of the repro, I granted the Owner role to the service principal and tried to run dbutils.fs.ls("/mnt/azure/"); it returned the same error message as above.


Now I assigned the Storage Blob Data Contributor role to the service principal.


Finally, I was able to get the output without any error message after assigning the Storage Blob Data Contributor role to the service principal.
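For reference, the /mnt/azure mount in this repro was created with the service principal's OAuth credentials, roughly as in the tutorial linked below; the scope name, key name, and IDs here are placeholders rather than the exact values used:

configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-client-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope-name>", key="<service-credential-key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<directory-tenant-id>/oauth2/token"
}

# Mount the ADLS Gen2 filesystem with the service principal's OAuth credentials
dbutils.fs.mount(
    source = "abfss://<container-name>@<storage-account-name>.dfs.core.windows.net/",
    mount_point = "/mnt/azure",
    extra_configs = configs)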


For more details, refer to “Tutorial: Azure Data Lake Storage Gen2, Azure Databricks & Spark”.

Reference: Azure Databricks - ADLS Gen2 throws 403 error message.