
I want to access Azure Data Lake Storage Gen2 from an Azure Databricks cluster via a service principal, in order to get rid of storage account access keys.
I followed https://docs.microsoft.com/en-us/azure/databricks/data/data-sources/azure/azure-datalake-gen2#--mount-an-azure-data-lake-storage-gen2-account-using-a-service-principal-and-oauth-20
...but it says that the storage account access key is still used:

[screenshot of the documentation step that still passes the storage account access key]

So what's the purpose of service principals if storage account access keys are still required?
And the main question: is it possible to get rid of storage account access keys completely and use a service principal only?


1 Answer


This is a documentation bug; I'm currently working on an immediate fix.

The line should be dbutils.secrets.get(scope = "<scope-name>", key = "<key-name-for-service-credential>"), which retrieves the service credential that has been stored as a secret in a secret scope.
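For instance, as a minimal sketch (assuming the secret has already been stored in a scope named "chepra" under the key "service-credential", matching the second example below), retrieving it in a notebook looks like this:

service_credential = dbutils.secrets.get(scope = "chepra", key = "service-credential")
print(service_credential)  # Databricks masks secret values, so this prints "[REDACTED]" rather than the secret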

Python: Mount an Azure Data Lake Storage Gen2 filesystem by passing the credential values directly

configs = {"fs.azure.account.auth.type": "OAuth",
       "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
       "fs.azure.account.oauth2.client.id": "0xxxxxxxxxxxxxxxxxxxxxxxxxxf", #Enter <appId> = Application ID
       "fs.azure.account.oauth2.client.secret": "Arxxxxxxxxxxxxxxxxxxxxy7].vX7bMt]*", #Enter <password> = Client Secret created in AAD
       "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/72fxxxxxxxxxxxxxxxxxxxxxxxxb47/oauth2/token", #Enter <tenant> = Tenant ID
       "fs.azure.createRemoteFileSystemDuringInitialization": "true"}

dbutils.fs.mount(
source = "abfss://[email protected]/flightdata", #Enter <container-name> = filesystem name <storage-account-name> = storage name
mount_point = "/mnt/flightdata",
extra_configs = configs)
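Once the mount command returns, you can sanity-check it with an ordinary DBFS call (the path below is simply the mount point used above):

display(dbutils.fs.ls("/mnt/flightdata"))  # list the contents of the mounted container path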


Python: Mount an Azure Data Lake Storage Gen2 filesystem by passing the client secret from a secret scope using dbutils.secrets.

configs = {"fs.azure.account.auth.type": "OAuth",
           "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
           "fs.azure.account.oauth2.client.id": "06xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx0ef",
           "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope = "chepra", key = "service-credential"),
           "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/72xxxxxxxxxxxxxxxxxxxx011db47/oauth2/token"}

dbutils.fs.mount(
source = "abfss://[email protected]/flightdata", 
mount_point = "/mnt/flightdata",
extra_configs = configs)
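With the secret-scope variant, neither the access key nor the client secret ever appears in the notebook. As a minimal sketch of using the mount afterwards (the file name flights.csv and the CSV format are only assumptions for illustration):

df = spark.read.format("csv").option("header", "true").load("/mnt/flightdata/flights.csv")  # read a sample file from the mounted path
df.show(5)  # preview the first rows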


Hope this helps. Do let us know if you have any further queries.