2
votes

I have an Azure Data Lake Gen2 with public endpoint and a standard Azure ML instance. I have created both components with my user and I am listed as Contributor.

I want to use data from this data lake in Azure ML.

I have added the data lake as a Datastore using Service Principal authentication.

I then try to create a Tabular Dataset using the Azure ML GUI I get the following error:

Access denied You do not have permission to the specified path or file.

{
  "message": "ScriptExecutionException was caused by StreamAccessException.\n  StreamAccessException was caused by AuthenticationException.\n    'AdlsGen2-ListFiles (req=1, existingItems=0)' for '[REDACTED]' on storage failed with status code 'Forbidden' (This request is not authorized to perform this operation using this permission.), client request ID '1f9e329b-2c2c-49d6-a627-91828def284e', request ID '5ad0e715-a01f-0040-24cb-b887da000000'. Error message: [REDACTED]\n"
}

I have tried having our Azure Portal Admin, with Admin access to both Azure ML and Data Lake try the same and she gets the same error.

I tried creating the Dataset using Python sdk and get a similar error:

ExecutionError: 
Error Code: ScriptExecution.StreamAccess.Authentication
Failed Step: 667ddfcb-c7b1-47cf-b24a-6e090dab8947
Error Message: ScriptExecutionException was caused by StreamAccessException.
  StreamAccessException was caused by AuthenticationException.
    'AdlsGen2-ListFiles (req=1, existingItems=0)' for 'https://mydatalake.dfs.core.windows.net/mycontainer?directory=mydirectory/csv&recursive=true&resource=filesystem' on storage failed with status code 'Forbidden' (This request is not authorized to perform this operation using this permission.), client request ID 'a231f3e9-b32b-4173-b631-b9ed043fdfff', request ID 'c6a6f5fe-e01f-0008-3c86-b9b547000000'. Error message: {"error":{"code":"AuthorizationPermissionMismatch","message":"This request is not authorized to perform this operation using this permission.\nRequestId:c6a6f5fe-e01f-0008-3c86-b9b547000000\nTime:2020-11-13T06:34:01.4743177Z"}}
| session_id=75ed3c11-36de-48bf-8f7b-a0cd7dac4d58

I have created Datastore and Datasets of both a normal blob storage and a managed sql database with no issues and I have only contributor access to those so I cannot understand why I should not be Authorized to add data lake. The fact that our admin gets the same error leads me to believe there are some other issue.

I hope you can help me identify what it is or give me some clue of what more to test.

Edit: I see I might have duplicated this post: How to connect AMLS to ADLS Gen 2? I will test that solution and close this post if it works

1

1 Answers

1
votes

This was actually a duplicate of How to connect AMLS to ADLS Gen 2?.

The solution is to give the service principal that Azure ML uses to access the data lake the Storage Blob Data Reader access. And note you have to wait at least some minutes for this to have effect.