
I'm looking at the Microsoft documentation here and here. I have created a Web App in Azure Active Directory to access the Data Lake Store.

From the Web App I have the Object ID, Application ID, and Key.

Looking at the documentation, I see this:

adlCreds = lib.auth(tenant_id = 'FILL-IN-HERE', client_secret = 'FILL-IN-HERE', client_id = 'FILL-IN-HERE', resource = 'https://datalake.azure.net/')

How do I use it to authenticate my code and run operations on the Data Lake Store?

here is my full test code:

## Use this for Azure AD authentication
from msrestazure.azure_active_directory import AADTokenCredentials

## Required for Azure Data Lake Store account management
from azure.mgmt.datalake.store import DataLakeStoreAccountManagementClient
from azure.mgmt.datalake.store.models import DataLakeStoreAccount

## Required for Azure Data Lake Store filesystem management
from azure.datalake.store import core, lib, multithread

# Common Azure imports
import adal
from azure.mgmt.resource.resources import ResourceManagementClient
from azure.mgmt.resource.resources.models import ResourceGroup

## Use these as needed for your application
import logging, getpass, pprint, uuid, time


## Declare variables
subscriptionId = 'FILL-IN-HERE'
adlsAccountName = 'FILL-IN-HERE'

tenant_id = 'FILL-IN-HERE'
client_secret = 'FILL-IN-HERE'
client_id = 'FILL-IN-HERE'


## adlCreds = lib.auth(tenant_id = 'FILL-IN-HERE', client_secret = 'FILL-IN-HERE', client_id = 'FILL-IN-HERE', resource = 'https://datalake.azure.net/')
from azure.common.credentials import ServicePrincipalCredentials
adlCreds = lib.auth(tenant_id, client_secret, client_id, resource = 'https://datalake.azure.net/')


## Create a filesystem client object
adlsFileSystemClient = core.AzureDLFileSystem(adlCreds, store_name=adlsAccountName)

## Create a directory
adlsFileSystemClient.mkdir('/mysampledirectory')

When I try to run the code, I get this error:

[Running] python "c:....\dls.py"
Traceback (most recent call last):
  File "c:....\dls.py", line 38, in <module>
    adlCreds = lib.auth(tenant_id, client_secret, client_id, resource = 'https://datalake.azure.net/')
  File "C:\Python36\lib\site-packages\azure\datalake\store\lib.py", line 130, in auth
    password, client_id)
  File "C:\Python36\lib\site-packages\adal\authentication_context.py", line 145, in acquire_token_with_username_password
    return self._acquire_token(token_func)
  File "C:\Python36\lib\site-packages\adal\authentication_context.py", line 109, in _acquire_token
    return token_func(self)
  File "C:\Python36\lib\site-packages\adal\authentication_context.py", line 143, in token_func
    return token_request.get_token_with_username_password(username, password)
  File "C:\Python36\lib\site-packages\adal\token_request.py", line 280, in get_token_with_username_password
    self._user_realm.discover()
  File "C:\Python36\lib\site-packages\adal\user_realm.py", line 152, in discover
    raise AdalError(return_error_string, error_response)
adal.adal_error.AdalError: User Realm Discovery request returned http error: 404 and server response:

404 - File or directory not found.

Server Error

404 - File or directory not found.

The resource you are looking for might have been removed, had its name changed, or is temporarily unavailable.

[Done] exited with code=1 in 1.216 seconds

1 Answer


There are two different ways of authenticating. The first is interactive, which is suitable for end users and even works with multi-factor authentication. It requires an interactive session, because you have to log on when prompted:

from azure.datalake.store import core, lib, multithread
token = lib.auth()

The second method is to use a service principal identity (SPI) in Azure Active Directory. A step-by-step tutorial for setting up an Azure AD application, retrieving the client ID and secret, and configuring access using the SPI is available here: https://docs.microsoft.com/en-us/azure/data-lake-store/data-lake-store-service-to-service-authenticate-using-active-directory#create-an-active-directory-application

Once you have those values, pass them to lib.auth as keyword arguments:

from azure.common.credentials import ServicePrincipalCredentials
token = lib.auth(tenant_id = '<your azure tenant id>', client_secret = '<your client secret>', client_id = '<your client id>')
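The traceback in the question passes through acquire_token_with_username_password, which suggests that the positional call bound client_secret and client_id to the username and password parameters. The sketch below illustrates that binding with a hypothetical stand-in function, auth_stub. It is not the real lib.auth: the parameter order (tenant_id, username, password, client_id, client_secret) and the flow selection are assumptions inferred from the traceback, so check them against your installed version of azure-datalake-store.

```python
# Hypothetical stand-in for lib.auth. The parameter order here is an
# assumption inferred from the traceback, not a copy of the library code.
def auth_stub(tenant_id=None, username=None, password=None,
              client_id=None, client_secret=None,
              resource='https://datalake.azure.net/'):
    # Simplified flow selection: username/password wins if both are set.
    if username is not None and password is not None:
        return 'username/password flow'
    if client_id is not None and client_secret is not None:
        return 'service principal flow'
    return 'interactive flow'


tenant_id = 'FILL-IN-HERE'
client_secret = 'FILL-IN-HERE'
client_id = 'FILL-IN-HERE'

# Positional call, as in the question: the second and third arguments
# land in username and password.
positional = auth_stub(tenant_id, client_secret, client_id)

# Keyword call, as in this answer: each value reaches the intended parameter.
keyword = auth_stub(tenant_id=tenant_id,
                    client_secret=client_secret,
                    client_id=client_id)

print(positional)
print(keyword)
```

If that assumption holds, changing the call in the question to use keyword arguments should select the service principal flow instead of the username/password one.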

Here is a blog post that shows how to access the store through pandas and Jupyter. It also has a step-by-step guide on how to get the authentication token: https://medium.com/azure-data-lake/using-jupyter-notebooks-and-pandas-with-azure-data-lake-store-48737fbad305