
So I have created an HDInsight Spark Cluster. I want it to access Azure Data Lake Store.

To create the HDInsight Spark cluster I followed the instructions at https://azure.microsoft.com/en-gb/documentation/articles/data-lake-store-hdinsight-hadoop-use-portal, however there was no option in the Azure Portal to configure the AAD or add a Service Principal.

So my cluster was created using Azure Blob Storage only. Now I want to extend it to access Azure Data Lake Store. However, the "Cluster AAD Identity" dialog states "Service Principal: DISABLED", and all fields in the dialog are greyed out and disabled. I can't see any way to extend the storage to point to ADLS.

Any help would be appreciated! Thanks :-)


3 Answers


You can move your data from Blob storage to ADLS with Azure Data Factory, but you can't access ADLS directly from a Spark cluster.
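As a sketch of that Data Factory approach, a v1 copy pipeline with a Blob source and an ADLS sink could look like the fragment below. The pipeline, activity, and dataset names are placeholders, and the two datasets (plus their Azure Storage and Azure Data Lake Store linked services) would need to be defined separately:

```json
{
  "name": "CopyBlobToAdlsPipeline",
  "properties": {
    "description": "Copy data from Azure Blob Storage to Azure Data Lake Store",
    "activities": [
      {
        "name": "BlobToAdlsCopy",
        "type": "Copy",
        "inputs": [ { "name": "BlobInputDataset" } ],
        "outputs": [ { "name": "AdlsOutputDataset" } ],
        "typeProperties": {
          "source": { "type": "BlobSource" },
          "sink": { "type": "AzureDataLakeStoreSink" }
        }
      }
    ]
  }
}
```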


Which type of cluster did you create? On our Linux cluster, all the options listed in the guide you linked are available.


Please create a new Azure HDInsight cluster with a Service Principal, and make sure the Service Principal has access to your Data Lake Store account. It may be possible to reconfigure an existing cluster to use Data Lake Store, but that is very complicated and, in fact, undocumented. So the recommended approach is to create the cluster with a Service Principal from the start.
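If you deploy via an ARM template instead of the portal, the Service Principal is supplied through the cluster's `clusterIdentity` properties. A minimal sketch, assuming the parameter names (`applicationId`, `identityCertificate`, etc.) are defined elsewhere in your template:

```json
{
  "type": "Microsoft.HDInsight/clusters",
  "properties": {
    "clusterIdentity": {
      "clusterIdentity.applicationId": "[parameters('applicationId')]",
      "clusterIdentity.certificate": "[parameters('identityCertificate')]",
      "clusterIdentity.certificatePassword": "[parameters('identityCertificatePassword')]",
      "clusterIdentity.aadTenantId": "[concat('https://login.windows.net/', parameters('aadTenantId'))]",
      "clusterIdentity.resourceUri": "https://management.core.windows.net/"
    }
  }
}
```

The certificate here is the one registered against the AAD application backing the Service Principal; the cluster uses it to authenticate to Data Lake Store on your behalf.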