
Currently I am running my app on an Azure VM, which connects to an HDInsight Spark cluster with the username hdfs.

I have created the HDInsight Spark cluster with the primary storage type set to Azure Storage, also known as WASB.

I believe the following methods are available to authenticate to WASB storage:

  1. Storage Access Keys - I get the storage key from the Azure Storage account and use it in core-site.xml as follows:

  <property>
    <name>fs.azure.account.key.StorageAccountName.blob.core.windows.net</name>
    <value>Storage access key here</value>
  </property>

  2. Shared Access Signature (SAS) -

SAS token generation form - (screenshot of the SAS generation form in the Azure portal)

SAS token and Blob, File, Queue, and File service URLs - (screenshot of the generated SAS token and service URLs)

How can I use the above SAS credentials in core-site.xml, like the Storage Access Keys, or where should they be used in my application?


1 Answer


According to the official documents below, you can use the Java libraries hadoop-azure and azure-storage with the account key or a SAS for Azure Blob Storage to access the resource URL wasb[s]://<containername>@<accountname>.blob.core.windows.net/<path> or hdfs://<namenodehost>/<path> on the HDInsight filesystem.

  1. Hadoop Azure Support: Azure Blob Storage
  2. Use HDFS-compatible Azure Blob storage with Hadoop in HDInsight
  3. Use Azure Storage Shared Access Signatures to restrict access to data with HDInsight
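For the SAS case, document 3 above describes adding the SAS token to the cluster's core-site configuration. A minimal sketch of such an entry, assuming a container-level SAS and using placeholder container and account names (check the document for the exact property name expected by your HDInsight version):

  <property>
    <name>fs.azure.sas.YOURCONTAINER.YOURACCOUNT.blob.core.windows.net</name>
    <value>SAS token here</value>
  </property>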

So if you want to use the HDFS API to access resources on HDInsight, please check the authentication configuration for Hadoop to know which kind of authentication can be used. Alternatively, you can directly use the Azure Storage Client SDK for Java with the account key or a SAS token to access these resources.
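As a rough illustration of the second option, here is a minimal sketch using the azure-storage Java SDK (com.microsoft.azure.storage) with a SAS token. The account name, container name, SAS token, and blob path are placeholders, not values from your cluster:

import java.net.URI;

import com.microsoft.azure.storage.StorageCredentialsSharedAccessSignature;
import com.microsoft.azure.storage.blob.CloudBlobContainer;
import com.microsoft.azure.storage.blob.CloudBlockBlob;

public class SasBlobExample {
    public static void main(String[] args) throws Exception {
        // Placeholders: replace with your own account, container, and SAS token.
        String accountName = "YOURACCOUNT";
        String containerName = "YOURCONTAINER";
        String sasToken = "SAS token copied from the portal";

        // Wrap the SAS token in storage credentials and point them at the container URI.
        StorageCredentialsSharedAccessSignature credentials =
                new StorageCredentialsSharedAccessSignature(sasToken);
        URI containerUri = new URI(
                String.format("https://%s.blob.core.windows.net/%s", accountName, containerName));
        CloudBlobContainer container = new CloudBlobContainer(containerUri, credentials);

        // Read a blob that the SAS grants access to (hypothetical path).
        CloudBlockBlob blob = container.getBlockBlobReference("example/data.txt");
        System.out.println(blob.downloadText());
    }
}

The same container reference can also be used to list or upload blobs, as long as the SAS was generated with the corresponding permissions.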