
I'm trying to create an Azure HDInsight cluster with Data Lake Store using a template deployment. The deployment fails, and I think the cause is the service principal integration with Azure Data Lake Store.

Error:

"message": "DeploymentDocument 'AmbariConfiguration_1_7' failed the validation. Error: 'Error while getting access to the datalake storage account demodls: Error while getting the OAuth token from AAD for AppPrincipalId XXXXXX-XXXXXXXXX-XXXXX-XXX-XXXXX.

Please see the screenshot below for more details.

[Screenshot of the deployment error]

I created an AD web app and assigned the "Owner" role to it. Then I made it an owner of the subscription and added the "Data Lake Permission" for the app, but I think I am still missing something.

Cluster Integration Snippet

"properties": {
                "clusterVersion": "[parameters('clusterVersion')]",
                "osType": "Linux",
                "tier": "standard",
                "clusterDefinition": {
                    "kind": "[parameters('clusterKind')]",
                    "configurations": {
                        "gateway": {
                            "restAuthCredential.isEnabled": true,
                            "restAuthCredential.username": "[parameters('clusterLoginUserName')]",
                            "restAuthCredential.password": "[parameters('clusterLoginPassword')]"
                        },
                        "core-site": {
                            "fs.defaultFS": "adl://home",
                            "dfs.adls.home.hostname": "demodls.azuredatalakestore.net",
                            "dfs.adls.home.mountpoint": "/clusters/democluster/"
                        },
                        "clusterIdentity": {
                            "clusterIdentity.applicationId": "XXXXX-XXXXX-XXXXX-XXXXX-XXXXX",
                            "clusterIdentity.certificate": "[parameters('identityCertificate')]",
                            "clusterIdentity.aadTenantId": "https://login.windows.net/XXXXXXXX-XXXX-XXXX-XXXXX-XXXXXXXXXX",
                            "clusterIdentity.resourceUri": "https://management.core.windows.net/",
                            "clusterIdentity.certificatePassword": "[parameters('identityCertificatePassword')]"
                        }
                    }
                },

I have a few questions:

  1. Should "SecureString" values such as the cluster password and the SSH password be given as plaintext in "parameter.json", or do I have to convert them to a SecureString and supply that value?

  2. Should the "identityCertificate" field be the base64-encoded content of the "Certificate.pfx" file, or do I have to convert that base64 string to a SecureString before putting it in parameter.json?

Any help is much appreciated. Thanks!

Regards


2 Answers

2
votes

identityCertificate should be the base64-encoded string representation of the contents of the certificate .pfx file. It's labeled as type SecureString in the ARM template definition file so that the plaintext isn't stored/returned when you fetch deployment histories going forward. Marking fields with SecureString helps make sure that passwords and other such fields don't get persisted in your deployment history.
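As a rough illustration of the point above, here is a minimal PowerShell sketch that base64-encodes a file's raw bytes and places the result in a parameter file as an ordinary JSON string. The helper name `ConvertTo-Base64FileContent`, the temporary stand-in file, and the `REPLACE_ME` password value are assumptions made for the example, not anything from the question. The parameter names mirror the ones used in this thread, and `securestring` is the type declared in the template, not an encoding applied to the value in the parameter file.

```powershell
# Hypothetical helper (not part of any Azure module): returns the
# base64 text of a file's raw bytes, e.g. a .pfx certificate.
function ConvertTo-Base64FileContent {
    param([Parameter(Mandatory)] [string] $Path)
    # Resolve to a full path first; .NET file APIs do not follow the
    # PowerShell working directory for relative paths.
    $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)
    return [System.Convert]::ToBase64String($bytes)
}

# Stand-in for a real .pfx so the sketch runs anywhere: a temp file
# holding four arbitrary bytes. Replace with the actual certificate path.
$pfxPath = [System.IO.Path]::GetTempFileName()
[System.IO.File]::WriteAllBytes($pfxPath, [byte[]](0x30, 0x82, 0x01, 0x0A))

$identityCertificate = ConvertTo-Base64FileContent -Path $pfxPath

# Parameter file in the usual ARM shape. The values are plain JSON
# strings; the password below is a placeholder, not a real value.
$parameterFile = [ordered]@{
    '$schema'      = 'https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#'
    contentVersion = '1.0.0.0'
    parameters     = [ordered]@{
        identityCertificate         = @{ value = $identityCertificate }
        identityCertificatePassword = @{ value = 'REPLACE_ME' }
    }
}
$json = $parameterFile | ConvertTo-Json -Depth 5
$json

# Clean up the stand-in file.
Remove-Item -LiteralPath $pfxPath
```

With a real certificate, only `$pfxPath` and the password value would change; the encoded string is the whole file content, private key included, so the resulting parameter file should be treated as a secret.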

One easy way to troubleshoot how you're authoring your cluster creation ARM template is to go to the Azure Portal, and create the cluster just as you want in the template. Just before clicking 'Create' on the 'Summary' step, download the ARM template to see what's getting deployed. There's a link next to 'Create' to do this.

I expect you'll notice differences in how you're specifying your primary ADLS account. Go with how it's configured in the downloaded ARM template, and you should be good to go.

0
votes

@Matt H

I downloaded the template that the portal generates when creating an HDInsight cluster, but it still won't work.

Here is my PowerShell script.

 # Create resources
 $resourceGroupName = "demoesprg"
 New-AzureRmResourceGroup -Name $resourceGroupName -Location "East US 2"
 $dataLakeStoreName = "demoespdls"
 New-AzureRmDataLakeStoreAccount -ResourceGroupName $resourceGroupName -Name $dataLakeStoreName -Location "East US 2"
 Test-AzureRmDataLakeStoreAccount -Name $dataLakeStoreName
 $myrootdir = "/"
 New-AzureRmDataLakeStoreItem -Folder -AccountName $dataLakeStoreName -Path $myrootdir/clusters/demoespcluster

 $templatefilepath = "C:\Azure-saml\template.json"
 $SSHpass = ConvertTo-SecureString -String "Demoesp1234$" -AsPlainText -Force

  # Create .pfx certificate
 $certFolder = "C:\Azure-saml\certs"
 $certFilePath = "$certFolder\demoespcert.pfx"
 $certStartDate = (Get-Date).Date
 $certStartDateStr = $certStartDate.ToString("MM/dd/yyyy")
 $certEndDate = $certStartDate.AddYears(1)
 $certEndDateStr = $certEndDate.ToString("MM/dd/yyyy")
 $certName = "demoespcert"
 $certPassword = "democert123$"
 $certPasswordSecureString = ConvertTo-SecureString $certPassword -AsPlainText -Force 
 $cert = New-SelfSignedCertificate -DnsName $certName -CertStoreLocation cert:\CurrentUser\My 
 $certThumbprint = $cert.Thumbprint
 $cert = (Get-ChildItem -Path cert:\CurrentUser\My\$certThumbprint) 
 Export-PfxCertificate -Cert $cert -FilePath $certFilePath -Password $certPasswordSecureString 
 $certificatePFX = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2($certFilePath, $certPasswordSecureString)
 $credential = [System.Convert]::ToBase64String($certificatePFX.GetRawCertData())

 # Create Active Directory application
 $application = New-AzureRmADApplication `
     -DisplayName "ESPSPN" `
     -HomePage "https://demoespcluster.hdinsight.net" `
     -IdentifierUris "https://demoespcluster.hdinsight.net" `
     -CertValue $credential  `
     -StartDate $certificatePFX.NotBefore  `
     -EndDate $certificatePFX.NotAfter 
 Start-Sleep -Seconds 20

 # Create service principal
 $applicationId = $application.ApplicationId
 $servicePrincipal = New-AzureRmADServicePrincipal -ApplicationId $applicationId
 $objectId = $servicePrincipal.Id

 # Assign permissions
 Set-AzureRmDataLakeStoreItemAclEntry -AccountName $dataLakeStoreName -Path / -AceType User -Id $objectId -Permissions All
 Set-AzureRmDataLakeStoreItemAclEntry -AccountName $dataLakeStoreName -Path /clusters -AceType User -Id $objectId -Permissions All
 Set-AzureRmDataLakeStoreItemAclEntry -AccountName $dataLakeStoreName -Path /clusters/demoespcluster -AceType User -Id $objectId -Permissions All


 # Run the deployment
 $tenantID = (Get-AzureRmContext).Tenant.TenantId
 $secureCert = [System.Convert]::ToBase64String((Get-Content $certFilePath -Encoding Byte))
 #$dsecureCert = ConvertTo-SecureString $secureCert -AsPlainText -Force

 New-AzureRmResourceGroupDeployment `
    -ResourceGroupName $resourceGroupName `
    -TemplateFile $templatefilepath `
    -identityCertificate $secureCert `
    -identityCertificatePassword $certPasswordSecureString `
    -clusterName  $certName `
    -clusterLoginPassword $SSHpass `
    -sshPassword $SSHpass `
    -servicePrincipalApplicationId $applicationId

Error:

New-AzureRmResourceGroupDeployment : 11:15:00 PM - DeploymentDocument 'AmbariConfiguration_1_7' failed the validation. Error: 'Error while getting access to the datalake storage account demoespdls: Access denied.

What am I missing here?

UPDATE: The script was right, but my self-signed certificate had a problem. Once I used a valid certificate, I was able to create the cluster successfully. Thanks!