
I am testing how an on-demand HDInsight cluster spins up and runs a Hive activity, so the pipeline contains only that one activity.

However, Azure Data Factory shows the message below when I debug the Hive activity against the on-demand HDInsight cluster: {"code":"BadRequest","message":null,"target":"pipeline//runid/cXXX-XXXX-XXXXX-1111","details":null,"error":null}

When I check the debug output under the pipeline runs, it says "Operation on target pipeline failed: Invalid linked service reference. Name: storage_linkedservice"

The linked service for the on-demand HDInsight cluster is configured as JSON through Dynamic Content, not through the UI. I took the JSON from the Microsoft docs: https://docs.microsoft.com/en-us/azure/data-factory/compute-linked-services#azure-hdinsight-on-demand-linked-service

That document explains how to attach additional storage accounts under the heading "additionalLinkedServiceNames JSON example".

I used the same JSON format to specify the additional storage account, and that is the part Data Factory rejects. The JSON is as follows.

"additionalLinkedServiceNames": [ { "referenceName": "storage_linkedservice", "type": "LinkedServiceReference" } ]

Does anyone know why the pipeline fails even though the format matches Microsoft's documentation?

Any reply is much appreciated.

Thanks.

I have updated my answer, but can you share the exact code that you are using and not just the syntax? Snips would help too. - KarthikBhyresh-MT

1 Answer


--Update

Refer: Error message - "code":"BadRequest", "message":"null"

Cause

This is a user error: the JSON payload sent to management.azure.com is corrupt. No logs are stored because the call never reached the ADF service layer.

Resolution

Trace the API call made from the ADF portal using the Edge/Chrome browser developer tools. You will see the offending JSON payload, which could be caused by a special character (for example $), spaces, or other kinds of user input. Once you fix the string expression, you can proceed with the rest of your ADF calls in the browser.
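
As a rough aid for that inspection step, the sketch below checks a captured request body locally. It assumes you have copied the payload out of the browser's network tab into a string. The function name check_payload and the set of suspect characters are illustrative assumptions, not part of any ADF tooling, and a flagged character is only a hint, since "$" can be perfectly valid inside a JSON string.

```python
import json
from typing import List

# Characters that tend to sneak into hand-edited dynamic content.
# This set is an assumption for illustration, not an official ADF list.
SUSPECT_CHARS = {"$", "\u00a0", "\u201c", "\u201d", "\t"}


def check_payload(raw: str) -> List[str]:
    """Return human-readable problems found in a captured JSON payload."""
    problems = []

    # First check: is the payload well-formed JSON at all?
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        problems.append(
            f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
        )

    # Second check: report characters worth a manual look.
    for offset, char in enumerate(raw):
        if char in SUSPECT_CHARS:
            problems.append(f"suspect character {char!r} at offset {offset}")

    return problems
```

An empty list means the payload parsed and none of the listed characters were found; it does not prove that ADF will accept the payload.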

--

Currently, you cannot specify an Azure Data Lake Storage (Gen 2) linked service for this property. If the HDInsight cluster has access to the Data Lake Store, you may access data in the Azure Data Lake Storage (Gen 2) from Hive/Pig scripts.

Please verify that the named linked service exists in your Azure subscription and that you can see the cluster in the Azure portal once it is up and running. Provisioning an on-demand Azure HDInsight cluster typically takes 20 minutes or more, so wait before you run debug.
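
One way to check the first point offline is to compare the names referenced in the linked service JSON against the linked services that actually exist in the factory. The sketch below is a minimal illustration under stated assumptions: the definition is already parsed into a Python dict, and the set of existing names is supplied by you (for example, typed in from the portal). The helper names iter_linked_service_refs and missing_refs are hypothetical.

```python
from typing import Any, Iterator, Set


def iter_linked_service_refs(node: Any) -> Iterator[str]:
    """Yield every referenceName whose type is LinkedServiceReference."""
    if isinstance(node, dict):
        if node.get("type") == "LinkedServiceReference" and "referenceName" in node:
            yield node["referenceName"]
        # Keep walking: references can be nested at any depth.
        for value in node.values():
            yield from iter_linked_service_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_linked_service_refs(item)


def missing_refs(definition: dict, existing_names: Set[str]) -> Set[str]:
    """Return referenced linked service names that are not in existing_names."""
    return set(iter_linked_service_refs(definition)) - existing_names
```

If missing_refs returns storage_linkedservice, the reference points at a name that is absent from the set you supplied, which would be consistent with the "Invalid linked service reference" message. The comparison is an exact string match, so a typo or a difference in spelling will show up as missing.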

A few pointers:

  • Verify that the credential in the linked service is valid and has permission to access the storage account.
  • Since you mention that you see this error during debug, can you check whether the same error occurs when you run the pipeline with a trigger?
  • The storage account must be a general-purpose standard Azure Storage account, and it must be in the same region as the HDInsight cluster. The cluster itself is created in the same region as the storage account specified by linkedServiceName.
  • In the on-demand HDInsight compute environment, additionalLinkedServiceNames supports only Blob storage; ADLS Gen1, ADLS Gen2, and Azure SQL DB are not supported.