
I am not able to run an HDInsight Spark Python activity in Data Factory against a Spark/HDInsight cluster. Do I need to change any configuration on the default Azure Spark cluster? I have tried creating a dedicated YARN queue, and the job does appear in the new queue, but I still get the same error. The error seems to be common, yet none of the fixes I have found resolve it. Jupyter notebooks with Python work fine on the same cluster, but jobs submitted through Data Factory get stuck. The correct .py file shows up in the YARN logs, so the job is being picked up correctly.
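For reference, the activity definition looks roughly like this (a minimal sketch in the ADF v2 JSON shape; HDInsightLinkedService, StorageLinkedService, the paths, and the adfqueue queue name are placeholders for my setup):

    {
        "name": "SparkPythonActivity",
        "type": "HDInsightSpark",
        "linkedServiceName": {
            "referenceName": "HDInsightLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "rootPath": "adfspark",
            "entryFilePath": "pyFiles/main.py",
            "sparkJobLinkedService": {
                "referenceName": "StorageLinkedService",
                "type": "LinkedServiceReference"
            },
            "sparkConfig": {
                "spark.yarn.queue": "adfqueue"
            }
        }
    }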

The ResourceManager diagnostics for the stuck application show:

    [Fri Sep 29 02:46:07 +0000 2017] Application is Activated, waiting for resources to be assigned for AM. Details : AM Partition = <DEFAULT_PARTITION> ; Partition Resource = <memory:172032, vCores:48> ; Queue's Absolute capacity = 50.0 % ; Queue's Absolute used capacity = 0.0 % ; Queue's Absolute max capacity = 50.0 %
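Reading those numbers: the queue's absolute capacity is 50% of the partition resource <memory:172032, vCores:48>, i.e., about 86016 MB and 24 vCores are available to it, yet its absolute used capacity is 0.0%. In other words, YARN is allocating nothing at all, not even the small ApplicationMaster container.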

1 Answer


You don't need to change any configuration on the default Azure Spark cluster for ADF to submit jobs to it. The diagnostic message "waiting for AM container to be allocated, launched and register with RM" indicates that your Spark cluster is not in a healthy state: YARN accepts the application but cannot allocate resources for its ApplicationMaster. Check your cluster's memory and disk settings, as described in: MapReduce job hangs, waiting for AM container to be allocated
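If you want to confirm this from the cluster itself, you can SSH to a head node and inspect YARN directly (a sketch; it assumes the standard YARN CLI that ships on HDInsight head nodes):

    # List all NodeManagers with their state; UNHEALTHY or LOST nodes
    # explain why no capacity is usable even though the queue is idle
    yarn node -list -all

    # Show applications stuck in ACCEPTED, i.e. still waiting for an AM
    yarn application -list -appStates ACCEPTED

If the nodes report as RUNNING and healthy, another setting worth a look is yarn.scheduler.capacity.maximum-am-resource-percent in capacity-scheduler.xml (default 0.1), which caps the share of queue capacity that ApplicationMasters may consume; too low a value can leave AMs queued even on an otherwise idle cluster.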