I am trying to run a Spark step on AWS Data Pipeline, and I am getting the following exception:
amazonaws.datapipeline.taskrunner.TaskExecutionException: Failed to complete EMR transform.
    at amazonaws.datapipeline.activity.EmrActivity.runActivity(EmrActivity.java:67)
    at amazonaws.datapipeline.objects.AbstractActivity.run(AbstractActivity.java:16)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeRemoteRunner(TaskPoller.java:136)
    at amazonaws.datapipeline.taskrunner.TaskPoller.executeTask(TaskPoller.java:105)
    at amazonaws.datapipeline.taskrunner.TaskPoller$1.run(TaskPoller.java:81)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.executeWork(PollWorker.java:76)
    at private.com.amazonaws.services.datapipeline.poller.PollWorker.run(PollWorker.java:53)
    at java.lang.Thread.run(Thread.java:748)
Caused by: amazonaws.datapipeline.taskrunner.TaskExecutionException: EMR job '@DefaultEmrActivity1_2017-11-20T12:13:08_Attempt=1' with jobFlowId 'j-2E7PU1OK3GIJI' is failed with status 'FAILED' and reason 'Cluster ready after last step completed.'. Step 'df-0693981356F3KEDFQ6GG_@DefaultEmrActivity1_2017-11-20T12:13:08_Attempt=1' is in status 'FAILED' with reason 'null'
    at amazonaws.datapipeline.cluster.EmrUtil.runSteps(EmrUtil.java:286)
    at amazonaws.datapipeline.activity.EmrActivity.runActivity(EmrActivity.java:63)
    ... 7 more
The cluster is getting spun up correctly.
Here is a screenshot of the pipeline:
I think there is an issue with the 'step' field of the activity. Any input would be helpful.
Step 'df-0693981356F3KEDFQ6GG_@DefaultEmrActivity1_2017-11-20T12:13:08_Attempt=1' is in status 'FAILED' with reason 'null'. Can you access the logs on S3? – Alexandre Dupriez
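For anyone trying to follow that suggestion, here is a rough Python sketch of one way to find where the failed step's logs might be. It assumes the cluster was launched with a log URI, that boto3 is installed with credentials configured, and that EMR's usual layout of <log-uri>/<cluster-id>/steps/<step-id>/ applies. The function names are made up for illustration and are not part of Data Pipeline or boto3.

```python
import re


def parse_job_flow_id(message):
    """Extract the EMR cluster id (jobFlowId) from a Data Pipeline error message."""
    match = re.search(r"jobFlowId '(j-[A-Z0-9]+)'", message)
    return match.group(1) if match else None


def step_log_prefix(log_uri, cluster_id, step_id):
    """Build the S3 prefix where EMR normally writes a step's stderr/stdout/controller logs."""
    return f"{log_uri.rstrip('/')}/{cluster_id}/steps/{step_id}/"


def failed_step_log_prefixes(cluster_id, region_name=None):
    """Return (step name, log prefix) pairs for FAILED steps on the cluster.

    Assumption: logging to S3 was enabled for the cluster; otherwise
    there is no LogUri and an empty list is returned.
    """
    import boto3  # imported here so the helpers above work without boto3

    emr = boto3.client("emr", region_name=region_name)
    log_uri = emr.describe_cluster(ClusterId=cluster_id)["Cluster"].get("LogUri")
    if not log_uri:
        return []
    steps = emr.list_steps(ClusterId=cluster_id, StepStates=["FAILED"])["Steps"]
    return [(s["Name"], step_log_prefix(log_uri, cluster_id, s["Id"])) for s in steps]
```

Calling failed_step_log_prefixes(parse_job_flow_id(error_text)) should give the prefixes to look under. The stderr file there is usually the first place to check when the step's failure reason is 'null'.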