I am trying to run Pig jobs on a managed Dataproc cluster. I have several independent Pig jobs that run in parallel, and I have set continueOnFailure to true for each job. Now, if one of the jobs fails, all the others are stopped and the cluster is terminated. I don't want that: I want only the failing job to terminate and the other jobs to keep running as expected.
The YAML file through which I am instantiating the workflow template is below:
jobs:
- pigJob:
    continueOnFailure: true
    queryList:
      queries:
      - sh pqr.sh
  stepId: run-pig-pqr
- pigJob:
    continueOnFailure: true
    queryList:
      queries:
      - sh abc.sh
  stepId: run-pig-abc
placement:
  managedCluster:
    clusterName: batch-job
    config:
      gceClusterConfig:
        zoneUri: asia-south1-a
      masterConfig:
        machineTypeUri: n1-standard-8
        diskConfig:
          bootDiskSizeGb: 50
      workerConfig:
        machineTypeUri: n2-highcpu-64
        numInstances: 2
        diskConfig:
          bootDiskSizeGb: 50
      softwareConfig:
        imageVersion: 1.4-ubuntu18
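
One thing I am unsure about: the PigJob reference seems to describe continueOnFailure as controlling whether the remaining queries in a single job's queryList keep executing after one query fails, not whether the workflow's other jobs keep running. If that reading is right, the per-query behaviour would look like the sketch below (assuming, hypothetically, that both scripts could share one step, which is not what I want here):

jobs:
- pigJob:
    # continueOnFailure applies to the queries in this list:
    # if "sh pqr.sh" fails, "sh abc.sh" would still run.
    continueOnFailure: true
    queryList:
      queries:
      - sh pqr.sh
      - sh abc.sh
  stepId: run-pig-all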
I am instantiating the workflow (which creates the cluster) with:

gcloud dataproc workflow-templates instantiate-from-file --file $file-name.yaml
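
(In case the gcloud version matters: newer releases require an explicit region for Dataproc commands, so the full form would be something like the following; the region is my assumption, chosen to match the zone above.)

gcloud dataproc workflow-templates instantiate-from-file \
  --file=$file-name.yaml \
  --region=asia-south1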
Am I giving any wrong config in my YAML?
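
As a fallback I could pre-create a long-lived cluster and submit each script as an independent job, so that one failure cannot cancel the other; a rough sketch (assuming a cluster named batch-job already exists in asia-south1) would be:

# Hypothetical fallback: each submission is independent,
# so a failure in one does not stop the other.
gcloud dataproc jobs submit pig --cluster=batch-job --region=asia-south1 --execute='sh pqr.sh' &
gcloud dataproc jobs submit pig --cluster=batch-job --region=asia-south1 --execute='sh abc.sh' &
wait

But I would prefer to keep the managed-cluster workflow if the template itself can be fixed.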