0 votes

How can I delete failed jobs in a Kubernetes cluster on GKE using a CronJob? When I tried to delete the failed jobs using the following YAML, it deleted all the jobs (including running ones):


apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: XXX
  namespace: XXX
spec:
  schedule: "*/30 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: XXX
          containers:
          - name: kubectl-runner
            image: bitnami/kubectl:latest
            command: ["sh", "-c", "kubectl delete jobs $(kubectl get jobs | awk '$2 ~ 1/1' | awk '{print $1}')"]
          restartPolicy: OnFailure
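
A likely reason this deleted running Jobs as well: in awk, 1/1 is evaluated arithmetically to the number 1, which is then used as the regular expression /1/, so any COMPLETIONS value containing the digit 1 (such as 0/1 for a still-running Job) matches. A minimal sketch with made-up job names:

# Both names are printed, even though only the second Job has completed
printf 'job-running   0/1\njob-done      1/1\n' | awk '$2 ~ 1/1' | awk '{print $1}'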
How is this job getting triggered? If by cron, then you can set .spec.failedJobsHistoryLimit. And if these are normal Jobs, can't you just check the COMPLETIONS field? – yogesh kunjir
I agree with what was said by user yogesh kunjir. You should be able to set a limit for your failed jobs in the CronJob. You can also look at this Stack Overflow answer: stackoverflow.com/questions/53539576/… . You will need to modify it to support "Failed" Jobs. – Dawid Kruk
@yogeshkunjir Actually, the above YAML is a CronJob, which is trying to delete failed normal Jobs. And I have already tried the following, but it's not deleting the jobs: kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.Failed==1)].metadata.name}') – sundara moorthy k s

1 Answer

1 vote

To delete failed Jobs in GKE you will need to use the following command:

  • $ kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}')

This command queries the JSON for all Jobs and, using a jsonpath expression, selects the names of Jobs whose status.failed field is set to 1. It then passes those names to $ kubectl delete job.
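
If you want to check which Jobs this selector matches before deleting anything, the jsonpath query can be run on its own first:

# Prints only the names of Jobs whose status.failed counter equals 1,
# and nothing at all if no Job has failed
$ kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}'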


When run inside a CronJob, this command will fail if there are no Jobs with status: failed.

As a workaround you can use:

command: ["sh", "-c", "kubectl delete job --ignore-not-found=true $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}'); exit 0"]

exit 0 was added to make sure that the Pod exits with status code 0.
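
For reference, a CronJob using this workaround could look roughly like the sketch below. The name is a placeholder, the apiVersion mirrors the one used in the question (newer clusters use batch/v1), and the referenced ServiceAccount still needs RBAC permissions to get and delete Jobs:

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: cleanup-failed-jobs   # placeholder name
  namespace: XXX
spec:
  schedule: "*/30 * * * *"
  failedJobsHistoryLimit: 1
  successfulJobsHistoryLimit: 1
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: XXX   # must be allowed to get and delete Jobs
          containers:
          - name: kubectl-runner
            image: bitnami/kubectl:latest
            command: ["sh", "-c", "kubectl delete job --ignore-not-found=true $(kubectl get job -o=jsonpath='{.items[?(@.status.failed==1)].metadata.name}'); exit 0"]
          restartPolicy: OnFailure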


Regarding some of the comments made under the question:

You will need to modify it to support "Failed" Jobs.

I have already tried the following, but it's not deleting the jobs: kubectl delete job $(kubectl get job -o=jsonpath='{.items[?(@.status.Failed==1)].metadata.name}')

  • @.status.Failed==1 <-- incorrect, as JSON field names are case-sensitive
  • @.status.failed==1 <-- correct (see the quick check below)
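
To verify which field name actually exists, you can query a single Job's status counter directly (the Job name job-four is taken from the example further down):

# Prints the number of failed Pods for the Job (e.g. 1), because the field is the lowercase "failed"
$ kubectl get job job-four -o=jsonpath='{.status.failed}'

# Prints nothing, since the Job status has no field called "Failed"; this is why the incorrect
# expression never selects any Jobs
$ kubectl get job job-four -o=jsonpath='{.status.Failed}'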

If you were to run the incorrect version of this command against the following Pods (listed here to show that they failed and aren't still running to completion):

NAME              READY   STATUS      RESTARTS   AGE
job-four-9w5h9    0/1     Error       0          5s
job-one-n9trm     0/1     Completed   0          6s
job-three-nhqb6   0/1     Error       0          5s
job-two-hkz8r     0/1     Error       0          6s

You should get the following error:

error: resource(s) were provided, but no name, label selector, or --all flag specified

The above error will also appear when no Job names are passed to $ kubectl delete job.

Running the correct version of this command should delete all Jobs that failed:

job.batch "job-four" deleted
job.batch "job-three" deleted
job.batch "job-two" deleted

I encourage you to check additional resources: