1
votes

I have an application running on multiple k8s pods. Forgive my lack of knowledge about k8s pods; from my understanding, k8s routes incoming traffic to different pods, much like a proxy.

What happens if my application runs a cron job that fetches data? Will the cron job be called multiple times, once for each running pod? My concern is that there will be data duplication, because these pods will all fetch the same data.

My question is: how can I avoid data duplication when a cron job fetches data? Can these pods be configured to act like workers? Let's say the cron job fetches 500 records. Given 5 pods, each pod would fetch 100 records.

1 Answer

0
votes

Ideally, it should not work that way. The pods of a regular application should mostly run long-lived workloads such as API or socket servers.

For scheduled tasks, Kubernetes has a resource designed specifically for that purpose: the CronJob.

There are two related resources in Kubernetes: CronJobs and Jobs.

https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/

CronJobs are executed according to their cron schedule, while Jobs are executed once, when you create them from a YAML manifest.

A CronJob creates a Job at each scheduled run time. You can check both with kubectl get jobs and kubectl get cronjobs.
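As a rough illustration of a Job that runs once when it is created from YAML, a minimal manifest might look like the sketch below. The resource name, image, and command are hypothetical placeholders, not anything from the question.

```yaml
# Minimal one-off Job sketch. Names, image, and command are assumptions.
apiVersion: batch/v1
kind: Job
metadata:
  name: fetch-data-once                      # hypothetical name
spec:
  backoffLimit: 2                            # retry a failed pod at most twice
  template:
    spec:
      restartPolicy: Never                   # Jobs require Never or OnFailure
      containers:
        - name: fetcher
          image: example.com/fetcher:latest  # hypothetical image
          command: ["python", "fetch.py"]    # hypothetical entrypoint
```

Saved as, say, job.yaml, this would be started with kubectl apply -f job.yaml and would then show up in kubectl get jobs.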

"Can these pods be configured to act like workers? Let's say the cron job fetches 500 records. Given 5 pods, each pod would fetch 100 records."

This depends on your scenario. You can create a single CronJob that fetches all 500 records, so that only a single pod runs for each cron execution.
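A single CronJob of that kind could be sketched roughly as follows. The schedule, resource name, image, and command are assumptions for illustration; only the overall shape (a CronJob whose template runs one pod per execution) comes from the answer.

```yaml
# CronJob sketch: one scheduled Job, one pod per run. Values are assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: fetch-data                           # hypothetical name
spec:
  schedule: "0 * * * *"                      # assumed schedule: hourly, on the hour
  concurrencyPolicy: Forbid                  # skip a run if the previous one is still active
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: fetcher
              image: example.com/fetcher:latest   # hypothetical image
              command: ["python", "fetch.py"]     # hypothetical entrypoint
```

Setting concurrencyPolicy to Forbid is one way to keep two runs from overlapping. The Kubernetes documentation also notes that a scheduled Job can occasionally be created twice or skipped, so it is safer to make the fetch itself idempotent.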

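"My question is: how can I avoid data duplication when a cron job fetches data?"

Run that type of workload as a CronJob rather than inside the application pods. Each scheduled run then creates one Job, which by default runs a single pod, so the fetch happens once per schedule no matter how many application replicas are running.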

https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/