I’m finally dipping my toes in the Kubernetes pool and wanted to get some advice on the best way to approach a problem I have:
Tech we are using:
- GCP
- GKE
- GCP Pub/Sub
We need to do bursts of batch processing spread out across a fleet and have decided on the following approach:
- New raw data flows in
- A node analyses this data and breaks it up into manageable portions, which are pushed onto a queue (a Pub/Sub topic)
- We have a cluster with autoscaling on and a minimum size of 0
- A Kubernetes Job on this cluster spins up a pod for each new message
- When pods can’t pull any more messages, they terminate successfully (rough sketch of the worker loop below)
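
For context, here is a minimal sketch of what I imagine each worker pod doing, assuming the `google-cloud-pubsub` client and a subscription name passed in via environment variables (the variable names and `process()` helper are just placeholders, not settled design):

```python
# Hypothetical worker loop: pull Pub/Sub messages until the queue looks drained,
# then exit 0 so the Job pod counts as completed.
import os

from google.api_core import exceptions as gax_exceptions
from google.cloud import pubsub_v1

PROJECT_ID = os.environ["GCP_PROJECT"]           # assumed env var names
SUBSCRIPTION = os.environ["WORK_SUBSCRIPTION"]


def process(payload: bytes) -> None:
    # Placeholder for the real batch-processing work.
    print(f"processing {len(payload)} bytes")


def main() -> None:
    subscriber = pubsub_v1.SubscriberClient()
    sub_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION)

    while True:
        try:
            # Synchronous pull: grab up to 10 messages, waiting at most 30s.
            response = subscriber.pull(
                request={"subscription": sub_path, "max_messages": 10},
                timeout=30,
            )
        except gax_exceptions.DeadlineExceeded:
            break  # nothing arrived within the deadline -> assume queue is empty

        if not response.received_messages:
            break  # nothing left to do -> terminate successfully

        ack_ids = []
        for received in response.received_messages:
            process(received.message.data)
            ack_ids.append(received.ack_id)

        # Only ack after the work is done, so failed pods get messages redelivered.
        subscriber.acknowledge(
            request={"subscription": sub_path, "ack_ids": ack_ids}
        )


if __name__ == "__main__":
    main()
```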
The questions are:
- What is the standard approach for triggering Jobs like this?
- Do you create a new Job each time, or are Jobs meant to be long-lived and re-run?
- I have only seen examples that use a YAML file; however, we would probably want the node that did the portioning of the work to create the Job, since it knows how many parallel pods should run. Would it be recommended to use the Python SDK to create the Job spec programmatically (something like the sketch below)? Or, if Jobs are long-lived, would you simply hit the Kubernetes API to modify the number of parallel pods required and then re-run the Job?
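
To make that last question concrete, this is roughly what I had in mind with the official `kubernetes` Python client; the image name, namespace, completions setting, and parallelism value are all placeholders/assumptions on my part:

```python
# Hypothetical Job creation from the "portioning" node using the kubernetes
# Python client; names and images below are placeholders.
from kubernetes import client, config


def create_worker_job(parallelism: int, job_name: str = "batch-workers") -> None:
    # Inside the cluster you would use config.load_incluster_config() instead.
    config.load_kube_config()

    container = client.V1Container(
        name="worker",
        image="gcr.io/my-project/worker:latest",   # placeholder image
        env=[client.V1EnvVar(name="WORK_SUBSCRIPTION", value="work-items")],
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "batch-worker"}),
        spec=client.V1PodSpec(containers=[container], restart_policy="Never"),
    )
    job_spec = client.V1JobSpec(
        parallelism=parallelism,     # number of pods running at once
        completions=parallelism,     # one completion per pod (assumption)
        backoff_limit=4,
        template=template,
    )
    job = client.V1Job(
        api_version="batch/v1",
        kind="Job",
        metadata=client.V1ObjectMeta(name=job_name),
        spec=job_spec,
    )

    client.BatchV1Api().create_namespaced_job(namespace="default", body=job)


if __name__ == "__main__":
    # e.g. the portioning node decides 10 pods are needed for this burst
    create_worker_job(parallelism=10)
```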