9 votes

I’m finally dipping my toes in the Kubernetes pool and wanted to get some advice on the best way to approach a problem I have:

Tech we are using:

  • GCP
  • GKE
  • GCP Pub/Sub

We need to do bursts of batch processing spread out across a fleet and have decided on the following approach:

  1. New raw data flows in
  2. A node analyses this and breaks the data up into manageable portions which are pushed onto a queue
  3. We have a cluster with Autoscaling On and Min Size ‘0’
  4. A Kubernetes Job on this cluster spins up a pod for each new message
  5. When pods can’t pull any more messages, they terminate successfully (rough sketch of the worker loop below)
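For reference, this is roughly how I picture the worker in steps 4 and 5. It’s just a sketch on my part; the project and subscription names are placeholders, and I’m assuming the synchronous pull API of google-cloud-pubsub 2.x:

```python
# Rough sketch of the worker loop (steps 4-5); names are placeholders.
# Assumes google-cloud-pubsub >= 2.x; the exact client API may differ by version.
from google.api_core import exceptions as gax_exceptions
from google.cloud import pubsub_v1

PROJECT_ID = "my-project"        # placeholder
SUBSCRIPTION = "work-items-sub"  # placeholder


def process(payload: bytes) -> None:
    # Do the actual batch work for one portion here.
    ...


def main() -> None:
    subscriber = pubsub_v1.SubscriberClient()
    sub_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION)

    while True:
        try:
            # Synchronous pull of a small batch of messages.
            response = subscriber.pull(
                request={"subscription": sub_path, "max_messages": 10},
                timeout=30,
            )
        except gax_exceptions.DeadlineExceeded:
            break  # nothing arrived within the timeout; treat the queue as drained

        if not response.received_messages:
            break  # queue is empty: exit 0 so the pod terminates successfully

        ack_ids = []
        for received in response.received_messages:
            process(received.message.data)
            ack_ids.append(received.ack_id)

        # Only acknowledge after successful processing.
        subscriber.acknowledge(
            request={"subscription": sub_path, "ack_ids": ack_ids}
        )


if __name__ == "__main__":
    main()
```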

The question is:

  • What is the standard approach for triggering jobs such as this?
    • Do you create a new job each time, or are jobs meant to be long-lived and re-run?
  • I have only seen examples that use a YAML file; however, we would probably want the node that did the portioning of the work to create the job, since it knows how many parallel pods should be run. Would it be recommended to use the Python SDK to create the job spec programmatically? Or, if jobs are long-lived, would you simply hit the k8s API to modify the number of parallel pods and then re-run the job?
This is a bit of a generic/design question IMHO and does not meet the usual standards of a question on SO. You will have to ask some specific questions and show the work you have done to get help. – Vishal Biyani
Sorry, but that architecture is horrible. You are trying to code with infrastructure, which will be very expensive and overkill. Your data pipeline problem is easily solvable by adopting Kafka in your pipeline: Stream-service -> kafka-consumer -> kafka-broker -> Multiple-kafka-consumers -> kafka-producer -> wherever you want. In this pipeline you can scale by increasing the number of consumers per consumer group or by adding partitions to your topic. – Rodrigo Loza
@RodrigoLoza: Your response is formulated in a very negative fashion. Additionally, your suggestion is highly opinionated and is neither right nor wrong; it is just one of a pool of potential solutions, and does not seem to have any significant advantages in this case. – chaosaffe
I agree, there are a billion ways to solve your problem. Build your app and check it out for yourself. There is a reason why most companies adopt this pipeline. – Rodrigo Loza
The design will depend heavily on a few things: How many jobs need to run in parallel? What latency can you afford (do you need the job to run as fast as possible and return a result)? How long does a job typically take (ms, seconds, minutes)? Spinning a pod up and down is not instantaneous; if your jobs take minutes it’s an option, but if a job takes less than a few seconds to run, spinning up a k8s Job for each one will end up being much slower. Have you looked at Cloud Functions for your workload? They take all the burden of scheduling/scaling off you, but they add some latency. – MrE

1 Answer

3 votes

Jobs in Kubernetes are meant to be short-lived and are not designed to be reused. Jobs are designed for run-once, run-to-completion workloads, and are typically assigned a specific task, e.g. processing a single queue item.

However, if you want to process multiple items from a work queue with a single instance, it is generally better to use a Deployment that scales a pool of workers which keep processing items from the queue, adjusting the number of workers to the number of items in the queue. If no work items remain, you can scale the Deployment down to 0 replicas and scale it back up when there is work to be done.
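As a rough illustration of that pattern (not a drop-in implementation; the Deployment name, namespace and worker cap are placeholders, and this assumes the official kubernetes Python client), scaling the worker pool to match the queue depth could look something like this:

```python
# Sketch: scale the worker Deployment to match the current queue depth.
# Assumes the official `kubernetes` Python client; all names are placeholders.
from kubernetes import client, config


def scale_workers(queue_depth: int, max_workers: int = 50,
                  name: str = "queue-worker", namespace: str = "default") -> None:
    config.load_kube_config()  # use config.load_incluster_config() when running in-cluster
    apps = client.AppsV1Api()

    # 0 replicas when the queue is empty, otherwise one worker per item up to a cap.
    replicas = min(queue_depth, max_workers)

    apps.patch_namespaced_deployment_scale(
        name=name,
        namespace=namespace,
        body={"spec": {"replicas": replicas}},
    )
```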

To create and control your workloads in Kubernetes, the best practice is to use the Kubernetes SDK (client library). While you could generate YAML files and shell out to another tool like kubectl, using the SDK simplifies configuration and error handling, and also makes it easier to introspect the resources in the cluster.
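For example, the node that partitions the work could create the Job directly and set parallelism to the number of pods it wants. A minimal sketch with the kubernetes Python client (the image, subscription and names are placeholders):

```python
# Sketch: create a Job programmatically instead of applying a YAML file.
# Assumes the official `kubernetes` Python client; image and names are placeholders.
from kubernetes import client, config


def create_processing_job(parallelism: int, namespace: str = "default") -> None:
    config.load_kube_config()  # use config.load_incluster_config() when running in-cluster
    batch = client.BatchV1Api()

    container = client.V1Container(
        name="worker",
        image="gcr.io/my-project/queue-worker:latest",  # placeholder image
        env=[client.V1EnvVar(name="SUBSCRIPTION", value="work-items-sub")],
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "queue-worker"}),
        spec=client.V1PodSpec(restart_policy="Never", containers=[container]),
    )
    job = client.V1Job(
        metadata=client.V1ObjectMeta(generate_name="process-batch-"),
        spec=client.V1JobSpec(parallelism=parallelism, template=template),
    )

    batch.create_namespaced_job(namespace=namespace, body=job)
```

Each pod would run the worker loop you described and exit once it can no longer pull messages, so the Job completes on its own.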