
I have a Kubernetes pod which downloads several types of files (say X, Y and Z), and I have some processing scripts (each in its own Docker image), each of which is interested in one or more file types (say processor_X_and_Y, processor_X_and_Z and processor_Z).

The downloader pod is always running, and after it downloads a file I need to create a processor pod for each processor interested in that file type. For example, if the downloader downloads a file of type Z, I need to create a new instance of processor_X_and_Z and a new instance of processor_Z.
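The fan-out described above is essentially a lookup table from file type to interested processors. A minimal sketch (the mapping is inferred from the processor names in the question; the function name is hypothetical):

```python
# Mapping from downloaded file type to the processor images that should be
# launched for it, derived from the names processor_X_and_Y,
# processor_X_and_Z and processor_Z in the question.
PROCESSORS = {
    "X": ["processor_X_and_Y", "processor_X_and_Z"],
    "Y": ["processor_X_and_Y"],
    "Z": ["processor_X_and_Z", "processor_Z"],
}

def processors_for(file_type: str) -> list[str]:
    """Return the processor images interested in the given file type."""
    return PROCESSORS.get(file_type, [])
```

The downloader would call `processors_for` once per downloaded file and launch one pod (or workflow) per returned image.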

My current idea is to use Argo Workflows: create a simple one-step workflow for each processor, then start the appropriate workflows by calling the Argo REST API from the downloader pod. This achieves my goal and lets the system auto-scale.
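For reference, the submission body such a call would carry can be sketched as below. This builds a minimal single-step Workflow in the shape of the argoproj.io/v1alpha1 CRD; the Argo Server address and the exact endpoint path are assumptions, so check them against your Argo Server's API reference before relying on this:

```python
# Assumed in-cluster address of the Argo Server; adjust for your install.
ARGO_SERVER = "http://argo-server.argo:2746"

def workflow_payload(processor_image: str) -> dict:
    """Build a minimal single-step Workflow submission body.

    The field names follow Argo's Workflow CRD, but treat this as a
    sketch rather than a verified manifest.
    """
    name_prefix = processor_image.lower().replace("_", "-")
    return {
        "workflow": {
            "metadata": {"generateName": f"{name_prefix}-"},
            "spec": {
                "entrypoint": "process",
                "templates": [
                    {
                        "name": "process",
                        "container": {"image": processor_image},
                    }
                ],
            },
        }
    }

# To submit (assumed endpoint; verify against your Argo Server version):
#   import json, urllib.request
#   req = urllib.request.Request(
#       f"{ARGO_SERVER}/api/v1/workflows/default",
#       data=json.dumps(workflow_payload("processor_Z")).encode(),
#       headers={"Content-Type": "application/json"})
#   urllib.request.urlopen(req)
```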

My question is: is there another, simpler engine or built-in Kubernetes service I can use to create a new pod from another pod, without using this workflow engine?

How come you don't use the Kubernetes API to create your pods? It seems like you are using a CI/CD tool to manage your administrative workloads, and I'm not sure that would be the right tool for the job. – nodox
@nodox Argo does have a CI/CD tool, but there are also some other features that may be relevant. – Michael Crenshaw
It might be a more scalable approach to put the individual jobs into a queue system like RabbitMQ, and have workers consume the jobs from the queue. You don't need to deal with Kubernetes specifics or RBAC to test out this approach in a development environment, and you don't risk flooding your cluster when you suddenly get 10,000 jobs all at once. – David Maze

2 Answers


You simply have to give your pod access to the API server running on the control plane. That will enable it to create/edit/delete pods using kubectl or any Kubernetes client library. You should use RBAC to restrict its permissions to the minimum required for the task at hand.
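A sketch of the RBAC side, limiting the downloader's service account to creating and reading pods in its own namespace (all names here are illustrative):

```yaml
# ServiceAccount assigned to the downloader pod via spec.serviceAccountName.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: downloader
---
# Role granting only the pod operations the downloader needs.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-launcher
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create", "get", "list", "watch"]
---
# Bind the role to the downloader's service account.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: downloader-pod-launcher
subjects:
- kind: ServiceAccount
  name: downloader
roleRef:
  kind: Role
  name: pod-launcher
  apiGroup: rbac.authorization.k8s.io
```

With this in place, the downloader's in-pod credentials can create processor pods but nothing else.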


As mentioned in another answer, you can give your pod access to the Kubernetes API and then apply a Pod resource via kubectl.

If you want to start an Argo Workflow, you could use kubectl to apply a Workflow resource, or you could use the Argo CLI.

But if you're using Argo anyway, you might find it easier to use Argo Events to kick off a Workflow. You would have to choose an event source based on how/from where you're downloading the source files. If, for example, the files are on S3, you could use the SNS event source.

If you just need to periodically check for new files, you could use a CronWorkflow to perform the check and conditionally perform the rest of the workflow based on whether there's anything to download.
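A sketch of what such a CronWorkflow could look like, following the argoproj.io/v1alpha1 schema; the schedule, image, and names are placeholders, so verify the field layout against the Argo Workflows documentation for your version:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: CronWorkflow
metadata:
  name: check-for-new-files
spec:
  schedule: "*/5 * * * *"   # run the check every five minutes
  workflowSpec:
    entrypoint: check
    templates:
    - name: check
      container:
        # Hypothetical image that polls for new files and exits non-zero
        # (or emits an output) when there is nothing to process.
        image: file-checker:latest
```

Later steps in the workflow can then be made conditional on the check step's result using `when` expressions.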