
I'm new to Celery and confused about how Celery, with RabbitMQ as the message broker, handles task dependencies when a client (IPython) submits a DAG using Celery Canvas (workflow) primitives such as chain, chord and group.

Let's say I have a diamond DAG, A --> (B, C) --> D, where A runs first, B and C then run in parallel, and the outputs of B and C are used as input to D.

I was able to express this DAG as:

wf = (A.s(2, 2) | group(B.s(), C.s(25)) | D.s(1000)).delay()
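To make the data flow concrete, here is a plain-Python sketch of how arguments travel through that workflow. The task bodies below are made up for illustration; only the argument-passing rules mirror Celery's chain/group/chord semantics (the parent's result is prepended to each signature's partial args).

```python
# Hypothetical task bodies -- only the wiring mirrors the Canvas workflow above.

def A(x, y):                  # A.s(2, 2)
    return x + y

def B(a_result):              # B.s() -- receives A's result as its only argument
    return a_result * 2

def C(a_result, n):           # C.s(25) -- A's result is prepended to the partial arg 25
    return a_result + n

def D(group_results, scale):  # D.s(1000) -- gets the [B, C] results list first
    return sum(group_results) * scale

a = A(2, 2)              # step 1: A runs first
b, c = B(a), C(a, 25)    # step 2: B and C may run in parallel, both fed A's result
d = D([b, c], 1000)      # step 3: D runs only once both results exist
```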

My understanding is that:

1) the client submits a DAG,
2) Celery converts this DAG into messages (usually JSON),
3) Celery sends these messages to the message broker,
4) the broker has some idea about the dependencies (not sure?) and, based on them, puts the tasks in the queue,
5) workers subscribed to the queue pick up the tasks and execute them.

I am confused about who is in charge of making sure that workers execute the tasks according to the dependencies. Do Celery workers know about the dependencies? Does the broker manage some kind of metadata?

Any inputs here will be appreciated. Thanks!


1 Answer


Celery actually uses Kombu under the hood as its message-transport mechanism, and Celery has a few more dependencies that each do a different job, so that Celery itself can stay a high-level framework and end users/developers don't have to be aware of the many low-level details. You just use Celery to define and run tasks, and under the hood specific libraries such as Kombu, billiard etc. do their jobs. It's worth checking the source code of Celery to get more insight.
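As a rough illustration of that division of labor (this is a toy sketch, not Celery's or Kombu's real code): the broker is essentially a dumb queue that stores opaque messages and knows nothing about dependencies, while each worker reads the rest of the chain out of the message itself and publishes the next step after finishing its task.

```python
# Toy sketch of the layering: a "broker" that only stores opaque messages,
# and a "worker" that drives dependencies by reading the remaining chain
# embedded in each message. Task names and message fields are hypothetical.
import json
from collections import deque

class Broker:
    """Stands in for RabbitMQ: queues opaque messages, knows no dependencies."""
    def __init__(self):
        self.queue = deque()
    def publish(self, message):
        self.queue.append(json.dumps(message))   # serialized, as on the wire
    def consume(self):
        return json.loads(self.queue.popleft()) if self.queue else None

# Hypothetical task registry, standing in for registered Celery tasks.
TASKS = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}

def worker_loop(broker):
    """Stands in for a worker: run a task, then schedule its successor."""
    results = []
    while (msg := broker.consume()) is not None:
        result = TASKS[msg["task"]](*msg["args"])
        results.append(result)
        chain = msg.get("chain", [])
        if chain:
            nxt = chain[0]
            # The previous result is prepended to the next task's partial args,
            # and the rest of the chain rides along inside the new message.
            broker.publish({"task": nxt["task"],
                            "args": [result] + nxt["args"],
                            "chain": chain[1:]})
    return results

broker = Broker()
# Roughly (add.s(2, 2) | mul.s(10)): the whole remaining chain is in the message.
broker.publish({"task": "add", "args": [2, 2],
                "chain": [{"task": "mul", "args": [10]}]})
print(worker_loop(broker))  # last result is (2 + 2) * 10
```

The point of the sketch is only the shape of the responsibility split: the broker moves messages, and the dependency bookkeeping travels with the messages and is acted on by the workers.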