2 votes

Imagine we have the following problem:

  1. We have HTTP clients that send requests to our software. So we have one process that is always available to them and stores their requests in a queue.
  2. We need to dispatch these requests to a machine that is in our internal network (again via HTTP).
  3. Such a machine is not always available. It is started (and stopped when the queue is empty) on demand by our software (again, an HTTP request to a "manager" machine).
  4. We have several (or lots) of the above.

So basically, we have one logical entity that, for the sake of argument, we will call a "job queue". Every job queue consists of several (heterogeneous) processes: one that implements the actual queue and is always available (doesn't block); one that manages a worker machine; several workers (spawned on demand) that take entries off the queue, try to send them to the worker machine, work around errors, maybe return unsuccessful attempts to the queue to be retried, etc.; and maybe also a "manager" process that coordinates the work of the above. And we have lots of "job queues", which all consist of lots of processes.
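To make the shape concrete, here is a rough sketch of the always-available queue part as a gen_server (module and function names like jq_queue are made up purely for illustration):

    %% Rough sketch of the always-available queue process.
    %% Module and function names are made up for illustration only.
    -module(jq_queue).
    -behaviour(gen_server).

    -export([start_link/1, enqueue/2, dequeue/1]).
    -export([init/1, handle_call/3, handle_cast/2]).

    start_link(Name) ->
        gen_server:start_link({local, Name}, ?MODULE, [], []).

    %% Called from the HTTP handler; never blocks on the worker machine.
    enqueue(Name, Request) ->
        gen_server:cast(Name, {enqueue, Request}).

    %% Called by a worker process when it wants the next job.
    dequeue(Name) ->
        gen_server:call(Name, dequeue).

    init([]) ->
        {ok, queue:new()}.

    handle_cast({enqueue, Request}, Q) ->
        {noreply, queue:in(Request, Q)}.

    handle_call(dequeue, _From, Q) ->
        case queue:out(Q) of
            {{value, Request}, Q2} -> {reply, {ok, Request}, Q2};
            {empty, Q2}            -> {reply, empty, Q2}
        end.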

NOTE: this may not be the perfect solution to this exact problem, but let's assume that it is. My question is not about how to solve the problem, but about how to manage such "groups" of processes that represent logical entities.

So, how do you represent this in OTP? How many supervision trees do you have? Do you share supervisors between "job queue" entities, or do you have a supervisor per logical entity? And how do you manage the whole thing?

I have a guess, but this is quite a tricky problem (I have already tried implementing it in several different ways), so I won't share my (maybe not so bad) idea for now.


2 Answers

1 vote

I would use a dedicated supervisor for each logical component (by "logical" I guess you mean: http-workers, manager, dispatchers). Each of those classes would get its own supervisor. I like this approach because I can benefit from the tools that come with supervisors (count_children, seeing the tree in i(), etc.), and it separates the system nicely.
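For example (just a sketch, with placeholder module names), a per-"job queue" supervisor that owns one child, or one sub-supervisor, per class could look like this:

    %% Sketch: one supervisor per "job queue" entity, with one child
    %% (or one sub-supervisor) per logical class of process.
    %% All module names below are placeholders.
    -module(job_queue_sup).
    -behaviour(supervisor).

    -export([start_link/1, init/1]).

    start_link(QueueName) ->
        supervisor:start_link(?MODULE, [QueueName]).

    init([QueueName]) ->
        {ok, {{one_for_one, 5, 10},
              [{queue,
                {jq_queue, start_link, [QueueName]},
                permanent, 5000, worker, [jq_queue]},
               {machine_manager,
                {jq_machine_manager, start_link, [QueueName]},
                permanent, 5000, worker, [jq_machine_manager]},
               %% sub-supervisor (e.g. simple_one_for_one) for on-demand workers
               {worker_sup,
                {jq_worker_sup, start_link, [QueueName]},
                permanent, infinity, supervisor, [jq_worker_sup]}]}}.

Calling supervisor:count_children/1 on such a supervisor, or looking at the tree in i(), then gives you the per-entity view I mentioned.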

Gproc, mentioned by @MinimeDJ, and the sync/async stuff are a completely different matter.

I don't think it is the best architecture if the system you described needs gproc. Redesign it to have as many stateless layers as possible. E.g. instead of maintaining dispatchers (a push model), try a pull model: pull tasks from the back-end machine. This makes the queues stateless, you get rid of the dispatchers, and if anything goes wrong the back-end layer simply puts the task back into some queue. Moreover, the managers are reduced to an API to the queues plus some stats collectors. The load of the back-end workers is measured and controlled (locally!) in each of those heterogeneous back-end modules.
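A pull-model back-end worker could then be as simple as the loop below (jq_queue:dequeue/1, jq_queue:enqueue/2 and handle/1 are assumed placeholder functions, not a real API):

    %% Sketch of a pull-model back-end worker: it asks the queue for work
    %% instead of being pushed to, so the queue stays stateless about workers.
    %% jq_queue:dequeue/1, jq_queue:enqueue/2 and handle/1 are placeholders.
    -module(jq_pull_worker).
    -export([loop/1]).

    loop(QueueName) ->
        case jq_queue:dequeue(QueueName) of
            {ok, Task} ->
                try
                    handle(Task)                          %% HTTP call to the worker machine
                catch
                    _:_ ->
                        jq_queue:enqueue(QueueName, Task) %% put it back on failure
                end,
                loop(QueueName);
            empty ->
                timer:sleep(1000),                        %% back off when the queue is empty
                loop(QueueName)
        end.

    handle(_Task) ->
        ok.  %% placeholder for the real HTTP request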

0 votes

To start from a very high level: we also have a system that consists of many special blocks, and our first architecture was something similar to yours. Instead of HTTP we used RabbitMQ, which I believe is much more convenient in terms of message exchange.

But before the final release we realized that it would be a real challenge to maintain the whole system in production.

So, we redesigned it again. Now we represent each logical block as a gen_server process. Each process has a unique name and is registered in gproc. Since gproc can span many nodes, we have a fault-tolerant system that is very easy to manage.
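Roughly, each block is started like this (the module name and the gproc key {job_queue, Id} are just examples):

    %% Sketch: a logical block as a gen_server registered under a unique
    %% gproc name. The key {job_queue, Id} is only an example.
    -module(mom_block).
    -behaviour(gen_server).

    -export([start_link/1, call/2]).
    -export([init/1, handle_call/3, handle_cast/2]).

    %% {n, l, Key} = unique local name; with gproc's distributed mode
    %% you can use {n, g, Key} to register across nodes.
    start_link(Id) ->
        gen_server:start_link({via, gproc, {n, l, {job_queue, Id}}},
                              ?MODULE, [Id], []).

    call(Id, Msg) ->
        gen_server:call({via, gproc, {n, l, {job_queue, Id}}}, Msg).

    init([Id]) ->
        {ok, Id}.

    handle_call(Msg, _From, State) ->
        {reply, {ok, Msg}, State}.

    handle_cast(_Msg, State) ->
        {noreply, State}.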

So, I would say that we have a Manageable Object Model (we call it MOM because we really love it).

So, to me your system seems overcomplicated. I don't know if my answer is useful at all, but sometimes it's worth thinking about your system in a way you never expected at the beginning. I hope you will find a way to manage it easily.