Imagine we have the following problem:
- We have HTTP clients that send requests to our software, so we have one process that is always available to them and stores their requests in a queue.
- We need to dispatch these requests to a machine that is in our internal network (again via HTTP).
- Such a machine is not always available. Our software starts it on demand and stops it when the queue is empty (again via an HTTP request, this time to a "manager" machine).
- We have several (or lots) of these queue-and-machine setups.
So basically, we have one logical entity that, for the sake of argument, we will call a "job queue". Every job queue consists of several heterogeneous processes: one that implements the actual queue and is always available (it doesn't block); one that manages a worker machine; and several workers, spawned on demand, that take entries off the queue, try to send them to the worker machine, work around errors, and possibly return unsuccessful attempts to the queue to be retried. There may also be a "manager" process that coordinates the work of the above. And we have lots of "job queues", each consisting of lots of processes.
NOTE: this may not be the perfect solution to this exact problem, but let's assume that it is. My question is not about how to solve the problem, but how to manage such "groups" of processes that represent logical entities.
So, how do you represent this in OTP? How many supervision trees do you have? Do you share supervisors between "job queue" entities, or do you have one supervisor per logical entity? And how do you manage the whole thing?
I have a guess, but this is quite a tricky problem (I have already tried implementing it in several different ways), so I won't share my (maybe not so bad) idea for now.