
Mostly for educational purposes, I'm trying to write a task scheduler (a task is an open_port({spawn_executable, Command})).

I ended up with a tree like this:

        supervisor
        |        |
scheduler        receiver
gen_event        gen_event
                     |
                supervisor
                     |
                dispatcher
                gen_server
                     |
                supervisor
                |    |   |
             task1  ... taskN

In other words:

  1. top supervisor starts scheduler and receiver and makes sure they will be alive
  2. receiver starts middle supervisor
  3. middle supervisor starts dispatcher and makes sure it will be alive
  4. dispatcher starts bottom supervisor
  5. bottom supervisor starts tasks upon request and makes sure they are restarted in case of error

  6. at any time scheduler is ready to accept a task with a timestamp it should be executed at

  7. when the timestamp is reached, the scheduler notifies some event manager
  8. the receiver is then notified by the same event manager and passes the message to the dispatcher through the middle supervisor
  9. the dispatcher holds some business logic, which is why it is not stateless; for example, some kinds of tasks cannot be executed simultaneously
  10. when all conditions are met, the dispatcher passes the task to the bottom supervisor, which makes sure the task is executed until it exits normally or some threshold is exceeded
  11. the bottom supervisor returns a message, which is then passed all the way up to some event manager
  12. the scheduler eventually receives this message and removes the task from its queue, re-enqueues it, or does something else

The questions are:

  1. Am I using behaviours right?
  2. Isn't the structure too complicated? (However, in future the system is going to become distributed.)
  3. Is there a way to combine receiver+middle supervisor and dispatcher+bottom supervisor into two modules instead of four, each implementing two behaviours at the same time?
  4. Or is there a way to combine receiver+dispatcher+bottom supervisor into one module, eliminating the need for the middle supervisor and implementing the gen_event+gen_server+supervisor behaviours at the same time?
  5. Am I mistaken in thinking of behaviours as interfaces or multiple inheritance in OO languages? (That is what makes me ask questions 3 and 4.)

Thanks in advance.

P.S. IMO, on the one hand, the structure is too complicated; on the other hand, such a structure lets me make any of its blocks distributed (for example, many schedulers to one receiver, one scheduler to many receivers, many schedulers to many receivers, many dispatchers for each receiver, and even many bottom supervisors for each dispatcher, every layer with its own supervision policy). Where is the balance point between complexity and extensibility?

To me the role of the scheduler is not clear. What do you need an extra scheduler for? - Peer Stritzinger
Consider a cron replacement that runs tasks more precisely: more often than once a minute; with business logic during/after running (i.e. after successfully running some task, generate other task(s)); with custom error handling (restart/cancel/ignore/etc. on error); distributed. - trytrytry
If you want an OTP supervision tree, it has to be a tree of supervisors, so never put a supervisor under anything other than a supervisor. - Hynek -Pichi- Vychodil

1 Answer


I would suggest simplifying your design to something more like this:

        supervisor
        |        |
 dispatcher      |
 +scheduler      |
                 |
            supervisor
            |    |   |
         task1  ... taskN

There is not much gain from having a separate scheduler sending events to a dispatcher which then starts tasks, even in the light of distribution.

The dispatcher/scheduler can probably be implemented quite simply as a gen_server with the help of the timer module. A timer can either send messages, which you process in the handle_info callback, or call API functions of your gen_server.
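As a rough sketch of the timer-plus-handle_info variant (the module name `timer_sched_sketch`, the `run_after/2` and `cancel/1` API, and representing a task as a fun/0 are all my own assumptions, not part of your code):

```erlang
-module(timer_sched_sketch).
-behaviour(gen_server).

-export([start_link/0, run_after/2, cancel/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Hypothetical API: run Task (a fun/0) after DelayMs milliseconds.
run_after(DelayMs, Task) when is_function(Task, 0) ->
    gen_server:call(?MODULE, {run_after, DelayMs, Task}).

%% Hypothetical API: cancel a task that has not fired yet.
cancel(Id) ->
    gen_server:call(?MODULE, {cancel, Id}).

init([]) ->
    {ok, #{}}.                       % state: Id => timer reference

handle_call({run_after, DelayMs, Task}, _From, Timers) ->
    Id = make_ref(),
    %% timer:send_after/2 delivers the message to the calling process,
    %% which here is the gen_server itself.
    {ok, TRef} = timer:send_after(DelayMs, {due, Id, Task}),
    {reply, {ok, Id}, Timers#{Id => TRef}};
handle_call({cancel, Id}, _From, Timers) ->
    case maps:take(Id, Timers) of
        {TRef, Rest} ->
            _ = timer:cancel(TRef),
            {reply, ok, Rest};
        error ->
            {reply, {error, not_found}, Timers}
    end.

handle_cast(_Msg, Timers) ->
    {noreply, Timers}.

handle_info({due, Id, Task}, Timers) ->
    %% Ignore messages from timers that were cancelled after firing.
    case maps:is_key(Id, Timers) of
        true ->
            spawn(Task),             % stand-in for handing the task to a supervisor
            {noreply, maps:remove(Id, Timers)};
        false ->
            {noreply, Timers}
    end;
handle_info(_Other, Timers) ->
    {noreply, Timers}.
```

The bookkeeping of timer references is the price of this variant: every pending task has its own timer that has to be tracked if you ever want to cancel it.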

You could also use the gen_server timeout functionality to wake the server up after the next interval. That would be even simpler, since you don't have to worry about canceling timers when you add a new "task".
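A minimal sketch of the timeout variant, assuming tasks are fun/0 values and due times are expressed on the `erlang:monotonic_time(millisecond)` scale (the module name `sched_sketch` and the `schedule/2` API are made up for illustration). Note that any incoming message resets the gen_server timeout, so every callback recomputes it from the head of the queue:

```erlang
-module(sched_sketch).
-behaviour(gen_server).

-export([start_link/0, schedule/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Hypothetical API: run Fun no earlier than At, where At is on the
%% erlang:monotonic_time(millisecond) scale.
schedule(At, Fun) when is_integer(At), is_function(Fun, 0) ->
    gen_server:call(?MODULE, {schedule, At, Fun}).

init([]) ->
    {ok, [], infinity}.              % state: [{At, Fun}] sorted by At

handle_call({schedule, At, Fun}, _From, Queue) ->
    Queue1 = lists:keymerge(1, Queue, [{At, Fun}]),
    {reply, ok, Queue1, next_timeout(Queue1)}.

handle_cast(_Msg, Queue) ->
    {noreply, Queue, next_timeout(Queue)}.

handle_info(timeout, Queue) ->
    Now = erlang:monotonic_time(millisecond),
    {Due, Later} = lists:splitwith(fun({At, _}) -> At =< Now end, Queue),
    %% spawn/1 is a stand-in for handing the task to a supervisor.
    lists:foreach(fun({_, Fun}) -> spawn(Fun) end, Due),
    {noreply, Later, next_timeout(Later)};
handle_info(_Other, Queue) ->
    {noreply, Queue, next_timeout(Queue)}.

%% Milliseconds until the earliest task is due, or infinity if idle.
next_timeout([]) ->
    infinity;
next_timeout([{At, _} | _]) ->
    max(0, At - erlang:monotonic_time(millisecond)).
```

There are no timer references to keep or cancel here: adding a task only changes the queue, and the next timeout is derived from it.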

The dispatcher/scheduler then calls supervisor:start_child to add working tasks.
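The second-level supervisor could be sketched roughly like this, assuming a simple_one_for_one strategy and one small worker process per port (the module name `task_sup_sketch`, the restart limits, and treating exit status 0 as "normal" are my assumptions):

```erlang
-module(task_sup_sketch).
-behaviour(supervisor).

-export([start_link/0, start_task/2, start_task_link/2]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Called by the dispatcher/scheduler. Command must be the full path
%% of an executable, Args a list of strings.
start_task(Command, Args) ->
    supervisor:start_child(?MODULE, [Command, Args]).

init([]) ->
    %% Assumed policy: give up after more than 5 restarts in 10 seconds.
    SupFlags = #{strategy => simple_one_for_one, intensity => 5, period => 10},
    ChildSpec = #{id => task,
                  start => {?MODULE, start_task_link, []},
                  restart => transient},   % restart only on abnormal exit
    {ok, {SupFlags, [ChildSpec]}}.

%% Start function for one task worker; must return {ok, Pid}.
start_task_link(Command, Args) ->
    Pid = proc_lib:spawn_link(fun() -> run(Command, Args) end),
    {ok, Pid}.

run(Command, Args) ->
    Port = open_port({spawn_executable, Command},
                     [{args, Args}, exit_status, binary]),
    wait(Port).

wait(Port) ->
    receive
        {Port, {data, _Output}} ->
            wait(Port);                            % output is discarded here
        {Port, {exit_status, 0}} ->
            ok;                                    % worker exits normally
        {Port, {exit_status, Status}} ->
            exit({exit_status, Status})            % abnormal exit, gets restarted
    end.
```

One thing to keep in mind: when the restart intensity is exceeded, the supervisor itself terminates together with all its children, so a per-task retry limit would need extra logic in the dispatcher.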

Distribution can be added easily: the dispatcher/scheduler can run on a different node from the second-level supervisor. The task-starting function can distribute further, perhaps using the pool module for load balancing.
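To illustrate the cross-node call, here is a small sketch with two interchangeable variants. It assumes the task supervisor is registered locally under some name on the remote node; `remote_start_sketch`, `SupName`, and `ChildArgs` are hypothetical names. For a simple_one_for_one supervisor `ChildArgs` is a list of extra start arguments, otherwise it would be a child specification:

```erlang
-module(remote_start_sketch).
-export([start_task_on/3, start_task_via_rpc/3]).

%% Variant 1: supervisor references may be {RegisteredName, Node},
%% so the call goes straight to the supervisor on the other node.
start_task_on(Node, SupName, ChildArgs) ->
    supervisor:start_child({SupName, Node}, ChildArgs).

%% Variant 2: run the start_child call on the remote node via rpc.
%% Failures come back as {badrpc, Reason} instead of an exit.
start_task_via_rpc(Node, SupName, ChildArgs) ->
    rpc:call(Node, supervisor, start_child, [SupName, ChildArgs]).
```

Both variants also work with `node()` as the target, which makes it easy to start on a single node and move the supervisor later.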

To answer your five questions:

  1. I suspect you are using gen_event where it is not needed, but since the modules themselves are not needed, that is easily fixed by removing them. gen_event is for when you want to register many handlers on one event source; you are using it 1:1. Supervision trees are usually built with supervisors as direct children of other supervisors.

  2. Yes, it's too complicated; it looks a bit like what you would do in OO languages with less expressive power. Preparing for possible future distribution does not require it. Refactoring in a functional language like Erlang is much easier than you probably think, so start simple and split functionality when you see the need.

  3+4. See my altogether different suggestion.

  5. It's not very OO-like. Behaviours in OTP are only callback modules; the process mechanics are hidden in a generic module.
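To make the "generic module plus callback module" idea concrete, here is a toy behaviour of my own (it is not an OTP module; `gen_counter_sketch` and its callbacks are invented for illustration). The generic part owns the process and the message loop; the callback part only supplies plain functions. Both parts live in one module here purely to keep the sketch self-contained; normally the callbacks would sit in their own module declaring `-behaviour(gen_counter_sketch).`:

```erlang
-module(gen_counter_sketch).

-export([start/2, bump/1]).          % generic part
-export([init/1, handle_bump/1]).    % example callback implementation

%% The "interface" a callback module has to provide.
-callback init(Arg :: term()) -> State :: term().
-callback handle_bump(State :: term()) -> {Reply :: term(), NewState :: term()}.

%% Generic part: spawning, message passing, and the receive loop.
start(Mod, Arg) ->
    spawn(fun() -> loop(Mod, Mod:init(Arg)) end).

bump(Pid) ->
    Ref = make_ref(),
    Pid ! {bump, self(), Ref},
    receive
        {Ref, Reply} -> Reply
    after 5000 ->
        exit(timeout)
    end.

loop(Mod, State) ->
    receive
        {bump, From, Ref} ->
            {Reply, NewState} = Mod:handle_bump(State),
            From ! {Ref, Reply},
            loop(Mod, NewState)
    end.

%% Callback part: no process handling at all, just pure functions.
init(Start) ->
    Start.

handle_bump(N) ->
    {N + 1, N + 1}.
```

Usage would be `Pid = gen_counter_sketch:start(gen_counter_sketch, 0)` followed by `gen_counter_sketch:bump(Pid)`. Since one process runs one such loop, a single process cannot "inherit" from gen_event, gen_server, and supervisor at once the way a class can implement several interfaces.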

Even with the simple structure I suggested there is plenty of flexibility (brought to you by Erlang): if you want multiple schedulers, you can just use rpc to call the supervisor. You can use pool to automatically load-balance the distribution of tasks. The dispatcher part can also be easily separated from the scheduler (both then sit under the top-level supervisor); then you can keep more of the common state separate from the scheduler.