15
votes

I've got a workflow that involves waking up every 30 seconds or so and polling a database for updates, taking action on that, then going back to sleep. Setting aside that database polling doesn't scale and other similar concerns, what is the best way to structure this workflow using Supervisors, workers, Tasks, and so forth?

I'll lay out a few ideas I've had and my thoughts for/against. Please help me figure out the most Elixir-y approach. (I'm still very new to Elixir, btw.)

1. Infinite Loop Through Function Call

Just put a simple recursive loop in there, like so:

def do_work() do
  # Check database
  # Do something with result
  # Sleep for a while
  do_work()
end

I saw something similar when following along with a tutorial on building a web crawler.

One concern I have here is infinite stack depth due to recursion. Won't this eventually cause a stack overflow since we're recursing at the end of each loop? This structure is used in the standard Elixir guide for Tasks, so I'm probably wrong about the stack overflow problem.

Update: As mentioned in the answers, Elixir's tail call optimization means stack overflows are not a problem here. A function that calls itself as its very last step is an accepted way to loop forever.
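A minimal sketch of approach 1, assuming a hypothetical `Poller` module and a `check` function that stands in for the database query. The recursive call is the last expression in `loop/2`, so it should run in constant stack space. `count_down/1` is only there to show that a long chain of tail calls completes normally.

```elixir
defmodule Poller do
  # Hypothetical module, not from the original post.
  # Runs `check`, sleeps, then calls itself in tail position.
  def loop(check, interval_ms) do
    check.()
    Process.sleep(interval_ms)
    loop(check, interval_ms)
  end

  # A million tail calls; the stack does not grow.
  def count_down(0), do: :done
  def count_down(n) when n > 0, do: count_down(n - 1)
end

# Usage (runs forever in its own process):
# spawn(fn -> Poller.loop(fn -> IO.puts("polling") end, 30_000) end)
```

A bare `spawn` like this is not supervised, so nothing restarts the loop if `check` raises.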

2. Use a Task, Restart Each Time

The basic idea here is to use a Task that runs once and then exits, but pair it with a Supervisor using the :one_for_one restart strategy, so it gets restarted each time it completes. The Task checks the database, sleeps, then exits. The Supervisor sees the exit and starts a new one.

This has the benefit of living inside a Supervisor, but it seems like an abuse of the Supervisor. It's being used for looping in addition to error trapping and restarting.

(Note: There's probably something else that can be done with Task.Supervisor, as opposed to the regular Supervisor and I'm just not understanding it.)

3. Task + Infinite Recursion Loop

Basically, combine 1 and 2 so it's a Task that uses an infinite recursion loop. Now it's managed by a Supervisor and will restart if crashed, but doesn't restart over and over as a normal part of the workflow. This is currently my favorite approach.
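A rough sketch of approach 3, assuming a hypothetical `PollLoop` module and a placeholder `work` function. One detail matters here: a `Task` child defaults to `restart: :temporary`, so the supervisor will not restart it unless the child spec is overridden, which `Supervisor.child_spec/2` does below.

```elixir
defmodule PollLoop do
  # Hypothetical worker: `work` stands in for the database check.
  def run(work, interval_ms) do
    work.()
    Process.sleep(interval_ms)
    run(work, interval_ms)
  end
end

work = fn -> IO.puts("polling...") end

children = [
  # Override the Task default (:temporary) so a crashed loop is restarted.
  Supervisor.child_spec({Task, fn -> PollLoop.run(work, 30_000) end}, restart: :permanent)
]

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)
```

With this shape the supervisor only steps in after a crash; the looping itself stays inside the Task.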

4. Other?

My concern is that there are some fundamental OTP structures that I'm missing. For instance, I am familiar with Agent and GenServer, but I just recently stumbled onto Task. Maybe there's some kind of Looper for exactly this case, or some use case of Task.Supervisor that covers it.

5 Answers

18
votes

I'm a little late here, but for those still searching for the right way to do it, I think it is worth quoting the GenServer documentation itself:

handle_info/2 can be used in many situations, such as handling monitor DOWN messages sent by Process.monitor/1. Another use case for handle_info/2 is to perform periodic work, with the help of Process.send_after/4:

defmodule MyApp.Periodically do
    use GenServer

    def start_link do
        GenServer.start_link(__MODULE__, %{})
    end

    def init(state) do
        schedule_work() # Schedule work to be performed on start
        {:ok, state}
    end

    def handle_info(:work, state) do
        # Do the desired work here
        schedule_work() # Reschedule once more
        {:noreply, state}
    end

    defp schedule_work() do
        Process.send_after(self(), :work, 2 * 60 * 60 * 1000) # In 2 hours
    end
end
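As a hedged variation on the documentation example, the sketch below uses a 30-second default to match the question and takes `start_link/1` so that the default `child_spec/1` generated by `use GenServer` can start it under a supervisor. The module name `PeriodicPoller` and the `:interval` and `:work` options are assumptions for illustration only.

```elixir
defmodule PeriodicPoller do
  use GenServer

  # start_link/1 matches what the default child_spec/1 calls.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    state = %{
      interval: Keyword.get(opts, :interval, 30_000),
      # :work is a zero-arity function standing in for the database check.
      work: Keyword.fetch!(opts, :work)
    }

    schedule(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:poll, state) do
    state.work.()
    # Reschedule only after the work is done, so runs never overlap.
    schedule(state.interval)
    {:noreply, state}
  end

  defp schedule(interval), do: Process.send_after(self(), :poll, interval)
end

children = [
  {PeriodicPoller, interval: 30_000, work: fn -> IO.puts("checking the database") end}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)
```

Because the next message is scheduled after the work finishes, the gap between runs is the interval plus however long the work took.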
13
votes

I've only recently started using OTP, but I think I may be able to give you a few pointers:

  1. That's the Elixir way of doing this. Here is a quote from Programming Elixir by Dave Thomas, which explains it better than I can:

    The recursive greet function might have worried you a little. Every time it receives a message, it ends up calling itself. In many languages, that adds a new frame to the stack. After a large number of messages, you might run out of memory. This doesn’t happen in Elixir, as it implements tail-call optimization. If the very last thing a function does is call itself, there’s no need to make the call. Instead, the runtime can simply jump back to the start of the function. If the recursive call has arguments, then these replace the original parameters as the loop occurs.

  2. Tasks (as in the Task module) are meant for a single, short-lived piece of work, so they may be what you want here. Alternatively, why not spawn one process (maybe at startup) that owns the job, loops, and accesses the DB at a fixed interval?
  3. and 4. Maybe look into using a GenServer with the following architecture: Supervisor -> GenServer -> workers that are spawned when needed to do the task and exit when finished. Here you may just use spawn fn -> ... end; you don't really need to worry about choosing Task or another module.
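The Supervisor -> GenServer -> workers layout from point 3 might look roughly like the sketch below. It swaps the bare `spawn` for `Task.Supervisor.start_child/2` so the short-lived workers are at least tracked by a supervisor. The `Dispatcher` module and its `:interval`, `:task_sup`, and `:job` options are hypothetical names.

```elixir
defmodule Dispatcher do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    # Expects :interval (ms), :task_sup (Task.Supervisor pid), :job (0-arity fun).
    state = Map.new(opts)
    Process.send_after(self(), :tick, state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:tick, state) do
    # Each tick starts a short-lived worker that exits when the job returns.
    {:ok, _pid} = Task.Supervisor.start_child(state.task_sup, state.job)
    Process.send_after(self(), :tick, state.interval)
    {:noreply, state}
  end
end

{:ok, task_sup} = Task.Supervisor.start_link()

{:ok, _dispatcher} =
  Dispatcher.start_link(
    interval: 30_000,
    task_sup: task_sup,
    job: fn -> IO.puts("checking the database") end
  )
```

Since the job runs in its own process, a slow or crashing job does not block or take down the dispatcher, but jobs can overlap if one takes longer than the interval.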
3
votes

I think the generally accepted way to do what you're looking for is approach #1. Because Erlang and Elixir automatically optimize tail calls, you don't need to worry about stack overflow.

3
votes

There's another way, using Stream.cycle. Here's an example of a while macro:

defmodule Loop do

  defmacro while(expression, do: block) do
    quote do
      try do
        for _ <- Stream.cycle([:ok]) do
          if unquote(expression) do
            unquote(block)
          else
            throw :break
          end
        end
      catch
        :break -> :ok
      end
    end
  end
end
2
votes

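I would use a GenServer and return the following from its init function:

{:ok, <state>, <timeout_in_milliseconds>}

Setting a timeout means handle_info(:timeout, state) is called if the process receives no other message within that many milliseconds. Any incoming message cancels the pending timeout, so each callback has to return the timeout again to keep the cycle going.

I can make sure this process keeps running by adding it to my main project's supervisor.

This is an example of how it can be used: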

defmodule MyApp.PeriodicalTask do
  use GenServer

  @timeout 50_000 

  def start_link do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  @impl true
  def init(_) do
    # The third element arms the idle timeout.
    {:ok, %{}, @timeout}
  end

  @impl true
  def handle_info(:timeout, state) do
    # Do the periodic work here, then re-arm the timeout.
    {:noreply, state, @timeout}
  end
end
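One caveat with the timeout approach is that it is an idle timeout, not a fixed schedule. The sketch below tries to make that visible. The module name `IdleTimeout` and its `:timeout` and `:notify` options are assumptions for illustration.

```elixir
defmodule IdleTimeout do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts) do
    # Expects :timeout (ms) and :notify (pid told when the timeout fires).
    state = Map.new(opts)
    {:ok, state, state.timeout}
  end

  @impl true
  def handle_info(:timeout, state) do
    send(state.notify, :timed_out)
    {:noreply, state, state.timeout}
  end

  def handle_info(_other, state) do
    # Any other message cancels the pending timeout; returning it here
    # starts a fresh countdown from zero.
    {:noreply, state, state.timeout}
  end
end

{:ok, _pid} = IdleTimeout.start_link(timeout: 50_000, notify: self())
```

If the process receives regular traffic, the `:timeout` branch may never run, which is why `Process.send_after` is usually the safer choice for strictly periodic work.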