14
votes

There is a pattern I've seen occasionally where the init/1 function of a gen_server process will send a message to itself signalling that it should be initialized. The purpose of this is for the gen_server process to initialize itself asynchronously so that the process spawning it doesn't have to wait. Here is an example:

-module(test).
-compile(export_all).

init([]) ->
    gen_server:cast(self(), init),
    {ok, {}}.

handle_cast(init, {}) ->
    io:format("initializing~n"),
    {noreply, lists:sum(lists:seq(1,10000000))};
handle_cast(m, X) when is_integer(X) ->
    io:format("got m. X: ~p~n", [X]),
    {noreply, X}.

b() ->
    receive P -> {} end,
    gen_server:cast(P, m),
    b().

test() ->
    B = spawn(fun test:b/0),
    {ok, A} = gen_server:start_link(test,[],[]),
    B ! A.

The process assumes that the init message will be received before any other message - otherwise it will crash. Is it possible for this process to get the m message before the init message?


Let's assume there's no process sending messages to random pids generated by list_to_pid, since any application doing this will probably not work at all, regardless of the answer to this question.

5
See also this question. (legoscia)
I can't tell if any of these answers are correct because they all seem to make the assumption that this question is true: stackoverflow.com/q/18018780/2213023 (Dog)

5 Answers

5
votes

The theoretical answer to "is it possible for this process to get a message before the init message?" is YES. In practice (as long as no process is calling list_to_pid and sending messages to the result) the answer is NO, provided the gen_server is not a registered process.

This is because gen_server:start_link only returns after the init callback has run. The init message is therefore already in the process's message queue before any other process can obtain the Pid and send to it. Your process is safe: it will not receive any other message before init.

The same does not hold for a registered process, because another process may send to the gen_server by its registered name before the init callback has finished. Consider this test function.

test() ->
    Times = lists:seq(1,1000),
    spawn(gen_server, start_link,[{local, ?MODULE}, ?MODULE, [], []]),
    [gen_server:cast(?MODULE, No) || No <-Times].

The sample output is

1> async_init:test().
Received:356
Received:357
[ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,
 ok,ok,ok,ok,ok,ok,ok,ok,ok,ok|...]
Received:358
Received:359
2> Received:360
2> Received:361
...
2> Received:384
2> Received:385
2> Initializing
2> Received:386
2> Received:387
2> Received:388
2> Received:389 
...

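You can see that the gen_server received messages 356 to 385 before the init message. Casts 1 to 355 do not appear at all, presumably because they were sent before the name was registered. So the asynchronous init pattern does not work when the server is started under a registered name.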

This can be solved in two ways.

1. Register the process after the Pid is returned.

start_link_reg() ->
    {ok, Pid} = gen_server:start_link(?MODULE, [], []),
    register(?MODULE, Pid),
    {ok, Pid}.

2. Or register the process in the handle_cast clause for the init message.

handle_cast(init, State) ->
    register(?MODULE, self()),
    io:format("Initializing~n"),
    {noreply, State};

The sample output after this change is

1> async_init:test().
Initializing
Received:918
Received:919
[ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,ok,
 ok,ok,ok,ok,ok,ok,ok,ok,ok,ok|...]
Received:920
2> Received:921
2> Received:922
...

So sending a message to itself from init does not by itself guarantee that this message is the first one the process receives, but with small changes to the code (and design) you can ensure that it is the first one handled.
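For what it's worth, on OTP 21 or later there is another option that the text above does not cover: init/1 can return {ok, State, {continue, Term}}, and gen_server then calls handle_continue/2 before it takes anything from the mailbox, whether the process is registered or not. A minimal sketch, where the module name async_init_continue, the value/0 helper and the placeholder workload are my own assumptions:

```erlang
-module(async_init_continue).
-behaviour(gen_server).

-export([start_link/0, value/0]).
-export([init/1, handle_continue/2, handle_call/3, handle_cast/2]).

%% Registered start: clients may cast by name as soon as the name exists.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Hypothetical helper that reads the state back.
value() ->
    gen_server:call(?MODULE, value).

%% Return quickly so start_link/0 does not block on the slow part.
init([]) ->
    {ok, uninitialized, {continue, init}}.

%% Runs right after init/1, before any queued message is handled.
handle_continue(init, uninitialized) ->
    {noreply, lists:sum(lists:seq(1, 1000))}.

handle_call(value, _From, State) ->
    {reply, State, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Messages that arrive early simply wait in the mailbox until handle_continue/2 has returned, so there is no need to delay registration.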

2
votes

In this particular case, you would be safe in the assumption that the 'init' message will be received before 'm'. In general (and especially if you register your process) this is not true though.

If you want to be 100% safe knowing that your init code will run first you can do something like:

start_link(Args) ->
    gen_server:start_link(test, [self() | Args], []).

init([Parent | Args]) ->
    %% Synchronous start-up work goes here (State = Args is a placeholder).
    State = Args,
    %% Unblock the caller of start_link/1 early.
    proc_lib:init_ack(Parent, {ok, self()}),
    %% Slow, asynchronous initialization goes here.
    io:format("initializing~n"),
    {ok, State}.

I didn't test this, so I don't know if the "bonus" init_ack will print an ugly message to the terminal or not. If it does, the code has to be expanded slightly, but the general idea still stands. Let me know and I'll update my answer.
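One way to sidestep the question of the second init_ack is to own the start-up handshake yourself: spawn with proc_lib:start_link/3, acknowledge once with proc_lib:init_ack/2, finish the slow work, and only then hand control to gen_server:enter_loop/3. This is a sketch under my own assumptions (module name async_init_loop, value/1 helper, placeholder workload, unregistered server), not a tested replacement for the code above:

```erlang
-module(async_init_loop).
-behaviour(gen_server).

-export([start_link/0, value/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% proc_lib:start_link/3 blocks until the new process calls init_ack/2.
start_link() ->
    proc_lib:start_link(?MODULE, init, [self()]).

%% Hypothetical helper that reads the state back.
value(Pid) ->
    gen_server:call(Pid, value).

%% Called by proc_lib, not by gen_server, so there is only one ack.
init(Parent) ->
    %% Synchronous start-up work would go here.
    proc_lib:init_ack(Parent, {ok, self()}),
    %% Slow part: the caller already has the pid, and anything it sends
    %% now just waits in the mailbox until enter_loop/3 starts reading.
    State = lists:sum(lists:seq(1, 1000000)),
    gen_server:enter_loop(?MODULE, [], State).

handle_call(value, _From, State) ->
    {reply, State, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Because no gen_server callback runs before enter_loop/3, a client can never observe the uninitialized state.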

1
votes

Your sample code is safe and m is always received after init.

However, from a theoretical point of view, if the init/1 handler of a gen_server sends a message to itself, using gen_server:cast/2 or the send primitive, that message is not guaranteed to be the first one in the mailbox.

There is no way to guarantee this simply because init/1 is executed within the process of the gen_server, therefore after the process was created and allocated a pid and a mailbox. In non-SMP mode, the scheduler can schedule out the process under some load before the init function is called or before the message is sent, since calling a function (such as gen_server:cast/2 or the init handler for that matter) generates a reduction and the BEAM emulator tests whether it's time to give some time to other processes. In SMP mode, you can have another scheduler that will run some code sending a message to your process.

What distinguishes theory from practice is the way to find out about the existence of the process (in order to send it a message before the init message). Code could use links from the supervisor, the registered name, the list of processes returned by erlang:processes() or even call list_to_pid/1 with random values or unserialize pids with binary_to_term/1. Your node might even get a message from another node with a serialized pid, especially considering that creation number wraps around after 3 (see your other question Wrong process getting killed on other node?).

This is unlikely in practice. As a result, from a practical point of view, every time this pattern is used, the code can be designed to ensure init message is received first and the server is initialized before it receives other messages.

If the gen_server is a registered process, you would start it from a supervisor and ensure that all clients are started afterward in the supervision tree or introduce some kind of (probably inferior) synchronization mechanism. This is required even if you do not use this pattern of asynchronous initialization (otherwise clients could not reach the server). Of course, you might still have issues in case of crashes and restarts of this gen_server, but this is true whatever the scenario, and you can only be saved by a carefully crafted supervision tree.

If the gen_server is not registered or referred to by name, clients will eventually pass the pid to gen_server:call/2,3 or gen_server:cast/2, and they can only obtain that pid through the supervisor, which calls gen_server:start_link/3. gen_server:start_link/3 only returns once init/1 has returned, and therefore after the init message was enqueued. This is exactly what your code above does.

0
votes

gen_server uses proc_lib:init_ack to make sure that the process is properly started before returning the pid from start_link. So the message sent in init will be the first message.

0
votes

This is not 100% safe! In gen.erl, lines 117-129, we can see this:

init_it(GenMod, Starter, Parent, Mod, Args, Options) ->
    init_it2(GenMod, Starter, Parent, self(), Mod, Args, Options).

init_it(GenMod, Starter, Parent, Name, Mod, Args, Options) ->
    case name_register(Name) of
        true ->
            init_it2(GenMod, Starter, Parent, Name, Mod, Args, Options);
        {false, Pid} ->
            proc_lib:init_ack(Starter, {error, {already_started, Pid}})
    end.

init_it2(GenMod, Starter, Parent, Name, Mod, Args, Options) ->
    GenMod:init_it(Starter, Parent, Name, Mod, Args, Options).

In init_it/7 the process registers its Name first, and then init_it2/7 calls GenMod:init_it/6, which in turn calls your init/1 function.

Before gen_server:start_link returns, it is hard to guess the new process id. However, if you send a message to the server by its registered Name and that message arrives before your gen_server:cast is called, your code will break.

Daniel's solution may be right, but I'm not quite sure whether calling proc_lib:init_ack twice will cause an error or not. In any case, the parent should not be left with an unexpected message in its mailbox. >_<

Here is another solution. Keep a flag in your gen_server state that marks whether the server is initialized. When you receive m, check the flag; if the server is not initialized yet, gen_server:cast m to yourself again.

This solution is a bit cumbersome, but I'm sure it is right. =_=
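A rough sketch of that flag idea, assuming a state of the form {Initialized, Seen}; the module name async_init_flag and the seen/1 helper are made up for illustration:

```erlang
-module(async_init_flag).
-behaviour(gen_server).

-export([start_link/0, seen/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Hypothetical helper: returns the messages handled after initialization.
seen(Pid) ->
    gen_server:call(Pid, seen).

init([]) ->
    gen_server:cast(self(), init),
    {ok, {false, []}}.

%% The init message flips the flag.
handle_cast(init, {false, Seen}) ->
    {noreply, {true, Seen}};
%% Anything that arrives too early goes back to the end of our own queue.
handle_cast(Msg, {false, _Seen} = State) ->
    gen_server:cast(self(), Msg),
    {noreply, State};
%% Normal operation once initialized.
handle_cast(Msg, {true, Seen}) ->
    {noreply, {true, [Msg | Seen]}}.

handle_call(seen, _From, {_Init, Seen} = State) ->
    {reply, Seen, State}.
```

Note that a re-queued message can end up behind messages that were sent after it, so this only fits protocols that do not depend on strict ordering.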

I'm a freshman here, how I wish I could add a comment. >"<