1
votes

I'm trying to implement a simple supervisor that just restarts child processes if they fail. But I don't even know how to spawn more than one process under a supervisor! I looked at simple supervisor code on this site and found this:

-module(echo_sup).
-behaviour(supervisor).
-export([start_link/0]).
-export([init/1]).

start_link() ->
    {ok, Pid} = supervisor:start_link(echo_sup, []),
    unlink(Pid).
init(_Args) ->
    {ok,  {{one_for_one, 5, 60},
       [{echo_server, {echo_server, start_link, []},
         permanent, brutal_kill, worker, [echo_server]},

        {echo_server2, {echo_server2, start_link, []},
         permanent, brutal_kill, worker, [echo_server2]}]}}.

I assumed that putting the "echo_server2" part in the init() function would spawn another process under this supervisor, but I end up getting an "exception exit:shutdown" message.

The files "echo_server" and "echo_server2" contain the same code under different module names, so I'm just confused now. Here is echo_server2:

-module(echo_server2).
-behaviour(gen_server).

-export([start_link/0]).
-export([echo/1, crash/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    {ok,Pid} = gen_server:start_link({local, echo_server2}, echo_server2, [], []),
    unlink(Pid).

%% public api
echo(Text) ->
    gen_server:call(echo_server2, {echo, Text}).
crash() ->
    gen_server:call(echo_server2, crash).

%% behaviours
init(_Args) ->
    {ok, none}.
handle_call(crash, _From, State) ->
    X=1,
    {reply, X=2, State};
handle_call({echo, Text}, _From, State) ->
    {reply, Text, State}.
handle_cast(_, State) ->
    {noreply, State}.

2 Answers

4
votes

First, you need to read some docs about the OTP gen_server and supervisor behaviours. You have a few errors in your code.

1) In the echo_sup module, change your start_link function as follows:

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

I don't know why you call unlink/1 after the process has been started.

2) In both echo_server modules, change the start_link function to:

start_link() -> 
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

You should not change the return value of this function, because the supervisor expects one of these values:

{ok,Pid} | ignore | {error,Error}
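
Putting both changes together, a corrected server might look like the sketch below. It is based on the echo_server2 module from the question, renamed to echo_server, with the crash/0 test function left out for brevity; treat it as an illustration rather than the only correct form.

```erlang
-module(echo_server).
-behaviour(gen_server).

-export([start_link/0]).
-export([echo/1]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Return the result of gen_server:start_link/4 unchanged, so the
%% supervisor receives {ok, Pid}. No unlink/1: the link must stay.
start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% public api
echo(Text) ->
    gen_server:call(?MODULE, {echo, Text}).

%% behaviours
init(_Args) ->
    {ok, none}.

handle_call({echo, Text}, _From, State) ->
    {reply, Text, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Started on its own from the shell, echo_server:start_link() should return {ok, Pid}, and echo_server:echo(hello) should return hello.
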
2
votes

You don't need two different modules just to run two instances of the same server. The conflict problem is due to the tag in the child specification, which has to be unique. It is the first element of the tuple. So you could have something like:

    [{echo_server, {echo_server, start_link, []},
      permanent, brutal_kill, worker, [echo_server]},
     {echo_server2, {echo_server, start_link, []},
      permanent, brutal_kill, worker, [echo_server]}]}}.

Why do you unlink the child processes? The supervisor uses these links to supervise its children. The error you are getting occurs because the supervisor expects the functions which start the children to return {ok,ChildPid}; this is how it gets the pids of the children. When it gets another return value, it fails the startup of the children and then gives up itself. All according to how it is supposed to work.

If you want to register both servers, you could modify the start_link function to take the name as an argument, so that you can pass it in explicitly through the child spec. So:
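
To see what the supervisor actually receives from a start function that ends in unlink/1, here is a small sketch. The module unlink_demo and its two functions are hypothetical names made up for this illustration, and a plain spawn_link stands in for gen_server:start_link/4.

```erlang
-module(unlink_demo).
-export([good_start/0, bad_start/0]).

%% Stand-in for gen_server:start_link/4: spawns a linked process and
%% returns {ok, Pid}, the shape a supervisor expects.
good_start() ->
    Pid = spawn_link(fun() -> receive stop -> ok end end),
    {ok, Pid}.

%% Mirrors the start_link/0 in the question: the last expression is
%% unlink(Pid), so the function returns true rather than {ok, Pid}.
bad_start() ->
    {ok, Pid} = good_start(),
    unlink(Pid).
```

Since unlink/1 always returns true, bad_start/0 hands the caller true, which is none of the return values a supervisor accepts from a child start function, and it has also removed the link the supervisor relies on.
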

start_link(Name) -> 
    gen_server:start_link({local, Name}, ?MODULE, [], []).

and

    [{echo_server, {echo_server, start_link, [echo_server]},
      permanent, brutal_kill, worker, [echo_server]},
     {echo_server2, {echo_server, start_link, [echo_server2]},
      permanent, brutal_kill, worker, [echo_server]}]}}.

Using the module name as the registered name for a server is just a convention which only works if you run one instance of the server.
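
For completeness, a whole supervisor module along these lines might look like the sketch below. It assumes echo_server exports the start_link/1 shown above; the child/1 helper is a hypothetical convenience added here to avoid repeating the child spec, not something from the original code.

```erlang
-module(echo_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

%% Return supervisor:start_link/3's result unchanged and keep the link.
start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

%% Two children running the same module, each with a unique id and
%% its own registered name.
init(_Args) ->
    {ok, {{one_for_one, 5, 60},
          [child(echo_server), child(echo_server2)]}}.

%% Hypothetical helper: Name serves as both the child id and the
%% registered name passed to echo_server:start_link/1.
child(Name) ->
    {Name, {echo_server, start_link, [Name]},
     permanent, brutal_kill, worker, [echo_server]}.
```

Once it is running, supervisor:which_children(echo_sup) should list both children. Note that API functions such as echo/1 would then also need to take the registered name as an argument, since there is no longer a single fixed name to call.
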