1
votes

Cowboy is a web server written in Erlang. It spawns a new process for each request and then reuses that process for subsequent requests if the client uses HTTP pipelining (sending multiple requests on the same socket one after the other without waiting for the responses, on the assumption that the responses will be sent back in the same order as the requests).

This is fine, but if you want to use that web server to build a realtime web app, it has one problem: when the socket is closed, for instance because of client network problems, the process representing that socket on the server is terminated. That means you can't use that process for storing session data. In a realtime web app you probably want to go beyond the end of the HTTP request (if long polling is used, for instance), keep some state associated with the connected client, and still treat the client as "online" even after the HTTP request has ended.

In sock.js, this is solved by spawning one more process for each client (one per session ID).

So if you have 2000 clients using WebSockets, you will have around 4000 processes: one Cowboy process that represents the socket, and one more that keeps the session state alive in case the Cowboy process is terminated (for instance because of network problems).
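To make the two-process layout concrete, here is a minimal sketch of a session process that outlives the connection process. It is not taken from sock.js or Cowboy; the module name `session_sketch` and its functions are hypothetical, and a real application would more likely use a `gen_server` under a supervisor. The session monitors the connection process rather than linking to it, so a dying connection only clears the `conn` field.

```erlang
-module(session_sketch).
-export([start/1, attach/2, get_state/1, stop/1]).

%% Start a session process. It is spawned without a link, so it keeps
%% running when the connection process that created it exits.
start(SessionId) ->
    spawn(fun() ->
        loop(#{id => SessionId, conn => undefined, mon => undefined})
    end).

%% Tell the session which connection process currently serves the client.
attach(Session, ConnPid) ->
    Session ! {attach, ConnPid},
    ok.

%% Synchronously read the session state (id and current connection pid).
get_state(Session) ->
    Ref = make_ref(),
    Session ! {get_state, self(), Ref},
    receive
        {Ref, State} -> State
    after 1000 ->
        timeout
    end.

stop(Session) ->
    Session ! stop,
    ok.

loop(State = #{mon := OldMon}) ->
    receive
        {attach, ConnPid} ->
            %% Drop the monitor on the previous connection, if any.
            OldMon =/= undefined andalso erlang:demonitor(OldMon, [flush]),
            Mon = erlang:monitor(process, ConnPid),
            loop(State#{conn := ConnPid, mon := Mon});
        {'DOWN', Mon, process, _Pid, _Reason} when Mon =:= OldMon ->
            %% The connection died; the session and its state stay alive.
            loop(State#{conn := undefined, mon := undefined});
        {'DOWN', _OtherMon, process, _Pid, _Reason} ->
            loop(State);
        {get_state, From, Ref} ->
            From ! {Ref, maps:with([id, conn], State)},
            loop(State);
        stop ->
            ok
    end.
```

When the client reconnects, the new connection process would call `session_sketch:attach/2` again with its own pid, and the session carries on with the state it already holds.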

THE QUESTION IS: I am relatively new to Erlang, so I don't know whether this makes much sense as a performance improvement, but I am thinking about rewriting the Cowboy web server a bit so that the process representing a realtime connection does not end until I want it to (the process would stay alive even after the underlying WebSocket connection is terminated).

This would eliminate the need for one more session process per client, so instead of 4000 processes you would have just 2000. Could this be a huge performance boost in Erlang?

2
Remember, an Erlang process is nothing like an OS process. Erlang processes are not expensive and are best thought of as tools for structuring computation. It's likely a Cowboy rewrite would not help performance. It might even hurt if your only goal is to reduce the number of processes. - Corbin March
Why did you tag this question with Elixir? - Onorio Catenacci
@OnorioCatenacci: well, they both run on the Erlang VM and the question is valid even for Elixir, but if you want... - Krab
You're misunderstanding me. I was asking why you tagged the question with Elixir because I thought there might be some additional Elixir-specific details you had forgotten to add. - Onorio Catenacci
Perhaps you could try storing session data in ETS and see if that's good enough for your needs. I feel like that'd be a common approach. - Michael Terry

2 Answers

1
votes

Erlang is pretty good with processes, but too much of anything ain't good. Using processes as direct mappings to sessions is not a good idea. Why not do it logically? I assume you can have some in-memory storage, say ETS or even Mnesia.

If you are using WebSockets to communicate, each user is connected via one such process. You simply map a random, unique session key to each individual process, and hence to each individual user.

-record(client,{web_sock_pid, session_key,username}).

If the process exits and the client has a way of reconnecting, then once it re-identifies itself as the same user the session key still holds. Only the pid of the attached process has changed, and that does not matter.

If it is NOT WebSockets but plain HTTP REST/JSON/JSONP/XML services, it is even easier. Use ETS tables in RAM. A new session and the parameters defining it are stored in RAM; each request then carries the session key along with its other parameters. Message delivery is by Comet or by frequent polling from the client.
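The approach described in this answer could look roughly like the sketch below. It reuses the `client` record from above; the module name `session_table_sketch`, the table name `sessions`, and the function names are hypothetical choices for illustration. Because the table is a `set` keyed on `session_key`, storing the same key again after a reconnect simply overwrites the old pid.

```erlang
-module(session_table_sketch).
-export([init/0, new_key/0, store/3, lookup/1]).

-record(client, {web_sock_pid, session_key, username}).

%% Create the ETS table, keyed on the session_key field of the record.
%% The calling process owns the table, so in a real system this should be
%% a long-lived process; the table is deleted when its owner exits.
init() ->
    ets:new(sessions, [named_table, public, set,
                       {keypos, #client.session_key}]).

%% Generate a random session key (16 random bytes, base64-encoded).
new_key() ->
    base64:encode(crypto:strong_rand_bytes(16)).

%% Insert or overwrite the session. On reconnect the same key is stored
%% again with the pid of the new connection process.
store(SessionKey, Pid, Username) ->
    true = ets:insert(sessions, #client{web_sock_pid = Pid,
                                        session_key = SessionKey,
                                        username = Username}),
    ok.

%% Look up a session by the key the client sends with each request.
lookup(SessionKey) ->
    case ets:lookup(sessions, SessionKey) of
        [#client{web_sock_pid = Pid, username = Username}] ->
            {ok, Pid, Username};
        [] ->
            not_found
    end.
```

For the plain HTTP case the pid field could be left as `undefined`, with further session parameters added as extra record fields.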

1
votes

Sounds like you are doing some premature optimizations if you ask me.

Erlang processes are very inexpensive. You shouldn't really have to worry about spawning too many processes.

Write it with two processes per websocket, then do some measurements to see where it is using the most memory and wasting the most cpu cycles.
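As a rough first measurement, something like the following sketch spawns N idle processes and reports how much the VM's process memory grew. The module name `proc_cost_sketch` is hypothetical, and the figure is only a lower bound: idle processes with no state are cheaper than real session processes, and the number also includes a little caller-side overhead for the list of pids.

```erlang
-module(proc_cost_sketch).
-export([measure/1]).

%% Spawn N idle processes, compare erlang:memory(processes) before and
%% after, then shut the processes down again.
measure(N) when is_integer(N), N > 0 ->
    Before = erlang:memory(processes),
    Pids = [spawn(fun() -> receive stop -> ok end end)
            || _ <- lists:seq(1, N)],
    After = erlang:memory(processes),
    [Pid ! stop || Pid <- Pids],
    Total = After - Before,
    #{count => N,
      total_bytes => Total,
      bytes_per_process => Total div N}.
```

Calling `proc_cost_sketch:measure(4000).` in the Erlang shell gives a feel for what the extra 2000 processes cost on your machine; for per-process detail in the real application, `erlang:process_info(Pid, memory)` can be used on individual session processes.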