I'm trying to parallelize my calculations with rpc:pmap, but I'm a bit confused by its performance.
Here is a simple example:
-module(my_module).
-compile(export_all).
do_apply( X, F ) -> F( X ).
First, a test on a single node:
1> timer:tc( rpc, pmap, [{my_module, do_apply}, [fun(X) -> timer:sleep(10), X end], lists:seq(1,10000)] ).
{208198,
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27|...]}
After that I connected a second node (a second Erlang shell process on the same OS):
([email protected])24> timer:tc( rpc, pmap, [{my_module, do_apply}, [fun(X) -> timer:sleep(10), X end], lists:seq(1,10000)] ).
{446284,
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27|...]}
Finally, I connected a third node:
([email protected])26> timer:tc( rpc, pmap, [{my_module, do_apply}, [fun(X) -> timer:sleep(10), X end], lists:seq(1,10000)] ).
{483399,
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27|...]}
So I get worse performance with three nodes than with a single node.
I realize that there is some overhead for communication between nodes. But how can I tell in which cases it is better to perform calculations on multiple nodes?
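To put a rough number on that communication overhead, one option is to time a trivial call that does almost no work, so that nearly everything measured is the round trip itself. The following is only a minimal sketch: the module name rpc_overhead and the function avg_call_us/2 are hypothetical names made up for this illustration, and it assumes the target node is already connected and reachable.

```erlang
-module(rpc_overhead).
-export([avg_call_us/2]).

%% Hypothetical helper (not part of OTP): average wall-clock time,
%% in microseconds, of N sequential rpc:call/4 round trips to Node.
%% The remote work is just erlang:node/0, so the result is dominated
%% by call overhead rather than by computation.
avg_call_us(Node, N) when is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> loop(Node, N) end),
    Micros / N.

loop(_Node, 0) ->
    ok;
loop(Node, N) ->
    %% Node is already bound, so this match also asserts that the call
    %% really ran on the target node and did not return {badrpc, _}.
    Node = rpc:call(Node, erlang, node, []),
    loop(Node, N - 1).
```

Comparing rpc_overhead:avg_call_us(node(), 1000) with rpc_overhead:avg_call_us(hd(nodes()), 1000) gives an estimate of the extra cost per remote call, which can then be weighed against how long each element's real work takes.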
Edit:
My step-by-step test from the shell:
1> c(my_module).
{ok,my_module}
2>
2> List = lists:seq(1,10000).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27,28,29|...]
Test performance on a single node:
3> timer:tc( rpc, pmap, [{my_module, do_apply}, [fun(X)-> timer:sleep(10), X end], List] ).
{207346,
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27|...]}
Start distribution so the shell becomes a network node:
4> net_kernel:start([one]).
{ok,<0.20066.0>}
([email protected])5> erlang:set_cookie(node(), foobar).
true
Add a second node:
([email protected])6> net_kernel:connect('[email protected]').
true
([email protected])7>
([email protected])7> nodes().
['[email protected]']
Test performance with two nodes:
([email protected])8> timer:tc( rpc, pmap, [{my_module, do_apply}, [fun(X)-> timer:sleep(10), X end], List] ).
{510733,
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27|...]}
Connect a third node:
([email protected])9> net_kernel:connect('[email protected]').
true
([email protected])10> nodes().
['[email protected]',
'[email protected]']
Test performance with three nodes:
([email protected])11> timer:tc( rpc, pmap, [{my_module, do_apply}, [fun(X)-> timer:sleep(10), X end], List] ).
{496278,
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27|...]}
P.S. I suspect that performance decreases because I'm creating each node as a new Erlang shell process on the same physical machine, but I'm not sure whether that's right.