I'm fitting models to several datasets with fminsearch, and I'm trying to run the fits in parallel. My code runs up to the start of the parfor loop, but the parfor loop seems to take forever to start! (The first line inside the parfor is never executed.) There are no errors; MATLAB just remains "busy".
I'm running on a local cluster with 4 workers, started with matlabpool 4, which appears to start up fine. I'm running MATLAB R2014b 64-bit on Ubuntu 14.04.3, eight core i7-3770K @ 3.50GHz, 24GiB RAM (most of which is unused, of course).
EDIT
Here is code that reproduces the problem!
file matlab_parfor_test_2
function f=matlab_parfor_test_2
f={};
for i=1:400
    a=@(p)i*p; % make some functions depending on i
    b=@(p)a(p)+0; % a function depending on this
    f=[f { @(p)b(p) }]; % create a list of i functions using this
end
file matlab_parfor_test_1
function matlab_parfor_test_1
f=matlab_parfor_test_2(); % create the functions
f=f(1:2); % discard all but two functions
for i=1:2 % for each function ('A')
    parfor j=1 % dummy parfor
        tmp=f{i}; % just read a function from the cell ('B')
    end
end
The time taken to get from 'A' to the first 'B' (i.e. the time taken to "enter" the parfor) on my machine, depending on how many functions matlab_parfor_test_2 returns, is:
returning 400 functions: 20 sec
500 functions: 32 sec
600 functions: 45 sec
700 functions: 64 sec
This is very odd, because in test_1 I discard all but 2 of those functions! Why should the discarded functions slow things down?
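The timings above could be reproduced for different list sizes with something like the following. This is only a sketch: time_parfor_entry is a hypothetical helper (not one of the two files above), it builds the handles inline in the same way as matlab_parfor_test_2, and it assumes the pool is already open so that pool start-up is not counted. It times the whole dummy parfor rather than just the entry, which should be close enough since the body is trivial.

```matlab
function t = time_parfor_entry(n)
% Hypothetical helper: build n nested anonymous functions, keep only two,
% and time a single dummy parfor that reads one of them.
f = {};
for i = 1:n
    a = @(p) i*p;            % depends on the loop index
    b = @(p) a(p) + 0;       % depends on a
    f = [f {@(p) b(p)}];     % append a handle that depends on b
end
f = f(1:2);                  % discard all but two handles

t0 = tic;
parfor j = 1                 % dummy parfor, as in matlab_parfor_test_1
    tmp = f{1};              % just read a handle from the cell
end
t = toc(t0);                 % elapsed seconds for the whole parfor
end
```

It could then be called as, for example, `for n = [400 500 600 700], fprintf('%d functions: %.1f s\n', n, time_parfor_entry(n)); end`.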
I thought perhaps MATLAB was not actually deleting the unwanted functions in f, so I tried replacing f=f(1:2) with
f={f{1}, f{2}};
but this also did not help.
If I replace the parfor with for, then of course it takes under 1ms to execute.
Any ideas??
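One way to look at what a surviving handle carries with it is the documented functions() call, which reports the variables an anonymous function has captured. The sketch below is self-contained and uses a short list (3 handles) built the same way as in matlab_parfor_test_2; swapping in f=matlab_parfor_test_2() should work as well. Using the size of a saved .mat file as a proxy for what parfor has to serialize is an assumption, not something confirmed here.

```matlab
% Build a few nested anonymous functions, as in matlab_parfor_test_2.
f = {};
for i = 1:3
    a = @(p) i*p;
    b = @(p) a(p) + 0;
    f = [f {@(p) b(p)}];
end
f = f(1:2);                      % keep only two handles

% Inspect the first surviving handle.
info = functions(f{1});          % struct describing the handle
disp(info.function);             % text of the anonymous function
ws = info.workspace{1};          % struct of captured variables
disp(fieldnames(ws));            % names of the captured variables

% Rough proxy for serialized size (assumption): save one handle to a
% temporary MAT-file and look at the file size.
h = f{1};
fname = [tempname '.mat'];
save(fname, 'h');
d = dir(fname);
fprintf('one handle saved to %d bytes\n', d.bytes);
delete(fname);
```

If the reported size grows with the total number of handles created, rather than with the number kept, that would suggest each handle drags along more than the variables it visibly uses.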
OLD VERSION OF QUESTION
function fit_all
models = createModelFunctions(); % creates cell of function handles
data = { [1 2 3], [1 2 3] }; % create 2 data sets
for i = 1:length(models)
    fprintf('model %g\n',i);
    parfor j = 1:length(data)
        fprintf('data %g\n',j);
        tmp = models{i}; % if I comment this line out, it runs fine!
        % p(j) = fminsearch(@(p)models{j}(p,data{j}), [0 0]);
    end
end
The model functions are created in another file:
function models = createModelFunctions()
models{1} = @(p,d) likelihoodfun(0,0,p,d);
models{2} = @(p,d) likelihoodfun(1,0,p,d);
function L = likelihoodfun(a,b,p,d)
L = some maths here;
Running fit_all, I expected to see a list of model 1, data 1, data 2, model 2, etc. The output I'm getting is
model 1
then the thing just stops: no prompt, MATLAB says "busy", and the UI and OS are as responsive as usual. The system monitor shows only 1 core is active. It never makes it into the parfor.
If I press ctrl+C at this point, after a 3-minute delay I get
Operation terminated by user during parallel.internal.pool.serialize (line 21)
In distcomp.remoteparfor (line 69)
serializedInitData = parallel.internal.pool.serialize(varargin);
In parallel_function>iMakeRemoteParfor (line 1060)
P = distcomp.remoteparfor(pool, W, @make_channel, parfor_C);
In parallel_function (line 444)
[P, W] = iMakeRemoteParfor(pool, W, parfor_C);
If I comment out the line indicated, it works -- so the problem seems to be when I access the model functions... Similarly, it works fine if I change models to
models={@sum,@sum}
i.e. it's just when I'm using function handles from another file...
L=0) and it works fine. - Itamar Katz