I have been running a program literally hundreds of times but recently found that one input parameter set causes the following error:
In DElambda at 116
In parallel_function>make_general_channel/channel_general at 879
In remoteParallelFunction at 31
??? Error using ==> parallel_function at 598
The session that parfor is using has shut down
Error in ==> CreateCurve at 86
parfor j=1:10
??? The client lost connection to an unknown lab.
This might be due to network problems, or the interactive matlabpool job might have errored. This is
causing: java.lang.OutOfMemoryError: GC overhead limit exceeded
It happens when I set the bounds of the parameter search space to min = [0;0] and max = [1.5;1.5] and the population size to 10k (the algorithm is differential evolution). I have not changed any of the other parameters at any point. Every run with these settings produces the error above.
However, when I drop the population size to 1k it converges (to an incorrect answer, because the search is too sparse). Alternatively, with a population size of 10k and any other parameter set I have tried, it works perfectly and converges to the correct solution.
This seems very odd.
I am currently re-running the problem parameter set using a for loop rather than the parfor loop (with matlabpool switched off) to see if this runs any better. Unfortunately this is very time-consuming, so I won't know the results for a while.
In the meantime, can anyone explain what is causing this error, and/or tell me how to debug parallel code?
Just to add: the code ran fine with the rogue parameter set when I used for instead of parfor! So I really need to find some way of debugging in the parallel environment so that I can isolate and fix this bug. Using for rather than parfor is just too slow!
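One way to get more information out of a parfor loop is to wrap the body in try/catch and store each iteration's error message in a sliced cell array. The following is only a minimal sketch: stepFcn and vals are hypothetical stand-ins for the real per-iteration work, not the code in CreateCurve. A hard crash of a worker session (as in the trace above) may never reach the catch block, so this mainly helps with ordinary MATLAB errors.

```matlab
% Hypothetical stand-in for the real loop body: indexing past the end
% of vals makes iterations 4 and 5 fail on purpose.
vals = [1 2 3];
stepFcn = @(j) vals(j);

nIter  = 5;
result = nan(1, nIter);
errMsg = cell(1, nIter);

parfor j = 1:nIter
    try
        % feval avoids parfor treating stepFcn(j) as a sliced variable
        result(j) = feval(stepFcn, j);
        errMsg{j} = '';
    catch err
        % Record the failure instead of aborting the whole loop
        result(j) = NaN;
        errMsg{j} = err.message;
    end
end

% List the iterations that failed and why
failed = find(~cellfun(@isempty, errMsg));
for k = failed
    fprintf('iteration %d failed: %s\n', k, errMsg{k});
end
```

Without an open pool the parfor runs like an ordinary for loop, so the same script can be used to compare serial and parallel behaviour.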
As the line
java.lang.OutOfMemoryError: GC overhead limit exceeded
says, this is an out-of-memory error. Without seeing the code, it's hard to tell why those specific inputs cause it. – Oleg
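To check whether memory really grows with the problem parameter set, one option is to record the Java heap usage inside each iteration. This is a rough sketch, assuming the JVM is available in the MATLAB session that runs the loop body; the comment line marks where the real work (for example the call into DElambda) would go, and usedMB and maxMB are illustrative names. It measures only the Java heap of the process running the iteration, so it may not show the whole picture.

```matlab
nIter  = 10;
usedMB = zeros(1, nIter);
maxMB  = zeros(1, nIter);

parfor j = 1:nIter
    % ... the real per-iteration work would go here ...

    % Query the Java heap of the process running this iteration
    rt = java.lang.Runtime.getRuntime();
    usedMB(j) = double(rt.totalMemory() - rt.freeMemory()) / 2^20;
    maxMB(j)  = double(rt.maxMemory()) / 2^20;
end

% Print used heap against the configured maximum for each iteration
for j = 1:nIter
    fprintf('iteration %2d: %.1f MB used of %.1f MB max\n', ...
            j, usedMB(j), maxMB(j));
end
```

If the used figure approaches the maximum only with the 10k population and the [0;0] to [1.5;1.5] bounds, that would point to the loop body (or the data sent to the workers) as the source of the memory pressure.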