We are using OpenMP (libgomp) in order to speed up some calculations in a multithreaded Qt application. The parallel OpenMP sections are located within two different threads, though in fact they never execute in parallel. What we observe in this case is that 2N (where N = OMP_THREAD_LIMIT
) omp threads are launched, apparently interfering each with the other. The calculation time is very high, while the processor load is low. Setting OMP_WAIT_POLICY
hardly has any effect.
We also tried moving all the omp sections to a single thread (this is not a good solution for us, though, from an architectural point of view). In this case, the overall calculation time does drop and the processor is fully loaded, but only if OMP_WAIT_POLICY
is set to ACTIVE
. When OMP_WAIT_POLICY == PASSIVE
, the calculation time remains low and the processor is idle 50% of time.
Odd enough, when we use omp within a single thread, the first loop parallelized using omp (in a series of omp calulations) executes 10 times slower compared to the multithread case.
Upd: Our questions are:
a) is there any way to reuse the openmp threads when using omp in the context of different threads.
b) Why executing with OMP_WAIT_POLICY == PASSIVE
slows everything. Does it take so long to wake the threads?
c) Is there any logical explanation for the phenomenon of the first parallel block being so slow (even when waiting in active mode)
Upd2: Please mind that the issue is probably related to GNU OMP implementation. icc doesn't have it.