0
votes

My understanding, from reading the Intel MKL documentation and posts such as this -- Calling multithreaded MKL in from openmp parallel region -- is that combining OpenMP parallelization in your own code with MKL's internal OpenMP threading (for functions such as DGESVD or DPOTRF) is impossible unless you build with the Intel compiler. For example, I have a large linear system I'd like to solve using MKL, but I'd also like to take advantage of parallelization to build the system matrix (my own code, independent of MKL), in the same binary executable.

Intel states in the MKL documentation that third-party compilers "may have to disable multithreading" for MKL functions. So the options seem to be:

  1. OpenMP parallelization of your own code (standard #pragma omp ... etc.) and single-threaded calls to MKL
  2. multithreaded calls to MKL functions only, and single-threaded code everywhere else
  3. use the Intel compiler (I would like to use gcc, so not an option for me)
  4. parallelize both your code and MKL with Intel TBB? (not sure whether this would work)

Of course, MKL ships with its own OpenMP runtime, libiomp*, which gcc can link against. Is it possible to use this library to parallelize your own code in addition to the MKL functions? I assume some direct management of threads would be involved. However, as far as I can tell, no iomp development headers are included with MKL, which may itself answer that question (--> NO).

So it seems at this point like the only answer is Intel TBB (Threading Building Blocks). Just wondering if I'm missing something or if there's a clever workaround.

(Edit:) Another solution would be if MKL had an interface that accepts custom C++11 lambdas or other arbitrary code (e.g., containing nested for loops) for parallelization via whatever internal threading scheme it uses. So far I haven't seen anything like this.

1
I highly suggest you simply test this out on a simple example. I have had no problems running OpenMP and MKL together when compiling with gcc (on Linux). In my case it even worked with nested calls to parallel MKL inside OpenMP parallel regions, which is what I imagine that section was referring to (as I see no good reason OpenMP couldn't work elsewhere). – Qubit
Both the Intel and LLVM OpenMP runtime libraries (which are effectively the same) provide the interfaces used by GCC-compiled OpenMP code. Therefore you can mix your own GCC-compiled OpenMP code with MKL in its OpenMP mode, provided that you ensure that the Intel (or LLVM) OpenMP runtime is used. That may require careful use of LD_LIBRARY_PATH and renaming of that runtime (so that it appears as libgomp), or use of LD_PRELOAD... – Jim Cownie
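A rough sketch of the runtime-substitution route described in the comment above. Both the link line and the libiomp5.so path are assumptions for a typical Linux MKL install and will differ per system; the MKL Link Line Advisor gives the authoritative flags.

```shell
# Compile your own OpenMP code with gcc as usual (illustrative link line):
gcc -fopenmp my_code.c -lmkl_gf_lp64 -lmkl_gnu_thread -lmkl_core \
    -lgomp -lpthread -lm -ldl

# Alternatively, force the Intel/LLVM OpenMP runtime to satisfy both your
# code's and MKL's OpenMP symbols at load time (path is an assumption):
LD_PRELOAD=/opt/intel/lib/intel64/libiomp5.so ./a.out
```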

1 Answer

2
votes

Intel TBB will also enable better nested parallelism, which might help in some cases. If you want to use GNU OpenMP with MKL, there are the following options:

  • Dynamically Selecting the Interface and Threading Layer. Link against the mkl_rt library and then
    • set the environment variable MKL_THREADING_LAYER=GNU before MKL is loaded,
    • or call mkl_set_threading_layer(MKL_THREADING_GNU);
  • Linking with Threading Libraries directly (though that page does not mention GNU OpenMP explicitly). Link against mkl_gnu_thread. This is not recommended when you are building a library, a plug-in, or an extension module (e.g., a Python package) that can be mixed with other components that might use MKL differently.
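For the first option (the single dynamic library, mkl_rt), the two selection mechanisms might look like the following sketch; the link line is illustrative and installation-specific, and assumes MKL's directories are already on the compiler and loader search paths.

```shell
# Link against the single dynamic library:
gcc -fopenmp my_code.c -lmkl_rt -lm

# Either select the GNU threading layer via the environment before loading MKL...
MKL_THREADING_LAYER=GNU ./a.out

# ...or, equivalently, call mkl_set_threading_layer(MKL_THREADING_GNU);
# in the program itself, before any other MKL function is invoked.
```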