Can we run OpenMP parallelized code on several nodes?

Question

Suppose we have four 16-core nodes (node1, node2, node3, node4). How can I run a large parallel program on node1, node2 and node3 at the same time? Or even use 16 cores in total, allocated as 7 cores on node1 + 8 cores on node2 + 1 core on node3 (the rest being occupied)?
Is MPI the usual way? Does OpenMP alone suffice? I haven't learned MPI, but I have used OpenMP within a single node.

It is certainly not the only way. For example, Erlang, Oz and many other languages have their own way of job distribution, and PVM is another library that is commonly compared to MPI. But if you're in C or Fortran, MPI is a pretty good choice. — Amadan

Sid Sid · Accepted Answer · 2015-01-08T07:28:01

You can use a combination of OpenMP and MPI if required. OpenMP by itself is a shared-memory model, so it only uses the cores within a single node; to span several nodes you need something like MPI to pass data between them.

MPI on its own can use every core on every node, and implementations are optimized to exploit locality when they find that communicating tasks are on the same machine. The drawback is that an existing code base has to change a lot to adopt it. OpenMP is the recommended way to parallelize code incrementally, so you might want a hybrid in which each MPI task uses the cores of its node through OpenMP. So:

number of nodes = number of MPI tasks
number of cores per machine = number of OpenMP threads per machine
