Hi, I'm currently working on a program that already runs in parallel using MPI. I was wondering whether I could gain additional speed in the for loops by using OpenMP, so that I get more out of each processor. Would I gain anything from doing this? And how would I go about it?

The combination of MPI+OpenMP is widely used. Your favourite search engine is a much better place to start than SO, though you'll find plenty of questions on the topic here. - High Performance Mark
There is no generic answer to this question; some MPI applications benefit from threading, some do not. For instance, for our HPC application (lsu3shell.sourceforge.net), threading brought a significant improvement, but not in terms of lower running times. We saved a lot of memory by sharing data structures among threads (structures that were otherwise duplicated in the separate address spaces of the MPI processes running on the same node), which allowed us to solve larger computational problems. - Daniel Langr

1 Answer

From experience, it really depends on your problem and on how many MPI processes you are using.

Using a large number of MPI processes usually improves data locality, but your parallelization might not allow a large number of processes.

The assumption that you will certainly gain a decent speedup is very often wrong :-(... But once you reach the point where you can't use more MPI processes because parallel efficiency drops, adding threads will probably let you use more cores efficiently.

From experience, you should target a small number of threads (4-8, or about half of the core count of a socket), especially if you only have small loops (which should be the case once you have reached the maximum number of MPI processes).
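As for how to go about it, here is a minimal sketch of the loop-level approach, not a definitive implementation: each MPI process keeps its own slice of the data, and OpenMP threads share the loop over that slice. The names `local_range` and `local_dot` are made up for illustration, and the dot product is only a stand-in for your real loop body. The MPI calls are left as comments so that the sketch compiles without an MPI installation.

```c
#include <assert.h>
#include <stddef.h>

/* Block-partition the index range [0, n) over nranks processes.
   In a real hybrid code, rank and nranks would come from
   MPI_Comm_rank and MPI_Comm_size (hypothetical helper). */
void local_range(size_t n, int rank, int nranks, size_t *begin, size_t *end)
{
    assert(nranks > 0 && rank >= 0 && rank < nranks);
    size_t base = n / (size_t)nranks;   /* minimum slice length */
    size_t rem  = n % (size_t)nranks;   /* first rem ranks get one extra */
    size_t r    = (size_t)rank;
    *begin = r * base + (r < rem ? r : rem);
    *end   = *begin + base + (r < rem ? 1 : 0);
}

/* Dot product over this rank's slice (hypothetical stand-in kernel).
   The loop iterations are shared among OpenMP threads; the reduction
   clause gives each thread a private partial sum and adds them up at
   the end. If OpenMP is not enabled at compile time, the pragma is
   ignored and the loop runs serially.
   Each rank would then combine its partial result with, e.g.,
   MPI_Allreduce(&sum, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD). */
double local_dot(const double *a, const double *b, size_t begin, size_t end)
{
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (size_t i = begin; i < end; ++i)
        sum += a[i] * b[i];
    return sum;
}
```

If only the main thread makes MPI calls, as in this pattern, you would typically initialise with `MPI_Init_thread` and request `MPI_THREAD_FUNNELED` instead of calling plain `MPI_Init`. The number of threads per process is usually set through `OMP_NUM_THREADS`, for example `OMP_NUM_THREADS=4 mpirun -np 8 ./app`, where `./app` is a placeholder for your executable. Measure against the pure-MPI run before keeping the change.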

A good intro to hybrid parallelism: http://www.openmp.org/press-release/sc13-tutorial-hybrid-mpi-openmp-parallel-programming/