I have found an issue in my hybrid MPI/OpenMP code, reproduced in its simplest form in the code below. I am using 2 threads per MPI rank. These two threads are used in an OpenMP sections construct to do several computations, one of which consists of making mpi_allreduce calls on two different vectors, A and B, with the results stored in W and WW. The problem is that every time I run the program I get a different output. My guess is that the MPI calls are overlapping and the reduced arrays W and WW get mixed up even though they have different names, but I am not sure. Any comment on how to overcome this issue is welcome.
Details: The MPI thread level is initialized to MPI_THREAD_MULTIPLE in the code, but I have also tried serial and funneled (with the same issue).
I compile the code with mpiifort -openmp allreduce_omp_mpi.f90 and run it with:
export OMP_NUM_THREADS=2
mpirun -np 3 ./a.out
PROGRAM HELLO
  use mpi
  use omp_lib
  IMPLICIT NONE
  INTEGER nthreads, tid
  Integer Provided,mpi_err,myid,nproc
  CHARACTER(MPI_MAX_PROCESSOR_NAME):: hostname
  INTEGER :: nhostchars
  integer :: i
  real*8 :: A(1000), B(1000), W(1000),WW(1000)

  provided=0
  !Initialize MPI context
  call mpi_init_thread(MPI_THREAD_MULTIPLE,provided,mpi_err)
  CALL mpi_comm_rank(mpi_comm_world,myid,mpi_err)
  CALL mpi_comm_size(mpi_comm_world,nproc,mpi_err)
  CALL mpi_get_processor_name(hostname,nhostchars,mpi_err)

  !Initialize arrays
  A=1.0
  B=2.0

  !Check if MPI_THREAD_MULTIPLE is available
  if (provided >= MPI_THREAD_MULTIPLE) then
    write(6,*) ' mpi_thread_multiple provided',myid
  else
    write(6,*) ' not mpi_thread_multiple provided',myid
  endif

  !$OMP PARALLEL PRIVATE(nthreads, tid) NUM_THREADS(2)
  !$omp sections
  !$omp section
  call mpi_allreduce(A,W,1000,mpi_double_precision,mpi_sum,mpi_comm_world,mpi_err)
  !$omp section
  call mpi_allreduce(B,WW,1000,mpi_double_precision,mpi_sum,mpi_comm_world,mpi_err)
  !$omp end sections
  !$OMP END PARALLEL

  write(6,*) 'W',(w(i),i=1,10)
  write(6,*) 'WW',(ww(i),i=1,10)
  CALL mpi_finalize(mpi_err)
END
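One possible way to rule out overlapping collectives is sketched below. It is not a confirmed diagnosis of the program above. It relies on the MPI rule that when several threads issue collective calls on the same communicator, the user must make sure the calls are ordered consistently on every rank. Giving each section its own duplicate of MPI_COMM_WORLD removes that ordering requirement. The names comm_a, comm_b, ierr_a and ierr_b are hypothetical and chosen only for illustration.

```fortran
program allreduce_two_comms
  use mpi
  implicit none
  integer :: provided, ierr, myid
  integer :: ierr_a, ierr_b        ! one error flag per section, so the threads do not share one
  integer :: comm_a, comm_b        ! hypothetical names: one communicator per section
  double precision :: A(1000), B(1000), W(1000), WW(1000)

  call mpi_init_thread(MPI_THREAD_MULTIPLE, provided, ierr)
  if (provided < MPI_THREAD_MULTIPLE) then
    ! Concurrent MPI calls from several threads require MPI_THREAD_MULTIPLE
    call mpi_abort(MPI_COMM_WORLD, 1, ierr)
  end if
  call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr)

  ! mpi_comm_dup is itself collective, so it is called outside the parallel region
  call mpi_comm_dup(MPI_COMM_WORLD, comm_a, ierr)
  call mpi_comm_dup(MPI_COMM_WORLD, comm_b, ierr)

  A = 1.0d0
  B = 2.0d0

  !$omp parallel sections num_threads(2)
  !$omp section
  call mpi_allreduce(A, W, 1000, MPI_DOUBLE_PRECISION, MPI_SUM, comm_a, ierr_a)
  !$omp section
  call mpi_allreduce(B, WW, 1000, MPI_DOUBLE_PRECISION, MPI_SUM, comm_b, ierr_b)
  !$omp end parallel sections

  write(*,*) 'W ', W(1:10)
  write(*,*) 'WW', WW(1:10)

  call mpi_comm_free(comm_a, ierr)
  call mpi_comm_free(comm_b, ierr)
  call mpi_finalize(ierr)
end program allreduce_two_comms
```

With 3 ranks, as in the mpirun line above, every element of W should be the sum of three 1.0 values and every element of WW the sum of three 2.0 values, whichever thread reaches its call first.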
Do not name your own variables with the prefix mpi_. Doing so might result in name clashes with the MPI library. Also, there is no guarantee that DOUBLE_PRECISION and REAL*8 are one and the same type. – Hristo Iliev
Do not use write(6, but write(*, instead. It is no longer to type and is more portable. – Vladimir F
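The comments above appear to make three suggestions: avoid the mpi_ prefix for user variables, declare buffers with a type that matches the MPI datatype passed to the call, and write to unit * rather than unit 6. A minimal sketch of those suggestions follows. The program name portable_kinds and the variable name ierr are hypothetical, and the array length is shortened for illustration.

```fortran
program portable_kinds
  use mpi
  implicit none
  integer :: ierr, myid                ! "ierr" rather than an mpi_-prefixed name
  double precision :: A(10), W(10)     ! the Fortran type that MPI_DOUBLE_PRECISION describes

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr)

  A = 1.0d0
  call mpi_allreduce(A, W, size(A), MPI_DOUBLE_PRECISION, MPI_SUM, MPI_COMM_WORLD, ierr)

  ! Unit * is the processor's default output unit; unit 6 is only a common convention
  if (myid == 0) write(*,*) 'W', W

  call mpi_finalize(ierr)
end program portable_kinds
```

If the real*8 declarations are kept instead, MPI_REAL8 is the datatype intended to match them. The MPI standard lists it as an optional type, so its availability depends on the MPI library.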