The MPI standard 3.0 says in Section 5.13 that
Finally, in multithreaded implementations, one can have more than one, concurrently executing, collective communication call at a process. In these situations, it is the user’s re- sponsibility to ensure that the same communicator is not used concurrently by two different collective communication calls at the same process.
I wrote the following program which does NOT execute correctly (but compiles) and dumps a core
void main(int argc, char *argv[])
{
int required = MPI_THREAD_MULTIPLE, provided, rank, size, threadID, threadProcRank ;
MPI_Comm comm = MPI_COMM_WORLD ;
MPI_Init_thread(&argc, &argv, required, &provided);
MPI_Comm_size(comm, &size);
MPI_Comm_rank(comm, &rank);
int buffer1[10000] = {0} ;
int buffer2[10000] = {0} ;
#pragma omp parallel private(threadID,threadProcRank) shared(comm, buffer1)
{
threadID = omp_get_thread_num();
MPI_Comm_rank(comm, &threadProcRank);
printf("\nMy thread ID is %d and I am in process ranked %d", threadID, threadProcRank);
if(threadID == 0)
MPI_Bcast(buffer1, 10000, MPI_INTEGER, 0, comm);
If (threadID == 1)
MPI_Bcast(buffer1, 10000, MPI_INTEGER, 0, comm);
}
MPI_Finalize();
}
My question is: Two threads in each process having thread ID 0 and thread ID 1 post a broadcast call which can be taken as a MPI_Send() in the root process ( i.e. process 0). I am interpreting it as two loops of MPI_Send() where the destination is the remaining processes. The destination processes also post MPI_Bcast() in thread ID 0 and thread ID 1. These can be taken as two MPI_Recv()'s posted by each process in the two threads. Since the MPI_Bcast() are identical - there should be no matching problems in receiving the messages sent by Process 0 (the root). But still the program does not work. Why ? Is it because of the possibility that messages might get mixed up on different/same collectives on the same communicator ? And since MPI (mpich2) sees the possibility of this, it just does not allow two collectives on the same communicator pending at the same time ?