0 votes

In my application I use MPI to distribute jobs in a master-slave fashion. A job is given to a slave and then results are collected. In a multi-threaded version of this program, there was a potential deadlock when all processors simultaneously try to Send (blocking), because there was no matching Recv. I came up with a solution which seems to work, but I would like to have a guarantee (other than by testing it another ten thousand times).

The safety of my program is certain if this little piece of code is guaranteed to work on a conforming implementation (obviously it works for two processes only and is not intended for more):

#include <cassert>
#include "mpi.h"

int main()
{
   MPI::Init();
   int ns[] = {-1, -1};
   int rank = MPI::COMM_WORLD.Get_rank();
   ns[rank] = rank;
   MPI::Request request = MPI::COMM_WORLD.Isend(&rank, sizeof(int), MPI::BYTE, 1 - rank, 0);
   MPI::COMM_WORLD.Recv(&ns[1 - rank], sizeof(int), MPI::BYTE, 1 - rank, 0);
   request.Wait();
   assert( ns[0] == 0 );
   assert( ns[1] == 1 );
   MPI::Finalize();
}

So my question is: is interleaving an Isend with a Recv, and only then calling Wait on the Request returned by the Isend, well-defined and safe in MPI?

(Disclaimer: This piece of code is not designed to be exception-safe or particularly beautiful. It's for demo purposes only)


2 Answers

1 vote

Your code is perfectly safe. This is guaranteed by the semantics of the non-blocking operations as defined in the MPI standard, §3.7.4 "Semantics of Nonblocking Communications":

Progress. A call to MPI_WAIT that completes a receive will eventually terminate and return if a matching send has been started, unless the send is satisfied by another receive. In particular, if the matching send is nonblocking, then the receive should complete even if no call is executed by the sender to complete the send. Similarly, a call to MPI_WAIT that completes a send will eventually return if a matching receive has been started, unless the receive is satisfied by another send, and even if no call is executed to complete the receive.

A blocking operation in that context is equivalent to initiating a non-blocking one, immediately followed by a wait.
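To make that equivalence concrete, here is a minimal sketch using the C API (the helper function names are mine, purely illustrative): as far as the progress guarantee above is concerned, the two functions behave the same.

#include <mpi.h>

/* A plain blocking receive... */
void blocking_recv(int *buf, int src, int tag)
{
    MPI_Recv(buf, 1, MPI_INT, src, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}

/* ...behaves like a non-blocking receive immediately followed by a wait. */
void nonblocking_recv_then_wait(int *buf, int src, int tag)
{
    MPI_Request req;
    MPI_Irecv(buf, 1, MPI_INT, src, tag, MPI_COMM_WORLD, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}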

If the words of the standard are not reassuring enough, then this code section from the implementation of MPI_SENDRECV in Open MPI might help:

if (source != MPI_PROC_NULL) { /* post recv */
    rc = MCA_PML_CALL(irecv(recvbuf, recvcount, recvtype,
                            source, recvtag, comm, &req));
    OMPI_ERRHANDLER_CHECK(rc, comm, rc, FUNC_NAME);
}

if (dest != MPI_PROC_NULL) { /* send */
    rc = MCA_PML_CALL(send(sendbuf, sendcount, sendtype, dest,
                           sendtag, MCA_PML_BASE_SEND_STANDARD, comm));
    OMPI_ERRHANDLER_CHECK(rc, comm, rc, FUNC_NAME);
}

if (source != MPI_PROC_NULL) { /* wait for recv */
    rc = ompi_request_wait(&req, status);
} else {
    if (MPI_STATUS_IGNORE != status) {
        *status = ompi_request_empty.req_status;
    }
    rc = MPI_SUCCESS;
}

It doesn't matter if you use Irecv / Send / Wait(receive) or Isend / Recv / Wait(send) - both are equally safe when it comes to possible deadlocks. Of course deadlocks could (and will) occur if the interleaved operation is not properly matched.

The only thing that brings your code to non-conformance is the fact that it uses the C++ MPI bindings. Those were deprecated in MPI-2.2 and deleted in MPI-3.0. You should use the C API instead.
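If you want to keep the same structure, a rough C translation of your exchange could look like the sketch below (my rewrite, so treat it as illustrative rather than authoritative; run it with exactly two ranks):

#include <assert.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int ns[2] = {-1, -1};
    ns[rank] = rank;

    /* Post the non-blocking send, do the blocking receive, then complete the send. */
    MPI_Request request;
    MPI_Isend(&rank, 1, MPI_INT, 1 - rank, 0, MPI_COMM_WORLD, &request);
    MPI_Recv(&ns[1 - rank], 1, MPI_INT, 1 - rank, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    assert(ns[0] == 0);
    assert(ns[1] == 1);

    MPI_Finalize();
    return 0;
}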

-1 votes

This code is not guaranteed to work. MPI implementations are free to not do anything related to your Isend until you call the corresponding Wait function.

The better option is to make both your send and receive calls non-blocking: use Isend and Irecv, and instead of Wait, call Waitall. The Waitall function takes an array of MPI requests and waits on all of them simultaneously (while making progress on each).

So your program would look like this:

#include <cassert>
#include "mpi.h"

int main()
{
   MPI::Init();
   int ns[] = {-1, -1};
   int rank = MPI::COMM_WORLD.Get_rank();
   MPI::Request requests[2];
   ns[rank] = rank;
   requests[0] = MPI::COMM_WORLD.Isend(&rank, sizeof(int), MPI::BYTE, 1 - rank, 0);
   requests[1] = MPI::COMM_WORLD.Irecv(&ns[1 - rank], sizeof(int), MPI::BYTE, 1 - rank, 0);
   MPI::Request::Waitall(2, requests);
   assert( ns[0] == 0 );
   assert( ns[1] == 1 );
   MPI::Finalize();
}