0 votes

In my application I use MPI to distribute jobs in a master-slave fashion. A job is given to a slave and then results are collected. In a multi-threaded version of this program, there was a potential deadlock when all processors simultaneously try to Send (blocking), because there was no matching Recv. I came up with a solution which seems to work, but I would like to have a guarantee (other than by testing it another ten thousand times).

The safety of my program is certain if this little piece of code is guaranteed to work on a conforming implementation (obviously it works for two processes only and is not intended for more):

#include <cassert>
#include "mpi.h"

int main()
{
   MPI::Init();
   int ns[] = {-1, -1};
   int rank = MPI::COMM_WORLD.Get_rank();
   ns[rank] = rank;
   MPI::Request request = MPI::COMM_WORLD.Isend(&rank, sizeof(int), MPI::BYTE, 1 - rank, 0);
   MPI::COMM_WORLD.Recv(&ns[1 - rank], sizeof(int), MPI::BYTE, 1 - rank, 0);
   request.Wait();
   assert( ns[0] == 0 );
   assert( ns[1] == 1 );
   MPI::Finalize();
}

So my question is: is interleaving an Isend with a Recv, and only then calling Wait on the Request returned by the Isend, well-defined and safe in MPI?

(Disclaimer: This piece of code is not designed to be exception-safe or particularly beautiful. It's for demo purposes only)


2 Answers

1 vote

Your code is perfectly safe. This is guaranteed by the semantics of the non-blocking operations as defined in the MPI standard, §3.7.4 "Semantics of Nonblocking Communications":

Progress. A call to MPI_WAIT that completes a receive will eventually terminate and return if a matching send has been started, unless the send is satisfied by another receive. In particular, if the matching send is nonblocking, then the receive should complete even if no call is executed by the sender to complete the send. Similarly, a call to MPI_WAIT that completes a send will eventually return if a matching receive has been started, unless the receive is satisfied by another send, and even if no call is executed to complete the receive.

A blocking operation in that context is equivalent to initiating a non-blocking one, immediately followed by a wait.
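To make that equivalence concrete, here is a minimal sketch using the C API (the helper function names are mine, purely illustrative): as far as the progress guarantee above is concerned, the two functions behave the same.

#include <mpi.h>

/* A plain blocking receive... */
void blocking_recv(int *buf, int src, int tag)
{
    MPI_Recv(buf, 1, MPI_INT, src, tag, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
}

/* ...behaves like a non-blocking receive immediately followed by a wait. */
void nonblocking_recv_then_wait(int *buf, int src, int tag)
{
    MPI_Request req;
    MPI_Irecv(buf, 1, MPI_INT, src, tag, MPI_COMM_WORLD, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}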

If the words of the standard are not reassuring enough, then this code section from the implementation of MPI_SENDRECV in Open MPI might help:

if (source != MPI_PROC_NULL) { /* post recv */
    rc = MCA_PML_CALL(irecv(recvbuf, recvcount, recvtype,
                            source, recvtag, comm, &req));
    OMPI_ERRHANDLER_CHECK(rc, comm, rc, FUNC_NAME);
}

if (dest != MPI_PROC_NULL) { /* send */
    rc = MCA_PML_CALL(send(sendbuf, sendcount, sendtype, dest,
                           sendtag, MCA_PML_BASE_SEND_STANDARD, comm));
    OMPI_ERRHANDLER_CHECK(rc, comm, rc, FUNC_NAME);
}

if (source != MPI_PROC_NULL) { /* wait for recv */
    rc = ompi_request_wait(&req, status);
} else {
    if (MPI_STATUS_IGNORE != status) {
        *status = ompi_request_empty.req_status;
    }
    rc = MPI_SUCCESS;
}

It doesn't matter if you use Irecv / Send / Wait(receive) or Isend / Recv / Wait(send) - both are equally safe when it comes to possible deadlocks. Of course deadlocks could (and will) occur if the interleaved operation is not properly matched.

The only thing that brings your code to non-conformance is the fact that it uses the C++ MPI bindings. Those were deprecated in MPI-2.2 and deleted in MPI-3.0. You should use the C API instead.
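If you want to keep the same structure, a rough C translation of your exchange could look like the sketch below (my rewrite, so treat it as illustrative rather than authoritative; run it with exactly two ranks):

#include <assert.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int ns[2] = {-1, -1};
    ns[rank] = rank;

    /* Post the non-blocking send, do the blocking receive, then complete the send. */
    MPI_Request request;
    MPI_Isend(&rank, 1, MPI_INT, 1 - rank, 0, MPI_COMM_WORLD, &request);
    MPI_Recv(&ns[1 - rank], 1, MPI_INT, 1 - rank, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Wait(&request, MPI_STATUS_IGNORE);

    assert(ns[0] == 0);
    assert(ns[1] == 1);

    MPI_Finalize();
    return 0;
}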

-1 votes

This code is not guaranteed to work. MPI implementations are free to not do anything related to your Isend until you call the corresponding Wait function.

The better option is to make both your send and receive calls non-blocking: use Isend and Irecv, and instead of Wait, call Waitall. The Waitall function takes an array of MPI requests and waits on all of them simultaneously (while making progress on each).

So your program would look like this:

#include <cassert>
#include "mpi.h"

int main()
{
   MPI::Init();
   int ns[] = {-1, -1};
   int rank = MPI::COMM_WORLD.Get_rank();
   MPI::Request requests[2];
   ns[rank] = rank;
   requests[0] = MPI::COMM_WORLD.Isend(&rank, sizeof(int), MPI::BYTE, 1 - rank, 0);
   requests[1] = MPI::COMM_WORLD.Irecv(&ns[1 - rank], sizeof(int), MPI::BYTE, 1 - rank, 0);
   MPI::Request::Waitall(2, requests);
   assert( ns[0] == 0 );
   assert( ns[1] == 1 );
   MPI::Finalize();
}