
I am implementing MPI non-blocking communication inside my program. The MPI_Isend man page says:

A nonblocking send call indicates that the system may start copying data out of the send buffer. The sender should not modify any part of the send buffer after a nonblocking send operation is called, until the send completes.

My code works like this:

// send messages
if(s > 0){

    MPI_Request s_requests[s];
    MPI_Status   s_status[s];

    for(int i = 0; i < s; ++i){

        // some code to form the message to send
        std::vector<double> send_info;

        // non-blocking send
        MPI_Isend(&send_info[0], ..., &s_requests[i]);
    }

    MPI_Waitall(s, s_requests, s_status);
}

// recv info
if(n > 0){    // s and n will match

    for(int i = 0; i < n; ++i){

        MPI_Status status;

        // allocate the space to recv info
        std::vector<double> recv_info;

        MPI_Recv(&recv_info[0], ..., &status);
    }

}


My question is: am I modifying the send buffers, given that they live inside the inner curly braces (each send_info vector is destroyed when its loop iteration ends)? Does that make this an unsafe communication pattern? My program works fine at the moment, but I am still suspicious. Thank you for your reply.

It means don't modify it from another thread while you're inside the send method. You aren't doing that. - user207421
Your buffer is/might be allocated on the stack, and hence can be overwritten before it is sent. That looks like incorrect usage of MPI_Isend() to me. - Gilles Gouaillardet
Yeah, I use std::vector, so it is allocated on the stack. So is the right way to put MPI_Wait() and MPI_Isend() in the same loop? - Shiqi
Yes, that is one option, but it is likely equivalent to a blocking MPI_Send(). Other options include allocating a giant buffer before the for loop and calling a single MPI_Waitall() after it. Another common technique is to use 2 buffers: isend(buffer0); isend(buffer1); wait(req0); isend(buffer0); wait(req1); ... so you still have some room for overlap between computation and communication while keeping your memory usage reasonable (see the sketch after these comments). - Gilles Gouaillardet
Note you are unlikely to see any errors with short messages (sent in eager mode) but more likely to send incorrect data with large messages. - Gilles Gouaillardet
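
A minimal sketch of the two-buffer rotation Gilles describes, assuming a fixed message length; the destination rank, tag, msg_len and the "fill the buffer" step are placeholders I added for illustration, not part of the original post:

#include <mpi.h>
#include <vector>

// Two-buffer rotation: before reusing a buffer, wait for the previous
// MPI_Isend that used it, so a buffer is never modified while in flight.
void send_all(int s, int msg_len, MPI_Comm comm)
{
    std::vector<double> buf[2] = { std::vector<double>(msg_len),
                                   std::vector<double>(msg_len) };
    MPI_Request req[2] = { MPI_REQUEST_NULL, MPI_REQUEST_NULL };

    for (int i = 0; i < s; ++i) {
        int b = i % 2;                    // alternate between the two buffers

        // waiting on MPI_REQUEST_NULL is a no-op, so the first two
        // iterations start immediately
        MPI_Wait(&req[b], MPI_STATUS_IGNORE);

        // fill buf[b] with the i-th message here (placeholder step)

        MPI_Isend(buf[b].data(), msg_len, MPI_DOUBLE,
                  /* dest = */ i, /* tag = */ 0, comm, &req[b]);
    }

    // make sure the last two sends complete before the buffers go out of scope
    MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
}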

1 Answer


There are two points I want to emphasize in this example.

The first one is the problem I asked about: the send buffer is destroyed (and its memory possibly reused) before MPI_Waitall completes, for the reason Gilles gave. The solution could be to allocate one big buffer before the for loop and call MPI_Waitall after the loop finishes, or to put MPI_Wait inside the loop. However, the latter is roughly equivalent to using MPI_Send in terms of performance.
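
For reference, a sketch of the "MPI_Wait inside the loop" variant; the count, destination rank and tag below stand in for the arguments elided by ... in my code:

// Correct but effectively serial: each iteration waits for its own send,
// so this behaves like a blocking MPI_Send.
for(int i = 0; i < s; ++i){

    // some code to form the message to send
    std::vector<double> send_info;

    MPI_Request req;
    MPI_Isend(&send_info[0], (int)send_info.size(), MPI_DOUBLE,
              /* dest = */ i, /* tag = */ 0, MPI_COMM_WORLD, &req);

    // after this wait it is safe to let send_info be destroyed
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}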

However, I found that if you simply switch to blocking send and receive, a communication scheme like this can deadlock. It is similar to the classic deadlock:

if (rank == 0) {
    MPI_Send(..., 1, tag, MPI_COMM_WORLD);
    MPI_Recv(..., 1, tag, MPI_COMM_WORLD, &status);
} else if (rank == 1) {
    MPI_Send(..., 0, tag, MPI_COMM_WORLD);
    MPI_Recv(..., 0, tag, MPI_COMM_WORLD, &status);
}

An explanation can be found here.

My program could run into a similar situation: if all the processes call MPI_Send first, it is a deadlock.
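
For comparison, the textbook way to break that particular cycle is to let MPI pair the send and the receive, e.g. with MPI_Sendrecv; the buffer size N and the tag below are placeholder values, and this is not what I ended up doing here:

// Each rank sends to and receives from its peer in one call, so neither
// rank blocks waiting for the other to post its receive first.
const int N = 8, tag = 0;
double sendbuf[N], recvbuf[N];   // fill sendbuf before sending
MPI_Status status;
int rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
int peer = (rank == 0) ? 1 : 0;
MPI_Sendrecv(sendbuf, N, MPI_DOUBLE, peer, tag,
             recvbuf, N, MPI_DOUBLE, peer, tag,
             MPI_COMM_WORLD, &status);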

So my solution is to use a large buffer and stick to the non-blocking communication scheme.

#include <mpi.h>
#include <vector>
#include <unordered_map>

// send messages
if(s > 0){

    MPI_Request s_requests[s];
    MPI_Status   s_status[s];

    std::unordered_map<int, std::vector<double>> send_info;

    for(int i = 0; i < s; ++i){


        // some code to form the message to send
        send_info[i] = std::vector<double> ();

        // non-blocking send
        MPI_Isend(&send_info[i][0], ..., &s_requests[i]);
    }

    MPI_Waitall(s, s_requests, s_status);
}

// recv info
if(n > 0){    // s and n will match

    for(int i = 0; i < n; ++i){

        MPI_Status status;

        // allocate the space to recv info
        std::vector<double> recv_info;

        MPI_Recv(&recv_info[0], ..., &status);
    }

}
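
The point of the std::unordered_map here is simply to keep every send_info vector alive until MPI_Waitall returns, so the buffers handed to MPI_Isend are neither destroyed nor modified while the sends are still in flight.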