1
votes

This is a follow-up to this previous question of mine, for which the conclusion was that the program was erroneous, and therefore the expected behavior was undefined.

What I'm trying to create here is a simple error-handling mechanism: I use the MPI_Irecv request for the empty message as an "abort handle" and attach it to my normal MPI_Wait call (turning it into MPI_Waitany). This lets me unblock process 1 if an error occurs on process 0 and it can no longer reach the point where it is supposed to post the matching MPI_Recv.

What's happening is that, due to internal message buffering, the MPI_Isend may succeed right away, without the other process being able to post the matching MPI_Recv. So there's no way of canceling it anymore.

I was hoping that once all processes call MPI_Comm_free I can just forget about that message once and for all, but, as it turns out, that's not the case. Instead, it's being delivered to the MPI_Recv in the following communicator.

So my questions are:

  1. Is this also an erroneous program, or is it a bug in the MPI implementation (Intel MPI 4.0.3)?
  2. If I turn my MPI_Isend calls into MPI_Issend, the program works as expected - can I at least in that case rest assured that the program is correct?
  3. Am I reinventing the wheel here? Is there a simpler way to achieve this?

Again, any feedback is much appreciated!


#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char* argv[]) {
    int rank, size;
    MPI_Group group;
    MPI_Comm my_comm;

    srand(time(NULL));
    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_group(MPI_COMM_WORLD, &group);

    MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
    if (rank == 0) printf("created communicator %d\n", my_comm);

    if (rank == 1) {
        MPI_Request req[2];
        int msg = 123, which;

        MPI_Isend(&msg, 1, MPI_INT, 0, 0, my_comm, &req[0]);
        MPI_Irecv(NULL, 0, MPI_INT, 0, 0, my_comm, &req[1]);

        MPI_Waitany(2, req, &which, MPI_STATUS_IGNORE);

        MPI_Barrier(my_comm);

        if (which == 0) {
            printf("rank 1: send succeed; cancelling abort handle\n");
            MPI_Cancel(&req[1]);
            MPI_Wait(&req[1], MPI_STATUS_IGNORE);
        } else {
            printf("rank 1: send aborted; cancelling send request\n");
            MPI_Cancel(&req[0]);
            MPI_Wait(&req[0], MPI_STATUS_IGNORE);
        }
    } else {
        MPI_Request req;
        int msg, r = rand() % 2;
        if (r) {
            printf("rank 0: receiving message\n");
            MPI_Recv(&msg, 1, MPI_INT, 1, 0, my_comm, MPI_STATUS_IGNORE);
        } else {
            printf("rank 0: sending abort message\n");
            MPI_Isend(NULL, 0, MPI_INT, 1, 0, my_comm, &req);
        }

        MPI_Barrier(my_comm);

        if (!r) {
            MPI_Cancel(&req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
        }
    }

    if (rank == 0) printf("freeing communicator %d\n", my_comm);
    MPI_Comm_free(&my_comm);

    sleep(2);

    MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
    if (rank == 0) printf("created communicator %d\n", my_comm);

    if (rank == 0) {
        MPI_Request req;
        MPI_Status status;
        int msg, cancelled;

        MPI_Irecv(&msg, 1, MPI_INT, 1, 0, my_comm, &req);
        sleep(1);

        MPI_Cancel(&req);
        MPI_Wait(&req, &status);
        MPI_Test_cancelled(&status, &cancelled);

        if (cancelled) {
            printf("rank 0: receive cancelled\n");
        } else {
            printf("rank 0: STRAY MESSAGE RECEIVED!!!\n");
        }
    }

    if (rank == 0) printf("freeing communicator %d\n", my_comm);
    MPI_Comm_free(&my_comm);

    MPI_Finalize();
    return 0;
}

outputs:

created communicator -2080374784
rank 0: sending abort message
rank 1: send succeed; cancelling abort handle
freeing communicator -2080374784
created communicator -2080374784
rank 0: STRAY MESSAGE RECEIVED!!!
freeing communicator -2080374784
2
The MPI standard says that when an MPI_Isend is canceled, it must either complete normally or not be received at all on the receiving side. Since your message is apparently small enough to be delivered immediately, the MPI_Cancel is ignored by the implementation. That is why you are seeing it have no effect. - kraffenetti
I assume you're referring to the last MPI_Cancel call. It would have been totally fine to receive that message if it was in the same communicator, but, as you can see, that's a different one. - i.adri
Your program is still erroneous if it has unmatched sends/recvs. - kraffenetti

2 Answers

2
votes

As mentioned in one of the comments above by @kraffenetti, this is an erroneous program because the sent messages are not matched by receives. Even though the sends are cancelled, they still need a matching receive on the remote side: a cancel may fail if the message was already sent before the cancel could complete, which is the case here.

This question started a discussion on an MPICH ticket, which you can find here and which has more details.

0
votes

I tried to build your code using Open MPI and it did not work: mpicc complained about status.cancelled.

  error: ‘MPI_Status’ has no member named ‘cancelled’

I suppose this is a feature of Intel MPI. What happens if you switch to:

    ...
    int flag;
    MPI_Test_cancelled(&status, &flag);
    if (flag) {
    ...

This gives the expected output using Open MPI (and it makes your code less implementation-dependent). Is that also the case with Intel MPI?

We need an expert to tell us what status.cancelled is in Intel MPI, because I don't know anything about it!

Edit: I tested my answer many times and found that the output was random, sometimes correct, sometimes not. Sorry about that. It is as if something in status was not set. Part of the answer may be in the documentation for MPI_Wait(), http://www.mpich.org/static/docs/v3.1/www3/MPI_Wait.html:

"The MPI_ERROR field of the status return is only set if the return from the MPI routine is MPI_ERR_IN_STATUS. That error class is only returned by the routines that take an array of status arguments (MPI_Testall, MPI_Testsome, MPI_Waitall, and MPI_Waitsome). In all other cases, the value of the MPI_ERROR field in the status is unchanged. See section 3.2.5 in the MPI-1.1 specification for the exact text."

If MPI_Test_cancelled() makes use of the MPI_ERROR field, things might go wrong.

So here is the trick: use MPI_Waitall(1, &req, &status)! The output is correct at last!