
It was my understanding that MPI communicators restrict the scope of communication, such that messages sent from one communicator should never be received in a different one.

However, the program inlined below appears to contradict this.

I understand that the MPI_Send call returns before a matching receive is posted because of internal buffering (unlike MPI_Ssend). I also understand that MPI_Comm_free doesn't destroy the communicator right away, but merely marks it for deallocation and waits for any pending operations to finish. I suppose my unmatched send will remain pending forever, but then why is the same object (integer value) reused for the second communicator?

Is this normal behaviour, a bug in the MPI library implementation, or is it that my program is just incorrect?

Any suggestions are much appreciated!

LATER EDIT: posted follow-up question


#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char* argv[]) {
    int  rank, size;
    MPI_Group group;
    MPI_Comm my_comm;

    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_group(MPI_COMM_WORLD, &group);

    MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
    if (rank == 0) printf("created communicator %d\n", my_comm);

    if (rank == 1) {
        int msg = 123;
        MPI_Send(&msg, 1, MPI_INT, 0, 0, my_comm);
        printf("rank 1: message sent\n");
    }

    sleep(1);
    if (rank == 0) printf("freeing communicator %d\n", my_comm);
    MPI_Comm_free(&my_comm);

    sleep(2);

    MPI_Comm_create(MPI_COMM_WORLD, group, &my_comm);
    if (rank == 0) printf("created communicator %d\n", my_comm);

    if (rank == 0) {
        int msg;
        MPI_Recv(&msg, 1, MPI_INT, 1, 0, my_comm, MPI_STATUS_IGNORE);
        printf("rank 0: message received\n");
    }

    sleep(1);
    if (rank == 0) printf("freeing communicator %d\n", my_comm);
    MPI_Comm_free(&my_comm);

    MPI_Finalize();
    return 0;
}

outputs:

created communicator -2080374784
rank 1: message sent
freeing communicator -2080374784
created communicator -2080374784
rank 0: message received
freeing communicator -2080374784
Interesting - I note this behaves differently in OpenMPI. I'm not sure if this is a bug per se, or the result of undefined behaviour (e.g., it's not clear to me whether it's valid to have freed the communicator with that Send still obviously in flight). The reuse of the integer representation for the communicator doesn't really mean much one way or another; that's just a handle to an opaque object. Reuse of that is analogous to (I expect, very analogous to) reusing a table entry that's recently been freed. - Jonathan Dursi
Unsurprisingly, under IntelMPI and mvapich2, this program correctly hangs at rank 1's send when using Ssend or messages larger than the eager limit. Presumably MPICH2, defensibly, doesn't view the send on my_comm as "pending" when it's been sent eagerly, so it proceeds and frees my_comm (OMPI hangs at comm_free). But I don't know if the unexpected result is (a) undefined behaviour due to invalid early freeing of the communicator in user code (e.g., like modifying the send buffer before you know it's been received), (b) incorrect MPICH2 behaviour, or (c) genuine ambiguity in the standard. - Jonathan Dursi
...my mistake; in OpenMPI it hangs on the receive. - Jonathan Dursi
This is likely a bug in MPICH (on which Intel MPI is based). Can you report this example to [email protected]? - kraffenetti
@kraffenetti - Thanks! I didn't know Intel MPI was based on MPICH; I'll try reporting it there. - i.adri

1 Answer


The number you're seeing is simply a handle for the communicator. It's safe for the handle to be reused since you've freed it.

As to why you're able to send the message, look at how you're creating the communicator. MPI_Comm_group gives you a group containing the ranks of the specified communicator; since you pass MPI_COMM_WORLD, you get all of the ranks. You then pass that same group to MPI_Comm_create, so your new communicator contains every rank in MPI_COMM_WORLD.

If you want your communicator to contain only a subset of ranks, you'll need to use a different function (or several) to build the desired group(s). I'd recommend reading through Chapter 6 of the MPI Standard, which covers all of the functions you'll need. Pick what you need to build the communicator you want.