3 votes

The MPI-3 standard introduces shared-memory windows, which can be read and written by all processes sharing the memory without calls to the MPI library. While there are examples of one-sided communication using shared or non-shared memory, I did not find much information about how to use shared memory correctly with direct access.

I ended up doing something like this, which works well, but I was wondering whether the MPI standard guarantees that it will always work.

// initialization:
int i_mpi;
MPI_Comm_rank(MPI_COMM_WORLD, &i_mpi);  // my rank, used as the key for the split
MPI_Comm comm_shared;
MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, i_mpi, MPI_INFO_NULL, &comm_shared);

// allocation
const int N_WIN = 10;           // I need several buffers.
const int mem_size = 1000*1000; // window size in bytes
double* mem[N_WIN];
MPI_Win win[N_WIN];
for (int i = 0; i < N_WIN; i++) {
    MPI_Win_allocate_shared(mem_size, sizeof(double), MPI_INFO_NULL, comm_shared, &mem[i], &win[i]);
    MPI_Win_lock_all(0, win[i]);  // open a passive-target access epoch on each window
}

while(1) {
    MPI_Barrier(comm_shared);
    ... // write anywhere on shared memory
    MPI_Barrier(comm_shared);
    ... // read on shared memory written by other processes
}

// deallocation
for (int i=0; i<N_WIN; i++) {
    MPI_Win_unlock_all(win[i]);
    MPI_Win_free(&win[i]);
}

Here, I ensure synchronization using MPI_Barrier() and assume the hardware keeps the memory view consistent. Furthermore, because I have several shared windows, a single call to MPI_Barrier() seems more efficient than calling MPI_Win_fence() on every shared-memory window.

It seems to work well on my x86 laptops and servers. But is this a valid/correct MPI program? Is there a more efficient method of achieving the same thing?

2 Answers

3 votes

There are two key issues here:

  1. MPI_Barrier is absolutely not a memory barrier and should never be used as one. It may synchronize memory as a side effect of its implementation in most cases, but users can never assume that. MPI_Barrier is only guaranteed to synchronize process execution. (If it helps, you can imagine a system where MPI_Barrier is implemented using a hardware widget that does no more than the MPI standard requires. IBM Blue Gene sort of did this in some cases.)
  2. This question is unanswerable without details on what you are actually doing with shared-memory here:
while(1) {
    MPI_Barrier(comm_shared);
    ... // write anywhere on shared memory
    MPI_Barrier(comm_shared);
    ... // read on shared memory written by other processes
}

It may not be written clearly, but the authors of the relevant text of the MPI-3 standard (I was part of this group) assumed that one could reason about shared memory using the memory model of the underlying/host language. Thus, if you are writing this code in C11, you can reason about it according to the C11 memory model.

If you want to use MPI to synchronize shared memory, then you should use MPI_Win_sync on all the windows for load-store accesses and MPI_Win_flush for RMA operations (Put/Get/Accumulate/Get_accumulate/Fetch_and_op/Compare_and_swap).
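For the RMA case, a minimal sketch under the passive-target epoch you already opened with MPI_Win_lock_all (the target rank and the value are made-up example values):

int target = 1;                  // hypothetical neighbor rank in comm_shared
double value = 42.0;
MPI_Put(&value, 1, MPI_DOUBLE, target, /*target_disp=*/0, 1, MPI_DOUBLE, win[0]);
MPI_Win_flush(target, win[0]);   // completes the Put at both origin and target
MPI_Barrier(comm_shared);        // after this, the target may safely load the value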

I expect MPI_Win_sync to be implemented as a CPU memory barrier, so it is redundant to call it for every window. This is why it may be more efficient to assume the C11 or C++11 memory model and use https://en.cppreference.com/w/c/atomic/atomic_thread_fence or https://en.cppreference.com/w/cpp/atomic/atomic_thread_fence, respectively.
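To illustrate, here is a sketch of the C11 variant of your loop. This assumes your accesses are plain loads and stores to the shared allocation, and that the fences compile to the CPU barriers you need; strictly speaking, C11 fences synchronize only together with atomic operations, but MPI_Barrier supplies the cross-process synchronization here.

#include <stdatomic.h>

while (1) {
    MPI_Barrier(comm_shared);                   // process synchronization only
    ... // write anywhere on shared memory
    atomic_thread_fence(memory_order_release);  // publish this process's stores
    MPI_Barrier(comm_shared);
    atomic_thread_fence(memory_order_acquire);  // observe the other processes' stores
    ... // read on shared memory written by other processes
}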

1 vote

I would be tempted to say this MPI program is not valid.

Here is what I base my opinion on:

  • In the description of MPI_Win_allocate_shared:

    The consistency of load/store accesses from/to the shared memory as observed by the user program depends on the architecture. A consistent view can be created in the unified memory model (see Section 11.4) by utilizing the window synchronization functions (see Section 11.5) or explicitly completing outstanding store accesses (e.g., by calling MPI_WIN_FLUSH). MPI does not define semantics for accessing shared memory windows in the separate memory model.

  • Section 11.4, about the memory models, which states:

    In the RMA unified model, public and private copies are identical and updates via put or accumulate calls are eventually observed by load operations without additional RMA calls. A store access to a window is eventually visible to remote get or accumulate calls without additional RMA calls. These stronger semantics of the RMA unified model allow the user to omit some synchronization calls and potentially improve performance.

  • The advice to users that follows only indicates:

    If accesses in the RMA unified model are not synchronized (with locks or flushes, see Section 11.5.3), load and store operations might observe changes to the memory while they are in progress.

  • Section 11.7, on semantics and correctness, says:

    MPI_BARRIER provides process synchronization, but not memory synchronization.

  • The examples in Section 11.8 explain well how to use the flush and sync operations.

The only synchronization ever addressed is one-sided, i.e., in your case, MPI_Win_flush{,_all} or MPI_Win_unlock{,_all} (apart from the mutual exclusion between active and passive concurrent synchronization, which has to be enforced by the user, and the MPI_MODE_NOCHECK assert flag).

So either you access the memory directly with store operations, in which case you need to call MPI_Win_sync() on each of your windows before calling MPI_Barrier (as explained in Example 11.10 and sketched below) to ensure synchronization; or you perform RMA accesses, in which case you have to call at least MPI_Win_flush_all before the second barrier to ensure the operations have been propagated. If you then read with load operations, you may have to synchronize again after the second barrier before doing so.
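Concretely, for the load/store case, the loop from the question could become something like this (a sketch following the Example 11.10 pattern, reusing the variable names from the question):

while (1) {
    MPI_Barrier(comm_shared);
    ... // write anywhere on shared memory
    for (int i = 0; i < N_WIN; i++)
        MPI_Win_sync(win[i]);    // complete the local stores in each window
    MPI_Barrier(comm_shared);    // no process reads before all have written
    for (int i = 0; i < N_WIN; i++)
        MPI_Win_sync(win[i]);    // make the other processes' stores visible
    ... // read on shared memory written by other processes
}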

Another solution would be to unlock and re-lock the windows between the barriers. Alternatively, compiler- and hardware-specific constructs could ensure the load occurs after the data is updated.
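A rough sketch of the unlock/re-lock variant, using the window array from the question (an illustration, not a definitive recipe):

... // write anywhere on shared memory
for (int i = 0; i < N_WIN; i++)
    MPI_Win_unlock_all(win[i]);    // closing the epoch completes and synchronizes the accesses
MPI_Barrier(comm_shared);          // no process proceeds before all have unlocked
for (int i = 0; i < N_WIN; i++)
    MPI_Win_lock_all(0, win[i]);   // open a new access epoch for the next iteration
... // read on shared memory written by other processes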