
To keep things simple and concentrate on the core of my problem, let's assume that a memory location, addressed locally by a pointer variable ptr, is shared among several processes. Specifically, I use MPI shared memory windows in C/C++ to allocate and share the memory. To be concrete, let's say ptr references a floating point variable, so locally we have

float* ptr;

Now assume that all processes attempt to write the same value const float f to ptr, i.e.

*ptr = f;

My question is: Does this operation require synchronization, or can it be executed concurrently, given that all processes attempt to modify the bytes in the same way, i.e. given that f has the same value for every process? It therefore boils down to this: for concurrent write operations to e.g. floating point variables, is it possible that the race condition results in an inconsistent byte pattern, even though every process attempts to modify the memory in the same way? In other words, if I know for sure that every process writes the same data, can I omit synchronization?


2 Answers


Yes, you must synchronize access to the shared memory. The fact that the modifying threads reside in different processes makes no difference: it is still a data race (unsynchronized writes to shared memory from different threads).

Note that synchronization objects also solve other problems, such as visibility and memory reordering, and for those it is irrelevant what is written to the shared memory.

Currently, the C++ standard does not define the notion of a process (only threads), and it does not provide any easy means of synchronizing between processes.

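You can allocate a std::mutex in shared memory and use that as your synchronization primitive (whether this works across processes is implementation-specific, since the standard says nothing about processes), or rely on Win32 inter-process synchronization primitives such as a mutex, semaphore, or event.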

Alternatively, if you only want to synchronize a primitive value, you can allocate a std::atomic<T> in shared memory and use that as your synchronized primitive.
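As a minimal sketch of the std::atomic<T> suggestion: the function below constructs a std::atomic<float> inside raw storage with placement new and lets several writers store the same value. The name write_same_value_concurrently is hypothetical, the local buffer only stands in for the real shared-memory region, and threads stand in for processes. For actual cross-process use the atomic should also be lock-free, which can be checked with is_lock_free().

```cpp
#include <atomic>
#include <cassert>
#include <new>
#include <thread>
#include <vector>

// Hypothetical helper: several writers store the same value into one
// std::atomic<float> that lives in raw storage.
float write_same_value_concurrently(float f, int writers) {
    assert(writers >= 0);

    // Raw, suitably aligned storage. In the real program this would be the
    // address handed out by the shared-memory allocation (assumption).
    alignas(std::atomic<float>) unsigned char storage[sizeof(std::atomic<float>)];

    // Exactly one party constructs the atomic object before anyone uses it.
    auto* shared = new (storage) std::atomic<float>(0.0f);

    std::vector<std::thread> pool;
    for (int i = 0; i < writers; ++i) {
        // Each writer stores the same value; atomic stores do not race.
        pool.emplace_back([shared, f] { shared->store(f); });
    }
    for (auto& t : pool) {
        t.join();
    }

    return shared->load();
}
```

With at least one writer, the function returns the value that was written, e.g. write_same_value_concurrently(1.5f, 4) yields 1.5f.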


In C++, if multiple processes write to the same memory location without proper use of synchronization primitives or atomic operations, undefined behavior occurs. (That is, it might work, it might not work, the computer might catch on fire.)

In practice, on your computer, it's basically certain to work the way you think it should work. It actually is plausible that on some architectures things don't go the way you expect, though: If the CPU cannot read/write a block of memory as small as your shared value, or if the storage of the shared value crosses an alignment boundary, such a write can actually involve a read as well, and that read-modify-write can have the effect of reverting or corrupting other changes to memory.
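The alignment concern can be checked cheaply at run time. The helper below is only an illustration (is_naturally_aligned is a made-up name, not a standard function), and it assumes that std::uintptr_t is available, which is the case on mainstream platforms. Passing the check does not make a plain write race-free; it only rules out the straddling case described above.

```cpp
#include <cassert>   // only needed for the usage shown below
#include <cstdint>

// Hypothetical helper: true if p is aligned to at least alignof(T), i.e. an
// object of type T stored at p does not straddle its natural alignment boundary.
template <typename T>
bool is_naturally_aligned(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A typical use would be assert(is_naturally_aligned<float>(ptr)); once, right after obtaining ptr from the shared-memory allocation.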

The easiest way to get what you want is simply to do the write as a "relaxed" atomic operation:

// requires ptr to be declared as std::atomic<float>* rather than float*
std::atomic_store_explicit(ptr, f, std::memory_order_relaxed);

That ensures that the write is "atomic" in the sense of not causing a data race, and won't incur any overhead except on architectures where there would be potential problems with *ptr = f.
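For completeness, here is a small compilable sketch of the relaxed store together with a matching relaxed load. The wrapper names store_relaxed and load_relaxed are made up for illustration, and the sketch assumes the shared location holds a std::atomic<float> constructed in the shared region, since std::atomic_store_explicit does not accept a plain float*.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper: relaxed atomic store. ptr must point to a
// std::atomic<float>, not to a plain float.
void store_relaxed(std::atomic<float>* ptr, float f) {
    assert(ptr != nullptr);
    std::atomic_store_explicit(ptr, f, std::memory_order_relaxed);
}

// Hypothetical wrapper: relaxed atomic load of the same location.
float load_relaxed(const std::atomic<float>* ptr) {
    assert(ptr != nullptr);
    return std::atomic_load_explicit(ptr, std::memory_order_relaxed);
}
```

Relaxed ordering only guarantees that the access itself is atomic. It does not order the write relative to other memory operations, so readers that depend on surrounding data would need a stronger ordering or separate synchronization.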