1
votes

I recently learn about c++ six memory orders, I felt very confusing about memory_order_acquire and memory_order_release, here is an example from cpp:

#include <thread>
#include <atomic>
#include <cassert>
 
std::atomic<bool> x = {false};
std::atomic<bool> y = {false};
std::atomic<int> z = {0};
 
void write_x() { x.store(true, std::memory_order_seq_cst); }
void write_y() { y.store(true, std::memory_order_seq_cst); }
 
void read_x_then_y() {
     while (!x.load(std::memory_order_seq_cst));

     if (y.load(std::memory_order_seq_cst)) 
         ++z;
}
 
void read_y_then_x() {
     while (!y.load(std::memory_order_seq_cst));

     if (x.load(std::memory_order_seq_cst))
        ++z;
}
 
int main() {
    std::thread a(write_x);
    std::thread b(write_y);
    std::thread c(read_x_then_y);
    std::thread d(read_y_then_x);

    a.join(); b.join(); c.join(); d.join();

    assert(z.load() != 0);  // will never happen
}

In the cpp reference page, it says:

This example demonstrates a situation where sequential ordering is necessary.

Any other ordering may trigger the assert because it would be possible for the threads c and d to observe changes to the atomics x and y in opposite order.

So my question is why memory_order_acquire and memory_order_release can not be used here? And what semantics does memory_order_acquire and memory_order_release provide?

some references: https://en.cppreference.com/w/cpp/atomic/memory_order https://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync

2
Note that StoreLoad reordering without seq_cst is an easier case to demonstrate than IRIW; even x86 with its strongly-ordered memory model can demonstrate StoreLoad in real life (preshing.com/20120515/memory-reordering-caught-in-the-act). In practice it's still rarely necessary.Peter Cordes

2 Answers

1
votes

Sequential consistency provides a single total order of all sequentially consistent operations. So if you have a sequentially consistent store in thread A, and a sequentially consistent load in thread B, and the store is ordered before the load (in said single total order), then B observes the value stored by A. So basically sequential consistency guarantees that the store is "immediately visible" to other threads. A release store does not provide this guarantee.

As Peter Cordes pointed out correctly, the term "immediately visible" is rather imprecise. The "visibility" stems from the fact that all seq-cst operations are totally ordered, and all threads observe that order. Since the store and the load are totally ordered, the value of a store becomes visible before a subsequent load (in the single total order) is executed.

There exists no such total order between acquire/release operations in different threads, so there is not visibility guarantee. The operations are only ordered once an acquire-operations observes the value from a release-operation, but there is no guarantee when the value of the release-operation becomes visible to the thread performing the acquire-operation.

Let's consider what would happen if we were to use acquire/release in this example:

void write_x() { x.store(true, std::memory_order_release); }
void write_y() { y.store(true, std::memory_order_release); }
 
void read_x_then_y() {
     while (!x.load(std::memory_order_acquire));

     if (y.load(std::memory_order_acquire)) 
         ++z;
}
 
void read_y_then_x() {
     while (!y.load(std::memory_order_acquire));

     if (x.load(std::memory_order_acquire))
        ++z;
}
 
int main() {
    std::thread a(write_x);
    std::thread b(write_y);
    std::thread c(read_x_then_y);
    std::thread d(read_y_then_x);

    a.join(); b.join(); c.join(); d.join();

    assert(z.load() != 0);  // can actually happen!!
}

Since we have no guarantee about visibility, it could happen that thread c observes x == true and y == false, while at the same time thread d could observe y == true and x == false. So neither thread would increment z and the assertion would fire.

For more details about the C++ memory model I can recommend this paper which I have co-authored: Memory Models for C/C++ Programmers

0
votes

You can use aquire/release when passing information from one thread to another - this is the most common situation. No need for sequential requirements on this one.

In this example there are a bunch of threads. Two threads make write operation while third roughly tests whether x was ready before y and fourth tests whether y was ready before x. Theoretically one thread may observe that x was modified before y while another sees that y was modified before x. Not entirely sure how likely it is. This is an uncommon usecase.

Edit: you can visualize the example: assume that each threads is run on a different PC and they communicate via a network. Each pair of PCs has a different ping to each other. Here it is easy to make an example where it is unclear which event occurred first x or y as each PC will see the events occur in different order.

I am not sure on sure on which architectures this effect may occur but there are complex ones where two different processors are conjoined. Surely communication between the processors is slower than between cores of each processor.