Will two atomic writes to different locations in different threads always be seen in the same order by other threads?

Question

Similar to my previous question, consider this code

-- Initially --
std::atomic<int> x{0};
std::atomic<int> y{0};

-- Thread 1 --
x.store(1, std::memory_order_release);

-- Thread 2 --
y.store(2, std::memory_order_release);

-- Thread 3 --
int r1 = x.load(std::memory_order_acquire);   // x first
int r2 = y.load(std::memory_order_acquire);

-- Thread 4 --
int r3 = y.load(std::memory_order_acquire);   // y first
int r4 = x.load(std::memory_order_acquire);

Is the weird outcome r1==1, r2==0 and r3==2, r4==0 possible in this case under the C++11 memory model? What if I were to replace all std::memory_order_acq_rel by std::memory_order_relaxed?

On x86 such an outcome seems to be forbidden, see this SO question but I am asking about the C++11 memory-model in general.

Bonus question:

We all agree, that with std::memory_order_seq_cst the weird outcome would not be allowed in C++11. Now, Herb Sutter said in his famous atomic<>-weapons talk @ 42:30 that std::memory_order_seq_cst is just like std::memory_order_acq_rel but std::memory_order_acquire-loads may not move before std::memory_order_release-writes. I cannot see how this additional constraint in the above example would prevent the weird outcome. Can anyone explain?

Changing all std::memory_order_acq_rel won't make any difference if you don't have any std::memory_order_acq_rel in your code. Did you leave something relevant out of your question? — user743382
@hvd I mean std::memory_order_acq_rel to represent both the std::memory_order_acquire's and the std::memory_order_release's. Maybe I shall change this... — Toby Brull
The outcome is certainly allowed according to the C++ memory model. There's no ordering between threads 1 and 2. You can imagine the memory changes propagating differently fast to different cores. Synchronisation is only about what happens if you read the new value. There's no guarantee that you will read the new value. — Kerrek SB
@TobiasBrüll Surely that depends on what assembly winds up getting generated, which is certainly not guaranteed by any standard. — David Schwartz
I've swapped the read order around in thread 4, since your original question didn't make much sense: both threads were reading the x and y in the same order so they couldn't detect writes occurring in the opposite order: you need to swap the read order to do that. As the accepted answer points out, there is trivially a seq cst order that allows the values you put with the original form of the question. — BeeOnRope

MWid MWid · Accepted Answer · 2015-01-08T18:19:17

The updated¹ code in the question (with loads of x and y swapped in Thread 4) does actually test that all threads agree on a global store order.

Under the C++11 memory model, the outcome r1==1, r2==0, r3==2, r4==0 is allowed and in fact observable on POWER.

On x86 this outcome is not possible, because there "stores are seen in a consistent order by other processors". This outcome is also not allowed in a sequential consistent execution.

Footnote 1: The question originally had both readers read x then y. A sequentially consistent execution of that is:

-- Initially --
std::atomic<int> x{0};
std::atomic<int> y{0};

-- Thread 4 --
int r3 = x.load(std::memory_order_acquire);

-- Thread 1 --
x.store(1, std::memory_order_release);

-- Thread 3 --
int r1 = x.load(std::memory_order_acquire);
int r2 = y.load(std::memory_order_acquire);

-- Thread 2 --
y.store(2, std::memory_order_release);

-- Thread 4 --
int r4 = y.load(std::memory_order_acquire);

This results in r1==1, r2==0, r3==0, r4==2. Hence, this is not a weird outcome at all.

To be able to say that each reader saw a different store order, we need them to read in opposite orders to rule out the last store simply being delayed.

Will two atomic writes to different locations in different threads always be seen in the same order by other threads?

4 Answers