7
votes

I'm trying to understand these sections under the heading Release-Acquire ordering https://en.cppreference.com/w/cpp/atomic/memory_order

They say regarding atomic load and stores:

If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B from the same variable is tagged memory_order_acquire, all memory writes (non-atomic and relaxed atomic) that happened-before the atomic store from the point of view of thread A, become visible side-effects in thread B. That is, once the atomic load is completed, thread B is guaranteed to see everything thread A wrote to memory.

Then regarding mutexes:

Mutual exclusion locks, such as std::mutex or atomic spinlock, are an example of release-acquire synchronization: when the lock is released by thread A and acquired by thread B, everything that took place in the critical section (before the release) in the context of thread A has to be visible to thread B (after the acquire) which is executing the same critical section.

The first paragraph seems to say that with an atomic store and load (tagged memory_order_release and memory_order_acquire), thread B is guaranteed to see everything thread A wrote, including non-atomic writes.

The second paragraph seems to suggest that a mutex works the same way, except that the scope of what is visible to B is limited to whatever was wrapped in the critical section. Is that an accurate interpretation? Or would every write, even those before the critical section, be visible to B?

3
Congratulations, you have found your way to the darkest corners of C++11! I recommend reading kernel.org/doc/Documentation/memory-barriers.txt (I didn't finish it myself, though). - Arne J
While I am curious about how this is handled at the OS and CPU level, I think the whole point of the C++ memory model is that we shouldn't have to understand those underlying implementations in order to write software that is correct. Understanding those details should only really be necessary when implementing optimizations. I'm trying to get a better grasp of this at the C++ level before I dig any deeper. - Lockyer
@Lockyer Not only that, but an advanced compiler could compile a multithreaded program in a much more subtle way than just emitting fences while avoiding the obviously redundant ones, as current compilers do. - curiousguy

3 Answers

4
votes

I think the reason the cppreference quote about mutexes is written that way is due to the fact that if you're using mutexes for synchronization, all shared variables used for communication should always be accessed inside the critical section.

The 2017 standard says in 4.7.1:

a call that acquires a mutex will perform an acquire operation on the locations comprising the mutex. Correspondingly, a call that releases the same mutex will perform a release operation on those same locations. Informally, performing a release operation on A forces prior side effects on other memory locations to become visible to other threads that later perform a consume or an acquire operation on A.
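To make the quoted wording concrete, here is a minimal sketch of a lock built from an acquire operation and a release operation on a single location. The `SpinLock` class and `count_with_spinlock` helper are hypothetical names for illustration; this is not how `std::mutex` is implemented, only an instance of the same release-acquire pattern.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock, for illustration only.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Acquire operation on the flag: once it succeeds, writes published
        // by the previous owner's release are visible to this thread.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release operation on the same flag: publishes all earlier writes
        // of this thread to whichever thread acquires the flag next.
        flag_.clear(std::memory_order_release);
    }
};

// Increments a plain (non-atomic) counter from several threads.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;  // non-atomic; protected only by the spinlock
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock();
                ++counter;  // each increment sees the previous owner's write
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling `count_with_spinlock(4, 10000)` should return 40000, because each acquire observes the counter value published by the preceding release.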

Update: I want to make sure I have a solid post because it is surprisingly hard to find this information on the web. Thanks to @Davis Herring for pointing me in the right direction.

The standard says

in 33.4.3.2.11 and 33.4.3.2.25:

mutex unlock synchronizes with subsequent lock operations that obtain ownership on the same object

(https://en.cppreference.com/w/cpp/thread/mutex/lock, https://en.cppreference.com/w/cpp/thread/mutex/unlock)

in 4.6.16:

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

https://en.cppreference.com/w/cpp/language/eval_order

in 4.7.1.9:

An evaluation A inter-thread happens before evaluation B if

4.7.1.9.1) -- A synchronizes-with B, or

4.7.1.9.2) -- A is dependency-ordered before B, or

4.7.1.9.3) -- for some evaluation X

4.7.1.9.3.1) ------ A synchronizes with X and X is sequenced before B, or

4.7.1.9.3.2) ------ A is sequenced before X and X inter-thread happens before B, or

4.7.1.9.3.3) ------ A inter-thread happens before X and X inter-thread happens before B.

https://en.cppreference.com/w/cpp/atomic/memory_order

  • So a mutex unlock B inter-thread happens before a subsequent lock C by 4.7.1.9.1.
  • Any evaluation A that happens in program order before the mutex unlock B also inter-thread happens before C by 4.7.1.9.3.2
  • Therefore an unlock() guarantees that all previous writes, even those outside the critical section, are visible to a thread once its matching lock() returns.
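The chain A, B, C from the bullets above can be sketched as follows. This is only an illustration under my own naming (`read_payload_written_outside_lock`, `payload`, `ready` are hypothetical); explicit `lock()`/`unlock()` calls are used instead of `std::lock_guard` so the evaluations can be labeled.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

int read_payload_written_outside_lock() {
    std::mutex m;
    int payload = 0;     // plain int, written OUTSIDE the critical section
    bool ready = false;  // only accessed while holding m
    int seen = -1;

    std::thread writer([&] {
        payload = 42;    // A: sequenced before the unlock below
        m.lock();
        ready = true;
        m.unlock();      // B: release operation on the mutex
    });

    std::thread reader([&] {
        bool is_ready = false;
        while (!is_ready) {
            m.lock();    // C: acquire; synchronizes with earlier unlocks of m
            is_ready = ready;
            m.unlock();
            if (!is_ready) std::this_thread::yield();
        }
        // The lock that observed ready == true came after B, so A
        // happens-before this read even though neither access holds m.
        seen = payload;
    });

    writer.join();
    reader.join();
    return seen;
}
```

The function should return 42. The read of `payload` is not a data race, because the write is ordered before it through the unlock/lock pair.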

This conclusion is consistent with the way mutexes are implemented today (and were in the past) in that all program-order previous loads and stores are completed before unlocking. (More accurately, the stores have to be visible before the unlock is visible when observed by a matching lock operation in any thread.) There's no question that this is the accepted definition of release in theory and in practice.

1
votes

There’s no magic here: the mutex section is merely describing the common case, where (because every visit to the critical section might write the shared data) the writer in question protects all its accesses with the mutex. (Other, earlier writes are visible and might be relevant: consider creating and initializing an object without synchronization and then storing its address in a shared variable in the critical section.)
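The parenthetical case might look like the sketch below. The `Config` struct and the `publish_and_read` function are hypothetical, made up for this illustration: the object is built with no synchronization at all, and only its address is stored under the mutex.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <thread>

struct Config {  // hypothetical payload type
    std::string name;
    int retries = 0;
};

std::string publish_and_read() {
    std::mutex m;
    std::unique_ptr<Config> owner;   // touched only by the producer thread
    const Config* shared = nullptr;  // guarded by m
    std::string seen;

    std::thread producer([&] {
        // Create and initialize the object without any synchronization.
        owner = std::make_unique<Config>();
        owner->name = "primary";
        owner->retries = 3;
        // Only the address is stored inside the critical section.
        std::lock_guard<std::mutex> g(m);
        shared = owner.get();
    });

    std::thread consumer([&] {
        const Config* local = nullptr;
        while (local == nullptr) {
            {
                std::lock_guard<std::mutex> g(m);
                local = shared;
            }
            if (local == nullptr) std::this_thread::yield();
        }
        // The fields were written before the producer's unlock, so they
        // are visible here even though they are read without the lock.
        seen = local->name + ":" + std::to_string(local->retries);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

The consumer should see the fully initialized object, so the function returns "primary:3".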

1
votes

The first paragraph seems to say that with an atomic store and load (tagged memory_order_release and memory_order_acquire), thread B is guaranteed to see everything thread A wrote, including non-atomic writes.

Not just writes: all earlier memory operations are complete, reads included. A read doesn't produce a side effect, of course, but you can still observe the ordering: reads sequenced before the release never see a value written after the acquire.

All of https://en.cppreference.com/ insists on writes (easy to explain) and completely ignores the issue of reads being accomplished.
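The point about reads can be sketched like this (all names are hypothetical, chosen for the illustration). Thread A reads a plain variable and then does a release store; thread B does an acquire load and only then writes that variable.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int read_before_release() {
    int x = 0;                      // plain, non-atomic
    std::atomic<bool> done{false};
    int seen = -1;

    std::thread a([&] {
        seen = x;                   // read sequenced before the release
        done.store(true, std::memory_order_release);
    });

    std::thread b([&] {
        while (!done.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        x = 1;                      // write sequenced after the acquire
    });

    a.join();
    b.join();
    return seen;
}
```

The read in thread A happens-before the write in thread B, so the function should return 0 and never 1, and the two accesses to `x` do not race.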

The second paragraph seems to suggest that a mutex works the same way, except that the scope of what is visible to B is limited to whatever was wrapped in the critical section. Is that an accurate interpretation? Or would every write, even those before the critical section, be visible to B?

But "in the critical section" isn't even a thing. Nothing you do can be separated from the memory state in which it's done. When you set an integer object "in the critical section", the object has to exist; it doesn't make sense to take "write to an object" in isolation, as there would be no object to talk about. Interpreted strictly, "the critical section" would cover only objects created inside it. But then none of these objects would be known to other threads, so there would be nothing to protect.

So the result of a "critical section" is in essence the whole history of the program, with some accesses to shared objects starting only after the mutex lock.