
cppreference.com gives the following example for the use of std::memory_order_relaxed (https://en.cppreference.com/w/cpp/atomic/memory_order):

#include <vector>
#include <iostream>
#include <thread>
#include <atomic>
 
std::atomic<int> cnt = {0};
 
void f()
{
    for (int n = 0; n < 1000; ++n) {
        cnt.fetch_add(1, std::memory_order_relaxed);
    }
}
 
int main()
{
    std::vector<std::thread> v;
    for (int n = 0; n < 10; ++n) {
        v.emplace_back(f);
    }
    for (auto& t : v) {
        t.join();
    }
    std::cout << "Final counter value is " << cnt << '\n';
}

Output: Final counter value is 10000

Is this a correct/sound example? (Can a standard-compliant compiler introduce optimizations that would yield a different answer?) Since std::memory_order_relaxed only guarantees that the operation is atomic, one thread may not see an update from another thread. Am I missing something?

Memory barriers are used to see non-atomic data. For example, if the atomic variable is a pointer or an index into an array, then the store and load must have read/write barriers to synchronize the non-atomic data. Atomics themselves are always atomic and visible. – David Haim
So yes, you missed what memory barriers are: a means to synchronize non-atomic data by "piggybacking" on atomics, which are always thread-safe. – David Haim

2 Answers


Yes, this is a correct example, so no, a compiler cannot introduce optimizations that would yield a different result. You are right that in general a thread is not guaranteed to see an update from another thread (or, more specifically, there is no guarantee when such an update becomes visible). However, in this case cnt is updated using an atomic read-modify-write operation, and the standard states in [atomics.order]:

Atomic read-modify-write operations shall always read the last value (in the modification order) written before the write associated with the read-modify-write operation.

And this absolutely makes sense if you think about it, because otherwise it would not be possible to make a read-modify-write operation atomic. Suppose fetch_add did not see the latest update, but some older value. That would mean that the operation would increment that old value and store it. But that would imply 1) that the values returned by fetch_add are not strictly increasing (some threads would see the same value) and 2) that some updates are lost.


The hint as to why this works can be found in the first sentence of the description on the page you linked (emphasis mine):

std::memory_order specifies how memory accesses, including regular, non-atomic memory accesses, are to be ordered around an atomic operation.

Notice how this talks not about the memory access on the atomic itself, but rather on the memory accesses surrounding the atomic. Concurrent accesses to a single atomic always have strict ordering requirements, otherwise it would be impossible to reason about their behavior in the first place.

In case of the counter, you get the guarantee that fetch_add will behave pretty much as expected: The counter gets increased one at a time, no values are skipped and no values will be counted twice. You can easily verify this by inspecting the return values of the individual fetch_add calls. You get those guarantees always, regardless of the memory ordering.

Things get interesting as soon as you assign meaning to those counter values in the context of the surrounding program logic. For instance, you could use a certain counter value to indicate that a particular piece of data has been made available by an earlier computation step. This will require memory orderings, if that relationship between the counter and the data needs to persist across threads: With the relaxed ordering, at the point where you observe the counter value you are waiting for, you have no guarantee that the data you are waiting for is ready as well. Even if the counter is set after the data has been written by the producing thread, this ordering of memory operations does not translate across thread boundaries. You will need to specify a memory order that orders the write to the data with respect to the change of the counter across threads.

The crucial thing to understand here is that while the operations are guaranteed to happen in a certain order within one thread, that ordering is no longer guaranteed when observing the same data from a different thread.

So the rule of thumb is: If you're only manipulating an atomic in isolation, you don't need any ordering. As soon as that manipulation is interpreted in the context of other unrelated memory accesses (even if those accesses are themselves atomics!) you need to worry about using the correct ordering.

The usual advice applies that, unless you have really, really, really good reasons for doing so, you should just stick with the default memory_order_seq_cst. As an application developer you don't want to mess with memory orderings unless you have strong empirical evidence that it is worth the trouble you will undoubtedly run into.