20
votes

As is well known, std::atomic and volatile are different things.

There are 2 main differences:

  1. Two optimizations are permitted for std::atomic<int> a; but not for volatile int a;:

    • fused operations: a = 1; a = 2; can be replaced by the compiler with a = 2;
    • constant propagation: a = 1; local = a; can be replaced by the compiler with a = 1; local = 1;
  2. Reordering of ordinary reads/writes across atomic/volatile operations:

    • for volatile int a; volatile reads/writes can't be reordered with respect to each other. But nearby ordinary reads/writes can still be reordered around volatile reads/writes.
    • for std::atomic<int> a; reordering of nearby ordinary reads/writes is restricted according to the memory order used for the atomic operation, e.g. a.load(std::memory_order_...);

I.e. volatile doesn't introduce memory fences, but std::atomic can.
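The second difference can be illustrated with a minimal sketch. The names payload, ready, publish and try_consume are hypothetical and exist only for this illustration; the sketch assumes one producer thread and one consumer thread and a single publication.

```cpp
#include <atomic>
#include <cassert>

int payload = 0;                  // ordinary (non-atomic) data
std::atomic<bool> ready{false};   // atomic flag guarding the payload

// Producer side: write the data, then publish it.
// The release store keeps the write to `payload` from moving below it.
void publish(int value) {
    // This sketch supports a single publication only.
    assert(!ready.load(std::memory_order_relaxed));
    payload = value;
    ready.store(true, std::memory_order_release);
}

// Consumer side: returns true and fills `out` once the data is published.
// The acquire load keeps the read of `payload` from moving above it.
bool try_consume(int& out) {
    if (!ready.load(std::memory_order_acquire)) {
        return false;
    }
    out = payload;
    return true;
}
```

If ready were declared as volatile bool instead, the accesses to the flag would still be performed, but nothing would stop the compiler or the CPU from reordering the accesses to payload around them.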

As is well described in the article:

For example, std::atomic should be used for concurrent multi-threaded programs (CPU-Core <-> CPU-Core), but volatile should be used for access to Memory Mapped Regions on devices (CPU-Core <-> Device).


But sometimes a variable needs both: it has unusual semantics and it also needs any or all of the atomicity and/or ordering guarantees needed for lock-free coding. In that case volatile std::atomic<> is required, for several reasons:

  • ordering: to prevent reordering of ordinary reads/writes, for example, for reads from CPU-RAM to which the data has been written by the Device DMA-controller

For example:

char cpu_ram_data_written_by_device[1024];
device_dma_will_write_here( cpu_ram_data_written_by_device );

// physically mapped to device register
volatile bool *device_ready = get_pointer_device_ready_flag();

//... somewhere much later
while(!*device_ready); // spin-wait on the flag value, not on the pointer (here should be memory fence!!!)
for(auto &i : cpu_ram_data_written_by_device) std::cout << i;
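A minimal sketch of where that fence could go, written as a volatile read followed by std::atomic_thread_fence. The helper name sum_after_device_ready and its parameters are hypothetical. Note that the C++ standard defines fences only relative to atomic operations, so the ordering against a volatile read, and the visibility of DMA-written data, are platform-specific assumptions here rather than guarantees.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical helper: waits for a device-ready flag, then sums the buffer
// that the device is assumed to have filled via DMA.
unsigned sum_after_device_ready(const volatile bool* device_ready,
                                const char* data, std::size_t size) {
    assert(device_ready != nullptr && data != nullptr);

    // Poll the flag itself (dereference), not the pointer.
    while (!*device_ready) {
        // busy-wait
    }

    // Asks the compiler and CPU not to move the ordinary reads below
    // above this point. Whether this is sufficient for DMA traffic is
    // an assumption of this sketch and depends on the platform.
    std::atomic_thread_fence(std::memory_order_acquire);

    unsigned sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum += static_cast<unsigned char>(data[i]);
    }
    return sum;
}
```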

Another example, where the device reads data prepared by the CPU:

char cpu_ram_data_will_read_by_device[1024];
device_dma_will_read_it( cpu_ram_data_will_read_by_device );

// physically mapped to device register
volatile bool *data_ready = get_pointer_data_ready_flag();

//... somewhere much later
for(auto &i : cpu_ram_data_will_read_by_device) i = 10;
*data_ready=true; // cpu_ram_data_will_read_by_device must be spilled to RAM first, so there should be a memory fence before this store

  • atomic: to guarantee that the volatile operation will be atomic, i.e. that it will consist of a single operation instead of several, e.g. one 8-byte operation instead of two 4-byte operations
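Whether an atomic of a given width is implemented without a lock on the target can be checked with the standard ATOMIC_*_LOCK_FREE macros. The following is a small sketch; the names LockFreedom, classify and llong_lock_freedom are hypothetical. Even the value "always" only says that no lock is taken; it does not by itself prove that the access is a single machine instruction.

```cpp
#include <atomic>
#include <cassert>

// Meaning of the ATOMIC_*_LOCK_FREE macro values defined by <atomic>.
enum class LockFreedom { Never = 0, Sometimes = 1, Always = 2 };

// Converts a macro value (0, 1 or 2) to the enum above.
LockFreedom classify(int macro_value) {
    assert(macro_value >= 0 && macro_value <= 2);
    return static_cast<LockFreedom>(macro_value);
}

// Lock-freedom of `long long` atomics (8 bytes on most platforms)
// for the current target.
LockFreedom llong_lock_freedom() {
    return classify(ATOMIC_LLONG_LOCK_FREE);
}
```

If llong_lock_freedom() returns LockFreedom::Never, an 8-byte atomic store may be implemented as a lock plus several narrower writes.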

On this point, Herb Sutter wrote about volatile atomic<T> (January 08, 2009): http://www.drdobbs.com/parallel/volatile-vs-volatile/212701484?pgno=2

Finally, to express a variable that both has unusual semantics and has any or all of the atomicity and/or ordering guarantees needed for lock-free coding, only the ISO C++0x draft Standard provides a direct way to spell it: volatile atomic<T>.

But do modern standards C++11 (not C++0x draft), C++14, and C++17 guarantee that volatile atomic<T> has both semantics (volatile + atomic)?

Does volatile atomic<T> provide the most stringent guarantees of both volatile and atomic?

  1. As in volatile: Avoids fused-operations and constant-propagation as described in the beginning of the question
  2. As in std::atomic: Introduces memory fences to provide ordering, spilling, and being atomic.

And can we do reinterpret_cast from volatile int *ptr; to volatile std::atomic<int>*?

2
Let me throw in a short comment. volatile atomic<T> over atomic<volatile T> and why would you want to do the reinterpret_cast? It will probably work, but not guaranteed. - DeiDei
You can't have std::atomic<volatile T> because a volatile type is not trivially copyable. - Brian Bi
@Brian Yes you are right. Removed about std::atomic<volatile T>. - Alex
@DeiDei If driver-API returns volatile int *ptr; and I want to use code while(ptr->load(std::memory_order_acquire) == 0); instead of while(*ptr == 0); std::atomic_thread_fence(std::memory_order_acquire); - Alex
" one 8-byte-operation instead of two 4-byte-operations" - atomic doesn't guarantee that. It could very well take a lock and then do two 4-byte writes. ATOMIC_LONG_LOCK_FREE could be 0 to say "never lock-free". - Bo Persson

2 Answers

5
votes

Yes, it does.

Section 29.6.5, "Requirements for operations on atomic types"

Many operations are volatile-qualified. The “volatile as device register” semantics have not changed in the standard. This qualification means that volatility is preserved when applying these operations to volatile objects.

I checked working drafts 2008 through 2016, and the same text is in all of them. Therefore it should apply to C++11, C++14, and C++17.
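As a small illustration of those volatile-qualified operations, the sketch below calls store, fetch_add and load on a volatile std::atomic<int>. The names counter and bump_and_read are hypothetical, and the sketch assumes std::atomic<int> is lock-free on the target.

```cpp
#include <atomic>
#include <cassert>

// A volatile-qualified atomic object; the volatile overloads of the
// member functions are selected for it.
volatile std::atomic<int> counter{0};

int bump_and_read() {
    // is_lock_free() also has a volatile-qualified overload.
    assert(counter.is_lock_free());

    counter.store(1, std::memory_order_release);
    counter.fetch_add(2, std::memory_order_acq_rel);
    return counter.load(std::memory_order_acquire);
}
```

This compiles only because the operations are declared for volatile objects; it shows that the combination is expressible, not what a particular compiler emits for it.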

1
votes

And can we do reinterpret_cast from volatile int *ptr; to volatile std::atomic<int>*?

You can do such casts if and only if the ABI says that both types (here int and std::atomic<int>) have the same representation and restrictions: same size, alignment and possible bit patterns; same meaning for same bit patterns.
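Some of these conditions can be checked at compile time. The sketch below is only a partial check under that assumption: the function name atomic_int_matches_int_layout is hypothetical, and matching size, alignment and lock-freedom are necessary but not sufficient conditions; the ABI still has to confirm the rest.

```cpp
#include <atomic>

// Necessary (not sufficient) conditions for treating an `int` object as a
// `std::atomic<int>` on this platform: same size, same alignment, and an
// always-lock-free implementation (so no hidden lock is involved).
constexpr bool atomic_int_matches_int_layout() {
    return sizeof(std::atomic<int>) == sizeof(int) &&
           alignof(std::atomic<int>) == alignof(int) &&
           ATOMIC_INT_LOCK_FREE == 2;
}
```

A driver wrapper could guard the cast with static_assert(atomic_int_matches_int_layout(), "...") so that the build fails on a platform where the assumption does not hold.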

Everything that is volatile is directly connected with the ABI: variables that are volatile qualified must have the canonical ABI representation at sequence points and operations on volatile objects only assume they follow their ABI requirements and nothing else. So whenever volatile is used in C or C++, you can rely alternatively on the language standard or the platform ABI.

(I hope this answer is not deleted because some people despise volatile semantics and relying on the ABI and platform-specific notions.)