Does a process switch affect std::atomic compare and exchange in arm9 processor?

Question

I am new to std::atomic in C++ and am trying to understand how compare-and-exchange operations are implemented on ARM processors. I am using gcc on Linux.

When I look at the generated assembly code, I see:

        mcr p15, 0, r0, c7, c10, 5
    .L41:
        ldrexb  r3, [r2]
        cmp r3, r1
        bne .L42
        strexb  ip, r0, [r2]
        cmp ip, #0
        bne .L41
    .L42:
        mcr p15, 0, r0, c7, c10, 5

My understanding is:

It takes multiple instructions to do a compare and exchange.
ldrex marks the memory location as exclusive and reads the data.
strex stores the data and clears the exclusive flag for that location.

My questions are:

Does ldrex mark the virtual address or the physical address as exclusive?
If process P1 marks the virtual address as exclusive and a process switch to P2 occurs, will that virtual address be accessible in P2? What happens if P2 also executes an ldrex on the same address?
If process P1 marks the physical address as exclusive and a process switch occurs, isn't there a chance that, when P1 resumes, the data now resides at a different location in physical memory due to paging?

I am trying to understand this because I want to do a compare and exchange on a shared memory location accessed by multiple processes.

My C++ function looks like this:

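    #include <atomic>
    #include <thread>

    // shm_ptr points into the shared memory segment mapped by every process.
    std::atomic<bool> *flag = (std::atomic<bool> *) (shm_ptr);
    bool temp = false;
    while (!std::atomic_compare_exchange_strong(flag, &temp, true))
    {
        // A failed exchange writes the current value (true) into temp,
        // so reset it; otherwise the next attempt would compare against true.
        temp = false;
        std::this_thread::yield();
    }
    // update shared memory
    std::atomic_store(flag, false);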

Should std::atomic_store((flag), false); be atomic? The flag is 'true' once the lock is taken. It does not need to be released atomically, because other threads can never execute std::atomic_compare_exchange_strong() successfully until 'false' is written back. — user3124812
@user3124812 std::atomic_store((flag), false) ensures that the stored value is seen by all cores. With a non-atomic store, the stored value can be cached and written to memory at a later point, and will not be visible to other cores. — aravind b

Peter Cordes · Accepted Answer · 2018-09-26T08:41:21

Yes, it's safe to use lock-free std::atomic<T> on shared memory mapped by different processes, on all mainstream C++ implementations for ARM.

But non-lock-free atomics won't work, because different processes won't share the same table of locks.
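As a rough sketch of how to check this before putting an atomic into shared memory: the helper name `usable_across_processes` is hypothetical (not from the question), and the code relies only on the standard `is_lock_free()` member and the `ATOMIC_BOOL_LOCK_FREE` macro.

```cpp
#include <atomic>

// Compile-time check: the macro is 2 when std::atomic<bool> is always
// lock-free on this target, 1 when it is only sometimes lock-free, and
// 0 when it never is.
static_assert(ATOMIC_BOOL_LOCK_FREE == 2,
              "std::atomic<bool> is not always lock-free on this target");

// Hypothetical helper: run-time check that std::atomic<T> is lock-free,
// i.e. that it does not depend on a per-process table of locks.
template <typename T>
bool usable_across_processes() {
    std::atomic<T> probe{};
    return probe.is_lock_free();
}
```

Calling `usable_across_processes<bool>()` once at startup, and refusing to use the shared flag if it returns false, is one way to fail early instead of silently losing atomicity.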

An interrupt before the strex completes will cause it to fail. You don't have to worry about kernel code changing the page tables between ldrex and strex.

Resuming this code in the middle after an interrupt on the same or another CPU will mean the strex simply fails, because it's not executing as part of a "transaction" started by ldrex.
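At the C++ level, a failed strex shows up only as a retry inside the compare-exchange, or as a spurious failure of `compare_exchange_weak`. A minimal sketch of a flag-based lock that tolerates both is given here; the names `spin_lock` and `spin_unlock` are hypothetical, and the acquire/release orderings are my choice rather than something taken from the question.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: spin until the flag changes from false to true.
inline void spin_lock(std::atomic<bool>& flag) {
    bool expected = false;
    // compare_exchange_weak may fail spuriously (for example when the
    // store-exclusive fails after an interrupt), so it sits in a loop.
    while (!flag.compare_exchange_weak(expected, true,
                                       std::memory_order_acquire,
                                       std::memory_order_relaxed)) {
        expected = false;           // a failure overwrote it with the current value
        std::this_thread::yield();  // let the current holder make progress
    }
}

// Hypothetical helper: release the lock with a release store so that
// writes made while holding it are visible to the next acquirer.
inline void spin_unlock(std::atomic<bool>& flag) {
    flag.store(false, std::memory_order_release);
}
```

The same two functions can be applied to a `std::atomic<bool>` that lives in shared memory, provided the type is lock-free on the target.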

Atomicity is address-free on ARM, and on every normal mainstream system that implements C++11 lock-free atomics.

Everything still works if two threads / processes on different cores have the same physical page mapped to different virtual addresses. The C++11 standard explicitly recommends that implementations work this way for lock-free std::atomic<T>. (It stops short of requiring it, because then it would have to define what a process is, and functions for remapping virtual memory.)
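One way to see the address-free behaviour without a second process is to map the same page twice into a single process, so it appears at two different virtual addresses. The sketch below assumes a POSIX system with a writable /tmp; the function name and the scratch-file name are hypothetical, and the two mappings only stand in for the two processes in the question.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical demo: returns true if a CAS done through one mapping is
// observed through a second mapping of the same physical page.
bool cas_visible_through_alias() {
    char path[] = "/tmp/atomic_alias_XXXXXX";  // hypothetical scratch file
    int fd = mkstemp(path);
    if (fd < 0) return false;
    unlink(path);  // the file lives on until the descriptor and mappings go away

    const std::size_t len = sizeof(std::atomic<bool>);
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) {
        close(fd);
        return false;
    }

    // Two MAP_SHARED mappings of the same file offset: same page,
    // two distinct virtual addresses.
    void* a = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);

    bool ok = false;
    if (a != MAP_FAILED && b != MAP_FAILED) {
        auto* via_a = new (a) std::atomic<bool>(false);
        auto* via_b = static_cast<std::atomic<bool>*>(b);

        bool expected = false;
        bool swapped = via_a->compare_exchange_strong(expected, true);
        ok = swapped && via_b->load() && (a != b);
    }
    if (a != MAP_FAILED) munmap(a, len);
    if (b != MAP_FAILED) munmap(b, len);
    return ok;
}
```

With two real processes, each would open and map the same shared object itself, and only one of them should construct the atomic.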

This is nearly a duplicate of "Are lock-free atomics address-free in practice?". See that question for quotes from the standard and more details.

Modern computer systems ensure that their caches don't have aliasing homonym / synonym problems, because that would cause coherency problems in general, not just for atomic RMWs. Sometimes this requires cooperation from the OS kernel (e.g. page coloring if one cache index bit comes from the page number instead of just the offset-within-a-page part of the address), but in general caches behave as physical.

(Some early CPUs, like early MIPS, did sometimes use virtually-addressed L1 data caches, but that's not done on systems that can support multiple CPUs, AFAIK.)
