4
votes

What is the difference in logic and performance between the x86 instructions LOCK XCHG and MOV+MFENCE when they are used to do a sequentially consistent store?

(We ignore the load result of the XCHG; compilers other than gcc use it for the store + memory barrier effect.)

Is it true that, for sequential consistency, during the execution of an atomic operation, LOCK XCHG locks only a single cache line, whereas MOV+MFENCE locks the whole L3 cache (LLC)?
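To make the two variants concrete, here is a minimal C++11 sketch of the operations being compared. The names shared_value, sc_store, sc_store_via_exchange and check_both_forms are hypothetical and used only for illustration. Which instruction sequence is actually emitted (MOV+MFENCE or XCHG) depends on the compiler and its version, so the comments describe typical x86 lowerings, not guarantees.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared variable, used only for illustration.
std::atomic<int> shared_value{0};

// Sequentially consistent store. On x86 a compiler may lower this either to
// MOV followed by MFENCE, or to a single XCHG (XCHG with a memory operand is
// implicitly locked). The choice is up to the compiler.
void sc_store(int v) {
    shared_value.store(v, std::memory_order_seq_cst);
}

// The same store written as an exchange whose loaded result is ignored.
// On x86 this is typically lowered to XCHG.
void sc_store_via_exchange(int v) {
    (void)shared_value.exchange(v, std::memory_order_seq_cst);
}

// Sanity check: both forms leave the stored value visible to a later load.
void check_both_forms() {
    sc_store(1);
    assert(shared_value.load() == 1);
    sc_store_via_exchange(2);
    assert(shared_value.load() == 2);
}
```

Compiling this with optimizations and inspecting the assembly (for example with -S) shows which of the two sequences a given compiler picks for sc_store.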

1
Apples and oranges, MFENCE doesn't provide atomicity. - Hans Passant
@Hans Passant I didn't say that MFENCE provides atomicity, because MOV is already atomic. We can see this in C11 (atomic)/C++11 (std::atomic) for every ordering on x86 except SC (sequential consistency): en.cppreference.com/w/cpp/atomic/memory_order But I did say that MFENCE provides sequential consistency for atomic variables, as we can see in C11 (atomic)/C++11 (std::atomic) in GCC 4.8.2: stackoverflow.com/questions/19047327/… - Alex
(I'm not even sure if mov is atomic for unaligned access, by the way.) - Kerrek SB
@Kerrek SB For SC, MOV+MFENCE (what GCC 4.8.2 emits) can be replaced with LOCK XCHG, as we can see in the video, where at 0:28:20 it is said that MFENCE is more expensive than XCHG: channel9.msdn.com/Shows/Going+Deep/… - Alex

1 Answer

-1
votes

The difference is in the intended use.

MFENCE (or SFENCE or LFENCE) is useful when we are locking a memory region that is accessible from two or more threads. Once we have atomically set the lock for this memory region, we can use non-atomic instructions inside it, because they are faster. But we must issue SFENCE (or MFENCE) one instruction before unlocking the memory region, to ensure that the contents of the locked memory are correctly visible to all other threads.

If we are changing only a single aligned variable, we use an atomic instruction such as LOCK XCHG, so no lock on a memory region is needed.
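The locked-region pattern described above can be sketched in C++11 roughly as follows. This is only an illustration under stated assumptions: the names region_locked, protected_counter, lock_region, unlock_region and run_workers are hypothetical, and the ordering requirement before unlocking is expressed here as memory_order_release on the unlocking store rather than as an explicit fence instruction. Whether the compiler emits any fence for it is target dependent.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
std::atomic<bool> region_locked{false};
long protected_counter = 0;  // plain, non-atomic data guarded by the lock

void lock_region() {
    // Atomic exchange takes the lock (typically XCHG on x86).
    while (region_locked.exchange(true, std::memory_order_acquire)) {
        std::this_thread::yield();  // another thread holds the lock
    }
}

void unlock_region() {
    // Release store: writes made inside the region are visible to the next
    // thread that acquires the lock.
    region_locked.store(false, std::memory_order_release);
}

// Two threads each increment the plain counter `iterations` times under the lock.
long run_workers(int iterations) {
    protected_counter = 0;
    auto work = [iterations] {
        for (int i = 0; i < iterations; ++i) {
            lock_region();
            ++protected_counter;  // ordinary non-atomic update
            unlock_region();
        }
    };
    std::thread a(work);
    std::thread b(work);
    a.join();
    b.join();
    assert(!region_locked.load());  // lock must be free once both are done
    return protected_counter;
}
```

Calling run_workers(n) should return 2 * n if the lock is working. A single shared variable would not need any of this and could be a std::atomic updated directly.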