1 vote

I was looking into the side effects / run-time overhead of using a compiler barrier (in GCC) in an x86 environment.

Compiler barrier: asm volatile("" ::: "memory")

The GCC documentation says something interesting about it (https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html).

Excerpt:

The "memory" clobber tells the compiler that the assembly code performs memory reads or writes to items other than those listed in the input and output operands (for example, accessing the memory pointed to by one of the input parameters). To ensure memory contains correct values, GCC may need to flush specific register values to memory before executing the asm. Further, the compiler does not assume that any values read from memory before an asm remain unchanged after that asm; it reloads them as needed. Using the "memory" clobber effectively forms a read/write memory barrier for the compiler.

Questions:

1) What register values are flushed?

2) Why do they need to be flushed?

3) Can you give an example?

4) Is there any other overhead apart from register flushing?

Actually, if you read the same page you'll see the examples that relate to it. Which registers need to be flushed to memory depends on what the asm is actually doing with the registers. – Ahmed Masud
Just to emphasize: for maximum performance, the compiler tries to generate code that keeps values in registers whenever possible (ideally until they go out of scope), only flushing them to memory when it runs out of registers, when something needs to access the actual memory (think: fwrite), or when it is explicitly told to flush them (a memory clobber, a function call, etc.). Note that no actual assembler instruction is generated for this "call"; it simply tells the compiler that, at that point, the code must be organized in a certain way, even if that way might not appear to the compiler to be the most efficient. – David Wohlferd

1 Answer

2 votes

Every memory location that another thread might have a pointer to needs to be up to date before the barrier and reloaded after it. So any such value that is live in a register needs to be stored (if dirty), or simply "forgotten about" if the register copy merely duplicates what is still in memory.
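A small sketch of that distinction (my example, assuming GCC with optimization enabled): a value reachable from other code must be stored before the barrier and re-read after it, while a local whose address never escapes can simply stay in a register:

    int shared;                          /* other code can reach this                   */

    int demo(void)
    {
        int local = 42;                  /* address never taken: can stay in a register */
        shared = 1;                      /* this store must be completed...             */

        asm volatile("" ::: "memory");   /* ...no later than this point                 */

        return shared + local;           /* shared is re-read from memory; local isn't  */
    }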

See this GCC non-bug report for this quote from a GCC developer: a "memory" clobber only includes memory that can be indirectly accessed (thus may be address-taken in this or another compilation unit).

Is there any other overhead apart from register flushing?

A barrier can prevent optimizations like sinking a store out of a loop, but that's usually exactly why you use a barrier. Make sure your loop counters and loop variables are locals whose addresses haven't been passed to functions the compiler can't see, or else they'll have to be spilled and reloaded inside the loop. Letting references escape your function is always a potential problem for optimization, but with barriers it's a near-guarantee of worse code.
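As a sketch of that advice (a hypothetical function of mine): the counter and accumulator below are locals whose addresses never escape, so even with a barrier in every iteration GCC may keep them in registers; if their addresses had been passed to a function the compiler can't see, every iteration would have to spill and reload them:

    int sum_array(const int *data, int n)
    {
        int sum = 0;                         /* address never escapes               */
        for (int i = 0; i < n; i++) {
            sum += data[i];
            asm volatile("" ::: "memory");   /* barrier inside the loop             */
        }
        return sum;                          /* sum and i can live in registers     */
    }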


Why?

This is the whole point of a barrier: so values are synced to memory, preventing compile-time reordering.
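As a sketch of that compile-time ordering (my example, in the spirit of the article linked below): without the barrier, the compiler would be free to move the store to payload below the store to ready_flag; with it, the stores must be emitted in source order. Note that this constrains only the compiler, not the CPU:

    int payload;
    int ready_flag;

    void publish(int value)
    {
        payload = value;                 /* cannot be moved below the barrier   */
        asm volatile("" ::: "memory");   /* compile-time ordering point         */
        ready_flag = 1;                  /* cannot be moved above the barrier   */
    }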

asm volatile("" ::: "memory") is (exactly?) equivalent to atomic_signal_fence(memory_order_seq_cst) (not atomic_thread_fence, which would take an mfence instruction to implement on x86).
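A side-by-side sketch of the three spellings (assuming C11 <stdatomic.h>); only the thread fence emits an actual instruction on x86:

    #include <stdatomic.h>

    void fences(void)
    {
        asm volatile("" ::: "memory");              /* GNU C compiler barrier        */
        atomic_signal_fence(memory_order_seq_cst);  /* C11 compiler-only fence       */
        atomic_thread_fence(memory_order_seq_cst);  /* full barrier: mfence on x86   */
    }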


Examples:

See Jeff Preshing's Memory Ordering at Compile Time article for more about why, and examples with actual x86 asm.