What does the C++ compiler do to ensure that different but adjacent memory locations are safe to be used on different threads?

Question

Lets say I have a struct:

struct Foo {
  char a;  // read and written to by thread 1 only
  char b;  // read and written to by thread 2 only
};

Now from what I understand, the C++ standard guarantees the safety of the above when two threads operate on the two different memory locations.

I would think though that, since char a and char b, fall within the same cache line, that the compiler has to do extra syncing.

What exactly happens here?

On a lot of platforms (for example, x86), it doesn't have to do anything. It just works (it means that the HW does the necessary extra stuff). — geza
Yes. But the exact hit could vary on different generations/vendors of CPU. Do a search on "false sharing". — geza
This is handled by the hardware, not the compiler, as far as I am aware. This is called false sharing — NathanOliver
I think the only CPUs that C++ has actually been implemented on that the compiler would have to do anything special to support the C++ memory model are early Alpha CPUs which lacked instructions that could atomically set a single byte (or 16-bit) memory location. See Peter Cordes answer to a related question for details: stackoverflow.com/a/46818162/3826372 As far as know there's no compiler implementations that have been updated to support the C++11 memory model on these long obsolete Alpha CPUs. — Ross Ridge
@RossRidge - well, there's the even more obsolete TMS9900 that has the same issue (only with 8-bit values, as it assumes 16-bit alignment) -- there's a port of gcc 3.4 to the architecture, but I don't know whether any particular C++ variant is supported. I'm also not aware of any extant dual-processor TMS9900 machines that would ever actually see this issue. The TMS9900 has instructions that operate on single bytes, but the bus implementation always fetches both bytes in a 16-bit word and rewrites the unchanged one. — Jules

SergeyA SergeyA · Accepted Answer · 2018-11-19T15:11:42

This is hardware-dependent. On hardware I am familiar with, C++ doesn't have to do anything special, because from hardware perspective accessing different bytes even on a cached line is handled 'transparently'. From the hardware, this situation is not really different from

char a[2];
// or
char a, b;

In the cases above, we are talking about two adjacent objects, which are guaranteed to be independently accessible.

However, I've put 'transparently' in quotes for a reason. When you really have a case like that, you could be suffering (performance-wise) from a 'false sharing' - which happens when two (or more) threads access adjacent memory simultaneously and it ends up being cached in several CPU's caches. This leads to constant cache invalidation. In the real life, care should be taken to prevent this from happening when possible.

What does the C++ compiler do to ensure that different but adjacent memory locations are safe to be used on different threads?

2 Answers