I have a few related questions about the memory ordering I describe below.
Given these ordering guarantees I don't need explicit fences in many places. However, how can I express the "fence" to the compiler, in particular GCC? That is, the guarantee of program order only applies so long as the optimizer doesn't reorder my program.
Are there common/popular new chips in use that have general purpose cores that do not offer such guarantees?
I'm a bit confused by C++0x and its idea of interleaving. Must I use the "atomic" class to make use of these guarantees, or does some other part of the draft also provide a way to make use of them?
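To make the compiler-fence question concrete, here is a rough sketch of what I have in mind. The empty asm with a "memory" clobber is the GCC-specific idiom I've seen used as a compiler-only barrier, and std::atomic_signal_fence is the closest equivalent I know of in the C++0x draft. The names payload, ready, compiler_barrier, publish_gcc and publish_std are made up for illustration. I'm also assuming plain int variables here, which the draft would formally treat as a data race if they were shared between threads, so this shows intent rather than something the standard guarantees.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shared variables, for illustration only.
int payload = 0;
int ready = 0;

// GCC-specific: emits no instructions, but tells the optimizer that memory
// may have changed, so it must not move loads or stores across this point.
inline void compiler_barrier() {
    asm volatile("" ::: "memory");
}

// Store the payload, then the flag, with a compiler-only barrier between.
void publish_gcc(int value) {
    payload = value;
    compiler_barrier();
    ready = 1;
}

// Same idea using the C++0x library: atomic_signal_fence restricts compiler
// reordering only and does not emit a hardware fence instruction.
void publish_std(int value) {
    payload = value;
    std::atomic_signal_fence(std::memory_order_seq_cst);
    ready = 1;
}
```

Neither version emits a hardware fence, so any cross-processor ordering would still come entirely from the hardware guarantees described below.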
Memory Ordering
Both Intel and AMD, at least with x86_64, guarantee that the stores done by a single processor are seen by loads on other processors in the order they were issued. That is, if some processor executes these stores:
- Store A <- 1
- Store B <- 2
- Store C <- 3
The moment some other processor sees C(3), it is guaranteed to also see the previous stores A(1) and B(2). Stores from different processors may be interleaved as they become visible, but the store order from any given processor will always be sequential.
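For comparison, this is how I understand the same A/B/C pattern would be written with the C++0x atomics: a release store on C paired with an acquire load in the reader. It is only a sketch under my reading of the draft. The Observed struct and run_store_order_demo are hypothetical names, and using relaxed ordering for A and B is my own choice, not something the hardware documentation prescribes.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical result type: what the reader saw in A and B after seeing C == 3.
struct Observed {
    int a;
    int b;
};

Observed run_store_order_demo() {
    std::atomic<int> A(0), B(0), C(0);
    Observed seen{0, 0};

    // Writer: the three stores from the example, in program order.
    std::thread writer([&] {
        A.store(1, std::memory_order_relaxed);
        B.store(2, std::memory_order_relaxed);
        C.store(3, std::memory_order_release);  // publishes A and B
    });

    // Reader: wait until C(3) is visible, then read A and B.
    std::thread reader([&] {
        while (C.load(std::memory_order_acquire) != 3) {
            std::this_thread::yield();
        }
        seen.a = A.load(std::memory_order_relaxed);
        seen.b = B.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();
    return seen;
}
```

If my understanding is right, the reader can only ever observe A == 1 and B == 2 once it has seen C == 3, which matches the hardware guarantee described above.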
The guarantees are also transitive: if Processor 0 reads a value stored by Processor 1 and then writes a value of its own, then Processor 2, on reading that new value, must also see the value stored by Processor 1.
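The transitive case can be sketched the same way with three threads standing in for the three processors. Again this reflects my reading of the draft rather than anything authoritative. The function name run_transitivity_demo and the variables X and Y are made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns the value of X that "Processor 2" observes after it has seen
// the value written by "Processor 0".
int run_transitivity_demo() {
    std::atomic<int> X(0), Y(0);
    int seen_x = -1;

    // Processor 1: stores the original value.
    std::thread p1([&] {
        X.store(1, std::memory_order_release);
    });

    // Processor 0: reads the value from Processor 1, then writes its own.
    std::thread p0([&] {
        while (X.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        Y.store(1, std::memory_order_release);
    });

    // Processor 2: reads the new value, then looks at X.
    std::thread p2([&] {
        while (Y.load(std::memory_order_acquire) != 1) {
            std::this_thread::yield();
        }
        seen_x = X.load(std::memory_order_relaxed);
    });

    p1.join();
    p0.join();
    p2.join();
    return seen_x;
}
```

With release/acquire pairs chained like this, the happens-before relation carries through Processor 0, so Processor 2 should always read X == 1.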
Ignore the special cases dealing with IO and special devices. I'm interested only in the general memory guarantees; the ordering described here is just the part I care about most, since it has the most significance for concurrent algorithms.