In texts on memory consistency models (x86 TSO in particular), authors generally resort to models with a set of CPUs, each with its own store buffer and private cache.
If my understanding is correct, a store buffer can be described as a queue into which a CPU puts every store it wants to commit to memory. So, as the name states, it buffers stores.
But when I read those papers, they tend to talk about the interaction of loads and stores, with statements such as "a later load can pass an earlier store". This is slightly confusing: they almost seem to be talking as if the store buffer held both loads and stores, when it doesn't -- right?
So there must also be a load buffer that they are not (at least explicitly) talking about. Plus, the two must be somehow synchronized, so that both know when it's acceptable to load from memory and to commit to memory -- or am I missing something?
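For concreteness, the model those papers describe could be sketched roughly as follows. This is a toy Python simulation, not real hardware: the `Cpu` class, its method names and the hand-picked interleaving are all made up for illustration. It assumes loads read shared memory directly (after checking the core's own store buffer), while stores wait in a FIFO queue.

```python
from collections import deque


class Cpu:
    """Toy TSO core: stores enter a FIFO store buffer, loads go to memory."""

    def __init__(self, memory):
        self.memory = memory          # shared dict: address -> value
        self.store_buffer = deque()   # pending (address, value), oldest first

    def store(self, addr, value):
        # The store is only buffered; other cores cannot see it yet.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Forward from the youngest matching store in this core's own buffer.
        for a, v in reversed(self.store_buffer):
            if a == addr:
                return v
        # Otherwise read memory now, ahead of any older buffered stores.
        return self.memory[addr]

    def drain_one(self):
        # Commit the oldest buffered store; FIFO order keeps stores in order.
        if self.store_buffer:
            addr, value = self.store_buffer.popleft()
            self.memory[addr] = value


def store_buffering_litmus():
    memory = {"x": 0, "y": 0}
    cpu0, cpu1 = Cpu(memory), Cpu(memory)
    cpu0.store("x", 1)     # cpu0: x = 1 (buffered)
    cpu1.store("y", 1)     # cpu1: y = 1 (buffered)
    r0 = cpu0.load("y")    # cpu0 reads y before its own x = 1 reaches memory
    r1 = cpu1.load("x")    # cpu1 reads x before its own y = 1 reaches memory
    cpu0.drain_one()
    cpu1.drain_one()
    return r0, r1, memory


print(store_buffering_litmus())
```

In this sketch both loads return 0 even though each core issued its store first. As far as I can tell, that is what "a later load passes an earlier store" means: the load takes effect in memory order before the older store does, without ever sitting in the store buffer itself.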
Can anyone shed some more light on this?
EDIT:
Let's look at a paragraph from "A Primer on Memory Consistency and Cache Coherence":
To understand the implementation of atomic RMWs in TSO, we consider the RMW as a load immediately followed by a store. The load part of the RMW cannot pass earlier loads due to TSO’s ordering rules. It might at first appear that the load part of the RMW could pass earlier stores in the write buffer, but this is not legal. If the load part of the RMW passes an earlier store, then the store part of the RMW would also have to pass the earlier store because the RMW is an atomic pair. But because stores are not allowed to pass each other in TSO, the load part of the RMW cannot pass an earlier store either
More specifically,
The load part of the RMW cannot pass earlier loads due to TSO’s ordering rules. It might at first appear that the load part of the RMW could pass earlier stores in the write buffer
So they are referring to loads and stores crossing each other in the write buffer (which I assume is the same thing as the store buffer?).
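My reading of the quoted paragraph is that an RMW effectively has to wait until the write buffer is empty. Here is a minimal Python sketch of that reading, under the same toy-model assumptions (the `Cpu` class and the `fetch_add` name are hypothetical, and real hardware would also need to hold the cache line exclusively for the duration):

```python
from collections import deque


class Cpu:
    """Toy TSO core with a FIFO write buffer and a draining RMW."""

    def __init__(self, memory):
        self.memory = memory          # shared dict: address -> value
        self.write_buffer = deque()   # pending (address, value), oldest first

    def store(self, addr, value):
        # Ordinary stores are buffered and become visible later.
        self.write_buffer.append((addr, value))

    def drain(self):
        # Commit all buffered stores to memory in FIFO order.
        while self.write_buffer:
            addr, value = self.write_buffer.popleft()
            self.memory[addr] = value

    def fetch_add(self, addr, delta):
        # Drain first, so the load part cannot pass any earlier store.
        self.drain()
        # Load and store happen back to back, with nothing in between.
        old = self.memory[addr]
        self.memory[addr] = old + delta
        return old


memory = {"x": 0, "lock": 0}
cpu = Cpu(memory)
cpu.store("x", 1)                 # buffered, not yet in memory
old = cpu.fetch_add("lock", 1)    # forces x = 1 out to memory first
print(old, memory)
```

In this model the earlier store to `x` is always in memory by the time the RMW reads `lock`, so neither half of the RMW passes it.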
Thanks