Here is the postbox (mailbox) code for passing data between two ARM cores, taken directly from the ARM Cortex-A Series Programmer's Guide.
Core A:
STR R0, [Msg] @ write some new data into postbox
STR R1, [Flag] @ new data is ready to read
Core B:
Poll_loop:
LDR R1, [Flag]
CMP R1,#0 @ is the flag set yet?
BEQ Poll_loop
LDR R0, [Msg] @ read new data.
To enforce the required ordering, the document says we need to insert not one but two memory barriers (DMB) into the code:
Core A:
STR R0, [Msg] @ write some new data into postbox
DMB
STR R1, [Flag] @ new data is ready to read
Core B:
Poll_loop:
LDR R1, [Flag]
CMP R1,#0 @ is the flag set yet?
BEQ Poll_loop
DMB
LDR R0, [Msg] @ read new data.
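For comparison, here is how I would sketch the same two-barrier postbox in C11. This is only my own rough mapping, not something from the guide: the function names (post_message, wait_for_message, postbox_smoke_check) are made up for illustration, and I am assuming that on ARMv7 a compiler lowers each atomic_thread_fence to a DMB.

```c
#include <assert.h>
#include <stdatomic.h>

static int Msg;          /* plain payload, standing in for [Msg] */
static atomic_int Flag;  /* zero-initialized: no message posted yet */

/* Core A side (hypothetical name): write the data, then raise the flag. */
void post_message(int value)
{
    Msg = value;                                /* STR R0, [Msg]  */
    atomic_thread_fence(memory_order_release);  /* assumed to play the role of Core A's DMB */
    atomic_store_explicit(&Flag, 1, memory_order_relaxed);  /* STR R1, [Flag] */
}

/* Core B side (hypothetical name): spin on the flag, then read the data. */
int wait_for_message(void)
{
    while (atomic_load_explicit(&Flag, memory_order_relaxed) == 0) {
        /* Poll_loop: keep re-reading [Flag] */
    }
    atomic_thread_fence(memory_order_acquire);  /* assumed to play the role of Core B's DMB */
    return Msg;                                 /* LDR R0, [Msg]  */
}

/* Single-threaded smoke check: it exercises the API only, not the
 * cross-core ordering that the barriers are there for. */
void postbox_smoke_check(void)
{
    post_message(42);
    assert(wait_for_message() == 42);
}
```

In real use, post_message and wait_for_message would run on different cores (or threads). The release fence pairs with the acquire fence the same way the two DMBs are paired in the listing above.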
I understand the first DMB, on Core A: it prevents compiler reordering and also ensures that the store to [Msg] is observed by the system before the store to [Flag]. Below is the definition of DMB from the same document.
Data Memory Barrier (DMB)
This instruction ensures that all memory accesses in program order before the barrier are observed in the system before any explicit memory accesses that appear in program order after the barrier. It does not affect the ordering of any other instructions executing on the core, or of instruction fetches.
However, I am not sure why the DMB on Core B is needed. The document says:
Core B requires a DMB before the LDR R0, [Msg] to be sure that the message is not read until the flag is set.
If the DMB on Core A makes the store to [Msg] observable to the system, then we should not need the DMB on the second core. My guess is that the compiler might reorder the reads of [Flag] and [Msg] on Core B (though I do not understand why it would, since the read of [Msg] depends on [Flag]).
If that is the case, a compiler barrier (asm volatile("" ::: "memory")) instead of a DMB should be enough. Am I missing something here?
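To make the two alternatives I am comparing concrete, here is a minimal C sketch of Core B's side written both ways. The macro and function names are hypothetical, and I am assuming GCC or Clang. The compiler barrier emits no instruction at all, while the hardware barrier emits a DMB (on non-ARM targets the sketch falls back to the __sync_synchronize() builtin so that it still compiles).

```c
#include <assert.h>

/* Compiler-only barrier: emits no instruction, it only stops the compiler
 * from moving memory accesses across this point. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Hardware barrier: a real DMB on ARMv7-A and later (assumption: the target
 * supports "dmb ish"); elsewhere, a generic full barrier builtin. */
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
#define HW_BARRIER() __asm__ __volatile__("dmb ish" ::: "memory")
#else
#define HW_BARRIER() __sync_synchronize()
#endif

static volatile int Flag;  /* stands in for [Flag] */
static volatile int Msg;   /* stands in for [Msg]  */

/* Variant I am asking about: only the compiler is constrained. */
int read_with_compiler_barrier(void)
{
    while (Flag == 0) {
        /* Poll_loop */
    }
    COMPILER_BARRIER();
    return Msg;
}

/* Variant from the guide: the core itself is also constrained by a DMB. */
int read_with_dmb(void)
{
    while (Flag == 0) {
        /* Poll_loop */
    }
    HW_BARRIER();
    return Msg;
}
```

Both variants prevent the compiler from hoisting the read of Msg above the loop. Whether the first one is also enough on the hardware side is exactly what I am asking.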