5
votes

When using persistent memory such as Intel Optane DCPMM, is it possible to see a partial result after reboot if the system crashes (e.g. a power outage) during execution of a movnt instruction?

For:

  • 4- or 8-byte movnti, which x86 guarantees atomic for other purposes
  • 16-byte SSE movntdq / movntps which aren't guaranteed atomic but which in practice probably are on CPUs supporting persistent memory.
  • 32-byte AVX vmovntdq / vmovntps
  • 64-byte AVX512 vmovntdq / vmovntps full-line stores
  • bonus question: MOVDIR64B, which has guaranteed 64-byte write atomicity, on future CPUs that support it and DC-PM, e.g. Sapphire Rapids Xeon / Tiger Lake / Tremont.

movntpd is assumed to be identical to movntps.


1
@Peter Cordes Thank you very much for your expert editing and answers! - dangzzz
Despite clflush itself apparently being atomic, it's still true that it doesn't give any guarantee of gluing together two separate stores into one atomic persistence; one could still commit to persistence before clflush, and then the system crashes. So my commentary on that linked question (which this is a followup to) is still somewhat accurate and relevant: it doesn't work like that when the goal is to atomically write stuff to persistent storage. - Peter Cordes
@Peter Cordes Do you mean that the former write may become persistent before clflush because of cache line eviction or something else? Two separate stores can't be made persistent atomically, but the order of their persistence will not change, right? - dangzzz
Oh right, I forgot ordering, not atomicity, was your real concern. If split or out-of-order write-backs within a line are impossible (whether by clflush or other means, e.g. interrupt after both stores but before clflush, leading to eviction), then yeah global observability order should apply to persistence order for writes within the same cache line. That's what I expected would be the case, but documentation left open the possibility of reordering. Fortunately Hadi got confirmation that reality matches expectations. - Peter Cordes

1 Answer

4
votes

On x86, the atomicity guarantees for persistence are the same as those for global observability. This means that the following operations are persistently atomic:

  • A store uop that doesn't cross an 8-byte boundary to a location of any effective memory type, and
  • MOVDIR64B.

In addition, the following operations are persistently atomic:

  • A cache line flush (CLFLUSH or CLFLUSHOPT),
  • A cache line writeback (CLWB),
  • A non-architectural cache line eviction, and
  • A full write-combining buffer flush on Intel processors. The presence and size of WCBs and the causes of a flush are implementation-specific. See: Ordering of Intel non-temporal stores to the same cache line.

There is no architectural persistent atomicity guarantee for everything else, including 64-byte AVX512 vmovntdq / vmovntps full-line stores.
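
As an illustration of relying only on the 8-byte guarantee, here is a minimal C sketch (not part of the correspondence mentioned below). It writes a multi-word payload with movnti stores and then publishes it with a single aligned 8-byte commit word, so a crash can leave the payload torn but never a torn commit word. The names pm_record, pm_record_publish, pm_record_read, nt_store_i64 and store_fence are hypothetical. The sketch assumes the record lives in a persistent-memory mapping and that SFENCE after non-temporal stores is enough for them to reach the persistence domain on an ADR platform; both assumptions should be checked against your platform's documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
// 8-byte non-temporal store (movnti). The assert checks the store cannot
// cross an 8-byte boundary, which the atomicity guarantee depends on.
static void nt_store_i64(long long *dst, long long v) {
    assert(((uintptr_t)dst & 7) == 0);
    _mm_stream_si64(dst, v);
}
static void store_fence(void) { _mm_sfence(); }
#else
// Fallback so the sketch compiles on other targets.
// It has no persistence or non-temporal semantics at all.
static void nt_store_i64(long long *dst, long long v) {
    *(volatile long long *)dst = v;
}
static void store_fence(void) {}
#endif

#define PM_COMMITTED 1LL

// Hypothetical record layout: two payload words guarded by a commit word.
struct pm_record {
    long long lo;      // payload word 0
    long long hi;      // payload word 1
    long long commit;  // 0 = invalid, PM_COMMITTED = payload is complete
};

static void pm_record_publish(struct pm_record *r, long long lo, long long hi) {
    nt_store_i64(&r->commit, 0);  // invalidate before touching the payload
    store_fence();
    nt_store_i64(&r->lo, lo);     // each word is atomic on its own,
    nt_store_i64(&r->hi, hi);     // but the pair is not
    store_fence();                // order the payload before the commit word
    nt_store_i64(&r->commit, PM_COMMITTED);
    store_fence();
}

// Recovery-side check: trust the payload only if the commit word is set.
static bool pm_record_read(const struct pm_record *r, long long *lo, long long *hi) {
    if (r->commit != PM_COMMITTED)
        return false;
    *lo = r->lo;
    *hi = r->hi;
    return true;
}
```

After a crash, recovery code would call pm_record_read and treat a false return as "no valid record". Using one 64-byte AVX512 store for the whole record would not give the same assurance, since it has no architectural persistent atomicity guarantee.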

These guarantees apply to Asynchronous DRAM Refresh (ADR) platforms and Enhanced Asynchronous DRAM Refresh (eADR) platforms. (On eADR, the cache hierarchy is in the persistence domain. See: Build Persistent Memory Applications with Reliability Availability and Serviceability.)

This answer is based on my private correspondence with Andy Rudoff (Intel).