A simple question: is a LOCK CMPXCHG possible on non-cached memory, i.e. pages marked as non-cached in the page table?
1 Answer
The content of this answer closely resembles that of this Dr. Dobb's article, particularly its "Locking" section, which I consulted to understand locking on QuickPath Interconnect (QPI) enabled systems.
As such this post has been marked as a "community wiki".
Yes, it's possible.
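For reference, here is a minimal sketch of how a locked compare-and-exchange is usually reached from C. The helper name `try_claim` is hypothetical. On x86-64, GCC and Clang typically compile `atomic_compare_exchange_strong` to a LOCK CMPXCHG; the instruction is the same whatever the memory type, since cacheability comes from the page attributes of the target address. Mapping a page as non-cached is privileged work and is not shown here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically store `desired` into *target, but only
 * if *target still holds `expected`. Returns true when the swap happened.
 * Whether the hardware uses a cache lock or a bus/system lock depends on
 * the memory type of the page behind `target`, not on this code. */
bool try_claim(_Atomic uint32_t *target, uint32_t expected, uint32_t desired)
{
    /* `expected` is a local copy, so the caller's value is not overwritten
     * when the comparison fails. */
    return atomic_compare_exchange_strong(target, &expected, desired);
}
```

A caller would use it as `if (try_claim(&flag, 0, 1)) { ... }`, where `flag` is an `_Atomic uint32_t` located in whatever memory is being tested.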
The 8086 had no cache but was able to perform atomic operations.
This was accomplished with the #lock signal on the FSB. While this signal was asserted, no other agent could start a new transaction; only the locking agent's transactions could proceed (and sometimes not even those), thereby quiescing the system.
With the introduction of caching, the need for a bus lock was reduced. When the target of a locked operation resides in the processor's cache, the processor can perform the operation within its cache and simply delay any snoop requests from other agents for the duration of the lock.
However, the legacy bus lock was preserved for backwards compatibility and because the guarded variable could span two cache lines.
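To make the "spans two cache lines" case concrete, here is a small sketch that checks whether an operand crosses a line boundary. The function name and the 64-byte line size are assumptions for illustration; 64 bytes is typical of current x86 parts but is not guaranteed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: true if the `size` bytes starting at `addr` do not
 * fit inside a single cache line, i.e. the first and last byte of the
 * operand fall in different lines. */
bool spans_two_cache_lines(uintptr_t addr, size_t size)
{
    if (size == 0) {
        return false;
    }
    return (addr / CACHE_LINE_SIZE) != ((addr + size - 1) / CACHE_LINE_SIZE);
}
```

An 8-byte operand at offset 60 within a line covers offsets 60 to 67 and therefore crosses into the next line, while the same operand at offset 56 fits entirely in one line.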
When the FSB was dropped in favour of QPI (think of the abandonment of the hub architecture, and of multi-socket systems), the #lock signal was dropped too.
Now, one of the QPI agents is designated as the Quiesce Master (QM). When a processor wants a lock, it asks the QM, which in turn tells the other agents, including DMA agents, to stop issuing new requests.
Once every agent has acknowledged, the QM informs the lock requester that the system is locked. The atomic operation is then carried out and, upon completion, an unlock request is sent to the QM. Finally, the QM informs the other agents that new transactions are allowed again.
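The handshake described above can be sketched as a toy, single-threaded model. This is only an illustration of the sequence of steps, not of how QPI hardware is implemented; every type and function name here (`quiesce_master`, `qm_lock`, and so on) is hypothetical, and the acknowledgement from each agent is modelled as simply setting a flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_AGENTS 8

/* Toy model of the Quiesce Master's view of the system. */
typedef struct {
    bool quiesced[MAX_AGENTS]; /* true: agent has acknowledged the stop request */
    size_t agent_count;
    bool locked;
    size_t holder;             /* valid only while `locked` is true */
} quiesce_master;

void qm_init(quiesce_master *qm, size_t agent_count)
{
    qm->agent_count = agent_count > MAX_AGENTS ? MAX_AGENTS : agent_count;
    qm->locked = false;
    qm->holder = 0;
    for (size_t i = 0; i < MAX_AGENTS; i++) {
        qm->quiesced[i] = false;
    }
}

/* Lock request: ask every other agent to stop, collect the
 * acknowledgements, then report "system locked" to the requester. */
bool qm_lock(quiesce_master *qm, size_t requester)
{
    if (qm->locked || requester >= qm->agent_count) {
        return false;
    }
    for (size_t i = 0; i < qm->agent_count; i++) {
        if (i != requester) {
            qm->quiesced[i] = true; /* stop request sent and acknowledged */
        }
    }
    qm->locked = true;
    qm->holder = requester;
    return true;
}

/* An agent may start new transactions only while it is not quiesced. */
bool agent_may_issue(const quiesce_master *qm, size_t agent)
{
    return agent < qm->agent_count && !qm->quiesced[agent];
}

/* Unlock request: tell every agent that new transactions are allowed. */
bool qm_unlock(quiesce_master *qm, size_t requester)
{
    if (!qm->locked || qm->holder != requester) {
        return false;
    }
    for (size_t i = 0; i < qm->agent_count; i++) {
        qm->quiesced[i] = false;
    }
    qm->locked = false;
    return true;
}
```

In this model, between a successful `qm_lock(&qm, 0)` and the matching `qm_unlock(&qm, 0)`, `agent_may_issue` returns true only for agent 0, which is where the atomic operation would take place.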
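In this way, the mechanisms for locking the entire memory subsystem are still present and functional in modern designs, which is what allows a locked instruction to work when it cannot be handled within the cache, as with non-cached memory.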