Why is compare-and-swap (CAS) algorithm a good choice for lock-free synchronization?

Question

CAS belongs to the read-modify-write (RMW) family, a set of algorithms that allow you to perform complex transactions atomically.

Specifically, Wikipedia says that

CAS is used to implement synchronization primitives like semaphores and mutexes, as well as more sophisticated lock-free and wait-free algorithms. [...] CAS can implement more of these algorithms than atomic read, write, or fetch-and-add, and assuming a fairly large amount of memory, [...] it can implement all of them.

https://en.wikipedia.org/wiki/Compare-and-swap#Overview

So it seems that a CAS algorithm is the "one size fits all" product of its category. Why is that so? What do other RMW algorithms lack of? If CAS is the best tool, what are the other algorithms for?

CAS is typically implemented on the processor itself, so the processor guarantees atomicity. Which makes it rather efficient, but easier as well, since programmers do not have to care how it is implemented, as long as the processor does a proper job. — Willem Van Onsem

Eric Eric · Accepted Answer · 2019-05-30T17:36:00

CAS belongs to a class of objects called "consensus objects", each of which has a consensus number; the maximum number of threads for which a given consensus object can solve the consensus problem.

The consensus problem goes like this: for some number of threads n, propose some value p and decide on one of the proposed values d such that n threads agrees on d.

CAS is the most "powerful" consensus object because its consensus number is infinite. That is, CAS can be used to solve the consensus problem among a theoretically infinite number of threads. It even does it in a wait-free manner.

This can't be done with atomic registers, test-and-set, fetch-add, stacks since they all have finite consensus numbers. There are proofs for these consensus numbers but that is another story...

The significance of all of this is that it can be proven that there exists a wait-free implementation of an object for n threads using a consensus object with a consensus number of at least n. CAS is especially powerful because you can use it to implement wait-free objects for an arbitrary number of threads.

As to why other RMW operations are useful? Some problems in multiprocessing don't really involve solving the consensus problem for some arbitrary number of threads. For example, mutual exclusion can be solved using less powerful RMW operations like test-and-set (a simple TAS lock), fetch-add (ticket lock) or atomic swap (CLH lock).

More information on consensus for shared memory at Wikipedia Consensus (computer_science) section: In_shared-memory_systems

Also, there's a whole chapter on consensus and universal constructions in Herlihy and Shavit's The Art of Multiprocessor Programming (WorldCat) that I highly recommend.

Why is compare-and-swap (CAS) algorithm a good choice for lock-free synchronization?

1 Answers