0
votes

In a reliable distributed system, leader election is mandatory for a write to succeed, and I understand that this requires following a consensus algorithm such as Paxos.

However, why is leader election (and thus consensus) not required for a read request, e.g. in ZooKeeper?

Am I missing something?


1 Answer

2
votes

ZooKeeper reads are not linearizable, so they need no consensus round. They are instead sequentially consistent, which permits serving a read locally from the node the client is connected to.

The same is true of Raft, for example. You can perform local reads and get at most sequential consistency (provided you coordinate with the node so that you never read data older than what you have already seen). If you want a linearizable read, you must "commit" a read operation (that is, have the system agree on which writes are committed prior to your read), which again requires consensus.
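To make the difference concrete, here is a minimal toy sketch in Python. It is not ZooKeeper's or Raft's actual code; `Replica`, `Session`, and `StaleReplica` are names I made up for illustration. A session remembers the newest write index it has seen and refuses to read from a replica that is behind it. That needs no consensus round, yet a different client can still be handed an older value.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: key-value state plus the index of the last applied write."""
    applied_index: int = 0
    data: dict = field(default_factory=dict)

    def apply(self, index, key, value):
        self.data[key] = value
        self.applied_index = index


class StaleReplica(Exception):
    """Raised when a replica is behind what the session has already seen."""


@dataclass
class Session:
    """Client session that remembers the newest index it has observed."""
    last_seen: int = 0

    def local_read(self, replica, key):
        # No consensus round: only refuse replicas older than our own history.
        if replica.applied_index < self.last_seen:
            raise StaleReplica(
                f"replica at {replica.applied_index}, session at {self.last_seen}"
            )
        self.last_seen = replica.applied_index
        return replica.data.get(key)


leader, follower = Replica(), Replica()
leader.apply(1, "x", "a")
follower.apply(1, "x", "a")
leader.apply(2, "x", "b")  # committed, but not yet applied on the follower

# Another client reads the old value after the new one was committed:
# allowed under sequential consistency, forbidden under linearizability.
other_session = Session()
first = other_session.local_read(follower, "x")

# A single session never goes backwards in time.
session = Session()
latest = session.local_read(leader, "x")
try:
    session.local_read(follower, "x")
    went_backwards = True
except StaleReplica:
    went_backwards = False
```

Here `first` is the stale value `"a"`, while the second session sees `"b"` and is then refused by the lagging follower. Turning the first read into a linearizable one would require the cluster to agree on the commit point before answering, which is exactly the consensus step that local reads skip.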

ZooKeeper is not linearizable; see e.g. https://github.com/jepsen-io/jepsen/issues/399. It is a common myth that one can get a linearizable read by issuing a sync followed by a read (and, prior to this edit, I repeated the myth here), but to quote the ZooKeeper docs:

There is a caveat to the use of sync, which is fairly technical and deeply entwined with ZooKeeper internals. (Feel free to skip it.) Because ZooKeeper is supposed to serve reads fast and scale for read-dominated workloads, the implementation of sync has been simplified and it doesn't really traverse the execution pipeline as a regular update operation, like create, setData, or delete. It simply reaches the leader, and the leader queues a response back to the follower that sent it. There is a small chance that the leader thinks that it is the leader l, but doesn't have support from a quorum any longer because the quorum now supports a different leader, lʹ. In this case, the leader l might not have all updates that have been processed, and the sync call might not be able to honor its guarantee.
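The scenario in that quote can be sketched as a toy model. This is an illustration under my own assumptions, not ZooKeeper's implementation: `Node`, `handle_sync`, and `quorum_checked_sync` are hypothetical names, and the epoch comparison is a stand-in for whatever a real quorum check would do.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    epoch: int            # leadership term this node currently follows
    last_committed: int   # index of the newest write this node knows about

    def handle_sync(self):
        # Simplified sync: answer from local state, without confirming
        # that a quorum still supports this node as leader.
        return self.last_committed


def quorum_checked_sync(leader, cluster):
    """Hypothetical stricter sync: answer only if a strict majority of the
    cluster is still in the leader's epoch; otherwise return None."""
    supporters = sum(1 for node in cluster if node.epoch == leader.epoch)
    if supporters * 2 <= len(cluster):
        return None
    return leader.last_committed


# Old leader l and one follower are cut off in epoch 1; the other three
# nodes have elected l' in epoch 2 and committed one more write.
old_leader = Node("l", epoch=1, last_committed=2)
lagging_follower = Node("f2", epoch=1, last_committed=2)
new_leader = Node("l_prime", epoch=2, last_committed=3)
cluster = [
    old_leader,
    lagging_follower,
    new_leader,
    Node("f1", epoch=2, last_committed=3),
    Node("f3", epoch=2, last_committed=3),
]

synced_to = old_leader.handle_sync()
newest = max(node.last_committed for node in cluster)
sync_was_stale = synced_to < newest

strict_old = quorum_checked_sync(old_leader, cluster)
strict_new = quorum_checked_sync(new_leader, cluster)
```

In this model the simplified sync reports index 2 although index 3 is already committed, so a read issued after it can miss an update. The stricter variant refuses to answer from the deposed leader, but it costs a round trip to a quorum, which is the consensus-style coordination the fast read path is meant to avoid.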