Backgound:
In section 3, named Implementing a State Machine, of Lamport's paper Paxos Made Simple, Multi-Paxos is described. Multi-Paxos is used in Google Paxos Made Live. (Multi-Paxos is used in Apache ZooKeeper). In Multi-Paxos, gaps can appear:
In general, suppose a leader can get
α
commands ahead--that is, it can propose commandsi + 1
throughi + α
commands after commands 1 throughi
are chosen. A gap of up toα - 1
commands could then arise.
Now consider the following scenario:
The whole system uses master-slave architecture. Only the master serves client commands. Master and slaves reach consensus on the sequence of commands via Multi-Paxos. The master is the leader in Multi-Paxos instances. Assume now the master and two of its slaves have the states (commands have been chosen) shown in the following figure:
.
Note that, there are more than one gaps in the master state. Due to asynchrony, the two slaves lag behind. At this time, the master fails.
Problem:
What should the slaves do after they have detected the failure of the master (for example, by heartbeat mechanism)?
In particular, how to handle with the gaps and the missing commands with respect to that of the old master?
Update about Zab:
As @sbridges has pointed out, ZooKeeper uses Zab instead of Paxos. To quote,
Zab is primarily designed for primary-backup (i.e., master-slave) systems, like ZooKeeper, rather than for state machine replication.
It seems that Zab is closely related to my problems listed above. According to the short overview paper of Zab, Zab protocol consists of two modes: recovery and broadcast. In recovery mode, two specific guarantees are made: never forgetting committed messages and letting go of messages that are skipped. My confusion about Zab is:
- In recovery mode does Zab also suffer from the gaps problem? If so, what does Zab do?