24
votes

Phase 2. (a) If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.

As mentioned in the paper,

A proposer issues a proposal by sending, to some set of acceptors, a request that the proposal be accepted. (This need not be the same set of acceptors that responded to the initial requests.)

But as I understand it, if we change Phase 2(a) to:

If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to an arbitrary set of majority acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals.

the algorithm will fail. Here is an example. Consider 3 acceptors in total: A, B and C. We use X(n:v,m) to denote the status of acceptor X: n:v is the highest-numbered proposal accepted by X, where n is the proposal number and v is the value of the proposal, and m is the number of the highest-numbered prepare request that X has ever responded to.

  1. P1 sends 'prepare 1' to AB
  2. Both A and B respond to P1 with a promise not to accept any request numbered less than 1. Now the status is: A(-:-,1) B(-:-,1) C(-:-,-)
  3. P1 receives the responses, then gets stuck and runs very slowly
  4. P2 sends 'prepare 100' to AB
  5. Both A and B respond to P2 with a promise not to accept any request numbered less than 100. Now the status is: A(-:-,100) B(-:-,100) C(-:-,-)
  6. P2 receives the responses, chooses a value b and sends 'accept 100:b' to BC
  7. BC receive and accept the accept request, the status is: A(-:-,100) B(100:b,100) C(100:b,-). Note that proposal 100:b has been chosen.
  8. P1 resumes, chooses value a and sends 'accept 1:a' to BC
  9. B doesn't accept it, but C does because C has never promised anything. Status is: A(-:-,100) B(100:b,100) C(1:a,-). The chosen proposal is abandoned, so Paxos fails.
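
The nine steps above can be replayed with a minimal Python sketch. `NaiveAcceptor` is a hypothetical class written only for this illustration: it assumes that accepting a proposal does not raise the promise and that the most recently accepted proposal overwrites the stored one, which is the reading used in the example.

```python
class NaiveAcceptor:
    """Hypothetical acceptor: an accept checks only the promise, never
    raises it, and the stored proposal is simply overwritten."""

    def __init__(self):
        self.promised = None   # m: highest prepare number responded to
        self.accepted = None   # (n, v): stored accepted proposal

    def prepare(self, n):
        # Promise only if n is higher than anything promised so far.
        if self.promised is None or n > self.promised:
            self.promised = n
            return True
        return False

    def accept(self, n, v):
        # Reject only when a higher-numbered prepare was answered.
        if self.promised is None or n >= self.promised:
            self.accepted = (n, v)
            return True
        return False


a, b, c = NaiveAcceptor(), NaiveAcceptor(), NaiveAcceptor()
for acc in (a, b):
    acc.prepare(1)                                 # steps 1-2
for acc in (a, b):
    acc.prepare(100)                               # steps 4-5
for acc in (b, c):
    acc.accept(100, "b")                           # steps 6-7
results = [acc.accept(1, "a") for acc in (b, c)]   # steps 8-9
state = [acc.accepted for acc in (a, b, c)]
print(results, state)
```

Under these assumptions B rejects 1:a while C accepts it and overwrites 100:b, which is the final status in step 9.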

Did I miss anything here? Thanks.

Bravo! I'd give this question multiple up-votes if I could. – Michael Deardeuff
Yes, you are indeed correct, and yes, it is a bug in the multipaxos implementation I am working on. Thanks! – simbo1905
Your description is almost correct but it's based on an all-too-easy misunderstanding of the uniqueness property of proposal ids/round-numbers. Each proposal id/round number must be unique. Re-use is not allowed or the exact condition you describe is possible. See my answer for further details. – Rakis
@rakis - I disagree with the claim that proposal numbers have to be unique across proposers. If they are guaranteed to be unique, that ensures liveness of the Paxos algorithm, but it's not necessary that they are unique. Paxos maintains its safety property when they aren't unique; it just doesn't guarantee liveness. In a dynamic environment, where proposers come and go, it's not possible to statically assign unique proposal numbers to proposers. In such an environment, assigning these unique numbers itself becomes a problem of consensus. Also, assigning unseattlesparty
In step 9, C will not accept it, because n=1 is less than n=100. Both the accepted number n and the promised number m are considered, and a request is accepted only if it passes both checks. – ideawu

8 Answers

13
votes

You missed something in step 7. When C processes accept 100:b it sets its state to C(100:b,100). By accepting a value the node is also promising to not accept earlier values.
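
A minimal Python sketch of that rule follows. The `Acceptor` class and its field names are assumptions made for illustration, not code from any real implementation; the only point is that `accept` also raises the promised number.

```python
class Acceptor:
    """Hypothetical acceptor in which accepting n:v is also a promise
    not to accept anything numbered below n."""

    def __init__(self):
        self.promised = None   # highest number promised (or accepted)
        self.accepted = None   # (n, v) of the accepted proposal

    def accept(self, n, v):
        # Reject anything below the current promise.
        if self.promised is not None and n < self.promised:
            return False
        self.promised = n      # the accept doubles as a promise
        self.accepted = (n, v)
        return True


c = Acceptor()
first = c.accept(100, "b")    # step 7: C becomes C(100:b,100)
second = c.accept(1, "a")     # step 9: rejected, since 1 < 100
print(first, second, c.accepted, c.promised)
```

With this behaviour C ends step 7 as C(100:b,100), so the late 'accept 1:a' in step 9 is refused.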


Update. I've been thinking about this all month because I knew the above answer was not absolutely correct.

What's more, I looked through several proprietary and open-source Paxos implementations and they all had the bug submitted by the OP!

So here's the correct answer when viewed entirely from Paxos Made Simple:

If the proposer receives a response to its prepare requests (numbered n) from a majority of acceptors, then it sends an accept request to each of those acceptors for a proposal numbered n with a value v, where v is the value of the highest-numbered proposal among the responses, or is any value if the responses reported no proposals. (emphasis mine)

In other words, the proposer can only send Accept messages to acceptors that it has received Promises from for that ballot number.

So, is this a contradiction in Lamport's paper? Right now, I'm saying yes.


If you look at Lamport's Paxos proofs, he treats an accept as a promise, just as my original answer suggests. But this is not pointed out in Paxos Made Simple. In fact, it appears Lamport took great pains to specify that an accept was not a promise.

The problem arises when you combine the weaker portions of both variants, as the OP did and several implementations do. Then you run into this catastrophic bug.

6
votes

There is certainly no problem with broadcasting the accept request to all acceptors. You don't need to restrict it to just the ones that replied to the original prepare request. You've found a rare case of bad wording in Dr Lamport's writing.

There is, however, a bug in your counterexample. Firstly, the notation is defined like this:

X(n:v,m) to denote the status of acceptor X: proposal n:v is the largest numbered proposal accepted by X

But then in step 7 node C has state C(100:b,-), and then in step 9 it's changed to state C(1:a,-). This is not a valid transition: after accepting 1:a it should remain in state C(100:b,-) since 100:b is still the largest numbered proposal accepted by C.

Note that it's perfectly fine that it accepts 1:a after 100:b, essentially because the network is asynchronous so all messages can be delayed or reordered without breaking anything, so the rest of the world can't tell which proposal was accepted first anyway.

2
votes

NECROBUMP. Even with the weaker portion of both variants, there is no inconsistency. Let's look at step 9 in the example in the question:

"The state is A(-:-,100) B(100:b,100) C(1:a,-). The chosen proposal is abandon, Paxos fails"

However, at this point all we have is an indeterminate value, since no value is currently held by a majority of acceptors. (We must eventually choose 'b', since b was accepted by a majority in step 7.)

In order to continue the protocol, we need new ballots, and eventually some newer ballot will be accepted. That ballot must have the value 'b'.

PROOF: C will respond with (100, 'b') on any prepare requests since the highest-numbered ballot it accepted is (100, 'b') even if it last accepted a ballot (1, 'a'). B will also respond with (100, 'b'). Hence it is no longer possible to get a majority ballot with any value but 'b'.

Lamport's language is that an acceptor will respond with "The proposal with the highest number less than n that it has accepted, if any"

The accepted answer confuses "highest numbered" with "latest accepted," since the example shows that an acceptor may accept values in decreasing numbered order. In order to completely align with Lamport's protocol, it is necessary for C to remember that it accepted (100, 'b') even if the "latest" accept it has made is (1, 'a').
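
To make the proof concrete, here is a small Python sketch. The `Acceptor` class, the `choose_value` helper and the ballot number 200 are all hypothetical; the sketch assumes the acceptor remembers every proposal it accepted so that it can report the highest-numbered one below n, as in Lamport's wording.

```python
class Acceptor:
    """Hypothetical acceptor that may accept out of order but
    remembers every proposal it has accepted."""

    def __init__(self):
        self.promised = None
        self.accepted = {}     # proposal number -> value

    def prepare(self, n):
        if self.promised is not None and n <= self.promised:
            return None        # ignore stale prepare requests
        self.promised = n
        lower = [m for m in self.accepted if m < n]
        prior = (max(lower), self.accepted[max(lower)]) if lower else None
        return ("promise", prior)

    def accept(self, n, v):
        if self.promised is not None and n < self.promised:
            return False
        self.accepted[n] = v   # a later, lower-numbered accept erases nothing
        return True


def choose_value(responses, default):
    """Phase 2(a): value of the highest-numbered reported proposal."""
    priors = [r[1] for r in responses if r is not None and r[1] is not None]
    return max(priors, key=lambda p: p[0])[1] if priors else default


b, c = Acceptor(), Acceptor()
b.prepare(100)
b.accept(100, "b")
c.accept(100, "b")             # C never promised anything
c.accept(1, "a")               # accepted later, but lower numbered
responses = [b.prepare(200), c.prepare(200)]
value = choose_value(responses, default="z")
print(responses, value)
```

Both B and C report (100, 'b'), so a proposer running ballot 200 against this majority has to propose 'b'.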

(That being said I would not be surprised if many implementations don't do this correctly, and hence are vulnerable to this issue.)

2
votes

There is indeed an ambiguity in the paper, which is why the TLA+ specification, and not the paper, should be used for implementing the algorithm.

When accepting a value, an acceptor must once again update its state, namely the most recently promised ballot. This is clear from the Paxos TLA+ specification: check out Phase 2b, in which the acceptor updates maxBal, and compare it with Phase 1b, where it does the same.

Leslie Lamport handles this question in his recent lecture, where he explains that this is done specifically to allow the set of acceptors to be different from the set of nodes promising the ballot.

0
votes

C can't accept the proposal as it hasn't gone through Phase 1. In other words, for a value to be accepted by an acceptor, the acceptor has to move through both phases of the protocol.

0
votes

if by accepting a value the node is also promising to not accept earlier values, the algorithm is correct, but in the paper Lamport didn't mention this requirement, right?

The above condition is not required. Let's say the highest ballot an acceptor has promised is X. Let's say the incoming accept message has ballot number Y. If Y < X, we know that Y has to be rejected. If Y > X, this means that the acceptor hasn't received a prepare request for Y. This means, we have received an invalid paxos message. In this case, the accept message for Y should be dropped.

The only exception to this is when Y == 0. In this case, issuing a prepare with ballot number 0 doesn't make sense as ballot numbers below 0 are invalid. So, phase 1 can be skipped for ballot 0 and a proposer can directly go to Phase 2. In this case, i.e. when Y == 0, the acceptor can accept the value only if it hasn't accepted a value. This is the same as what you are proposing above, but it is required only in the optimized version of Paxos where Phase 1 can be skipped for Y == 0.

In other words, the only time an acceptor accepts a value is when Y == X. The only exception is when Y == 0. In that case, the acceptor can accept the value only if it hasn't accepted a value.
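
A rough Python sketch of this stricter rule follows. `StrictAcceptor` is a hypothetical name, and the additional check that nothing has been promised before accepting ballot 0 is a conservative assumption of the sketch, not something stated above.

```python
class StrictAcceptor:
    """Hypothetical acceptor: accepts ballot Y only when Y equals the
    promised ballot X, with a special case for ballot 0."""

    def __init__(self):
        self.promised = None   # X: highest ballot promised
        self.accepted = None   # (ballot, value)

    def prepare(self, x):
        if self.promised is None or x > self.promised:
            self.promised = x
            return True
        return False

    def accept(self, y, v):
        if y == 0:
            # Phase 1 skipped: allowed only on a completely fresh acceptor.
            # (The promised check is an assumption of this sketch.)
            ok = self.accepted is None and self.promised is None
        else:
            ok = self.promised is not None and y == self.promised
        if ok:
            self.accepted = (y, v)
        return ok


b, c, fresh = StrictAcceptor(), StrictAcceptor(), StrictAcceptor()
b.prepare(100)
outcomes = [
    b.accept(100, "b"),    # Y == X: accepted
    b.accept(1, "a"),      # Y < X: rejected
    c.accept(100, "b"),    # C never saw 'prepare 100': dropped
    fresh.accept(0, "z"),  # ballot 0 on a fresh acceptor: accepted
    fresh.accept(0, "y"),  # already accepted a value: rejected
]
print(outcomes)
```

Under this rule C would already have dropped 'accept 100:b' in step 7 of the question, because it never received 'prepare 100'.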

0
votes

I agree with most of Ben Braun's answer.

It's fine for C to accept (1, a); it doesn't change the chosen value. Let's say C accepted (1, a), and we take a look at the accept history from the perspective of a learner.

(100, b) accepted by B and C
(1, a) is accepted by C

(100, b) is chosen since it is accepted by majority of acceptors.
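
The learner's view can be sketched in a few lines of Python. The `chosen` function and the tuple layout of the history are assumptions made for this illustration.

```python
from collections import defaultdict


def chosen(history, n_acceptors):
    """Return every proposal (n, v) accepted by a majority of acceptors.

    history is a list of (acceptor, n, v) accept events, in any order.
    """
    voters = defaultdict(set)
    for acceptor, n, v in history:
        voters[(n, v)].add(acceptor)
    majority = n_acceptors // 2 + 1
    return [p for p, who in voters.items() if len(who) >= majority]


history = [("B", 100, "b"), ("C", 100, "b"), ("C", 1, "a")]
result = chosen(history, n_acceptors=3)
print(result)
```

Because the learner counts accept events per proposal rather than reading each acceptor's latest state, C's later accept of (1, a) does not affect the outcome.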

At this point the protocol doesn't need to continue if the learner has the complete accept history, unless learners fail or messages to learners are lost. This is where I disagree with Ben Braun's answer.

But the acceptors should keep the accepted proposal with the highest number, in case a new proposal is issued.

Update: I also agree with Dave Turner that in reality there is no reason to accept a lower-numbered proposal. The proposal number is like a logical clock, so it's safe to ignore older messages.

0
votes

The ambiguous sentence in Paxos Made Simple is "This need not be the same set of acceptors that responded to the initial requests".

Its actual meaning is "Hi, let me give you a hint here. The algorithm described in this paper can be optimized to eliminate the requirement that the prepare phase and the accept phase must have the same set of acceptors". Note that the algorithm described in Paxos Made Simple is a little different from the one described in The Part-Time Parliament.

However, some people read that sentence as: "The algorithm described in this paper does not require that the prepare phase and the accept phase must have the same set of acceptors".