How can the leader replicate logs when a follower recovers with a large term number in Raft?

Question

Say I have 3 nodes in a Raft cluster. Because of a network failure, node 3 is separated from the other 2 nodes. Node 3 then repeatedly becomes a candidate, sends RequestVote args to the others, and finds it cannot get enough votes. Each time, node 3 increments its term and tries to request votes again. So node 3's term ends up significantly larger than that of the others, which meanwhile commit logs 102, 103, 104, 105.

After a while, the network recovers, and node 3 rejoins the group and becomes a follower. However, due to its large term, it always rejects AppendEntries from the leader (node 1). How can node 3 recover logs 102 to 105?

Node 1 (leader):
* logs [101, 102, 103, 104, 105]
* term [1,   2,   2,   2,   2  ...]

Node 2 (follower):
* logs [101, 102, 103, 104, 105]
* term [1,   2,   2,   2,   2  ...]

Node 3:
* logs [101]
* term [1,   2,   3,   4,   5  ...]

kuujo · Accepted Answer · 2016-10-07T18:47:51

You have to look at how the leader will handle a response from that follower after it rejoins the cluster. When the leader receives an AppendEntries response indicating that another node has a higher term, the leader will update its own term and step down, which forces a new election. During the election protocol, all the candidates will also discover the higher term and update their own terms. Node 3 itself cannot win, because nodes only vote for a candidate whose log is at least as up-to-date as their own. So some node that still has all the committed entries will be elected leader and, having a term >= that of the partitioned follower, will replicate the committed entries to it.
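
To make the sequence concrete, here is a minimal in-memory sketch of the scenario from the question. It is not taken from any real Raft library: the `Node` class, its method names, and `run_election` are hypothetical, RPCs are plain method calls, and log handling is simplified (entries are represented only by their terms, and a follower overwrites everything after the matching prefix instead of deleting only conflicting entries).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:  # hypothetical toy node, not a real Raft implementation
    name: str
    current_term: int
    log_terms: List[int]          # log_terms[i] is the term of the entry at index i + 1
    role: str = "follower"
    voted_for: Optional[str] = None

    def observe_term(self, term: int) -> None:
        # Any node that sees a higher term adopts it and becomes a follower.
        if term > self.current_term:
            self.current_term = term
            self.role = "follower"
            self.voted_for = None

    def last_log(self) -> Tuple[int, int]:
        # (term of last entry, index of last entry), used for the up-to-date check.
        return (self.log_terms[-1] if self.log_terms else 0, len(self.log_terms))

    def handle_request_vote(self, term: int, candidate: str,
                            cand_last: Tuple[int, int]) -> Tuple[int, bool]:
        self.observe_term(term)
        if term < self.current_term:
            return self.current_term, False
        up_to_date = cand_last >= self.last_log()
        if up_to_date and self.voted_for in (None, candidate):
            self.voted_for = candidate
            return self.current_term, True
        return self.current_term, False

    def handle_append_entries(self, term: int, prev_index: int, prev_term: int,
                              entries: List[int]) -> Tuple[int, bool]:
        if term < self.current_term:
            return self.current_term, False      # stale leader: reject, report our term
        self.observe_term(term)
        self.role = "follower"
        if prev_index > len(self.log_terms):
            return self.current_term, False      # we are missing earlier entries
        if prev_index > 0 and self.log_terms[prev_index - 1] != prev_term:
            return self.current_term, False      # previous entry does not match
        # Simplification: overwrite everything after the matching prefix.
        self.log_terms = self.log_terms[:prev_index] + list(entries)
        return self.current_term, True


def run_election(candidate: Node, peers: List[Node]) -> bool:
    candidate.current_term += 1
    candidate.role = "candidate"
    candidate.voted_for = candidate.name
    votes = 1
    for peer in peers:
        term, granted = peer.handle_request_vote(
            candidate.current_term, candidate.name, candidate.last_log())
        candidate.observe_term(term)
        votes += granted
    if candidate.role == "candidate" and votes * 2 > len(peers) + 1:
        candidate.role = "leader"
    return candidate.role == "leader"


# State right after the partition heals (terms taken from the question).
n1 = Node("n1", 2, [1, 2, 2, 2, 2], role="leader")
n2 = Node("n2", 2, [1, 2, 2, 2, 2])
n3 = Node("n3", 5, [1])

# 1. Node 3 rejects the old leader's AppendEntries and reports its higher term.
reply_term, accepted = n3.handle_append_entries(n1.current_term, 1, 1, [2, 2, 2, 2])
# 2. The leader sees the higher term, adopts it, and steps down.
n1.observe_term(reply_term)
stepped_down = n1.role == "follower"
# 3. Node 3 cannot win: its log is behind, so nodes 1 and 2 refuse to vote for it.
n3_won = run_election(n3, [n1, n2])
# 4. Node 1 (full log) wins an election in an even higher term.
n1_won = run_election(n1, [n2, n3])
# 5. The new leader's AppendEntries is now accepted and node 3 catches up.
_, caught_up = n3.handle_append_entries(n1.current_term, 1, 1, n1.log_terms[1:])
print(n1.role, n1.current_term, n3.log_terms)
```

The point of the sketch is that the higher term is never "rolled back": the rest of the cluster moves up to it, and the up-to-date check in `handle_request_vote` is what keeps node 3 from becoming leader with its short log.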
