
I'm learning the Raft algorithm. My implementation runs into the following situation:

  1. a 1-leader-1-follower cluster is established;
  2. the leader is shut down;
  3. the follower gets no heartbeat, so it becomes a candidate;
  4. the candidate keeps sending RequestVote requests to the peer (already shut down), and they fail;
  5. the election times out without any leader being elected;
  6. the candidate starts another election, i.e. it repeats steps 4-6 ...

I don't see how to handle this situation in the Raft paper (maybe I missed something).

One idea is to check the granted votes in step 5 before starting a new election. Since a candidate votes for itself at the beginning of each election, this check would let the candidate become the new leader.

But I worry that this solution will break Raft, especially the initial startup, when all nodes are candidates.

Another idea is to treat a network error on a RequestVote request as "Vote Granted" (I still worry that this breaks something).

I know this situation could be caused by having only 2 nodes. However, even with 3 nodes (so a 1-leader-2-follower cluster is established), if 2 leaders are shut down consequently, the remaining follower may still behave like this.

Can you expand on 'two leaders shutdown consequently'? Do you mean both nodes die a horrible death? Raft is a 2F+1 system, meaning you need that many nodes to tolerate F node failures. – Michael Deardeuff

1 Answer


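What you are describing as a problem is actually a legitimate situation.

Raft cannot make progress when a majority of nodes is unavailable, and there is no way around this other than bringing a majority of nodes back online. With 2 nodes a majority is 2, so the surviving candidate's own vote is never enough to win an election.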
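
To make the majority rule concrete, here is a minimal sketch in Python. The function names (`quorum`, `max_failures`, `wins_election`) are made up for illustration and do not come from the Raft paper or any library. The one assumption that matters is that votes are counted against the full configured cluster size, not against the number of peers that happened to respond.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of votes that is a strict majority of the cluster."""
    return cluster_size // 2 + 1


def max_failures(cluster_size: int) -> int:
    """How many nodes can be down while a majority is still reachable."""
    return (cluster_size - 1) // 2


def wins_election(granted_votes: int, cluster_size: int) -> bool:
    """A candidate wins only with votes from a majority of the *configured*
    cluster. Unreachable peers count as "no vote", never as "granted"."""
    return granted_votes >= quorum(cluster_size)


if __name__ == "__main__":
    # A lone survivor only ever has its own vote (granted_votes == 1).
    for n in (2, 3, 5):
        print(f"nodes={n} quorum={quorum(n)} "
              f"tolerated_failures={max_failures(n)} "
              f"lone_survivor_wins={wins_election(1, n)}")
```

For 2 nodes the quorum is 2 and the tolerated failure count is 0, so the retry loop in the question is the expected behaviour. For 3 nodes the quorum is still 2, so one node may fail but not two. Counting a failed RequestVote as granted would let each side of a network partition reach `wins_election(2, 2)` on its own, which means two leaders in the same term.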