1
votes

I am trying to test how my software behaves during cluster failover, so I want to configure the simplest possible cluster: one master and two slaves. I have three files, 7000.conf through 7002.conf, with the following content:

port 7000
cluster-config-file nodes.7000.conf
appendfilename appendonly.7000.aof
dbfilename dump.7000.rdb
pidfile /var/run/redis_7000.pid

include cluster.conf

The content of cluster.conf:

cluster-enabled yes
appendonly yes
maxclients 100
daemonize yes
cluster-node-timeout 2000
cluster-slave-validity-factor 0

I then configured 7000 to serve all slots from 0 to 16383, with 7001 and 7002 as replicas of 7000:

XXX 127.0.0.1:7002 slave YYY 0 1511389011347 4 connected
YYY 127.0.0.1:7000 myself,master - 0 0 4 connected 0-16383
ZZZ 127.0.0.1:7001 slave YYY 0 1511389011246 4 connected

Then I try to get rid of 7000, either via the shutdown command or by killing the process. One of the slaves should promote itself to master, but neither does:

ZZZ 127.0.0.1:7001 slave YYY 0 0 3 connected
YYY 127.0.0.1:7000 master,fail? - 1511389104442 1511389103933 4 disconnected 0-16383
XXX 127.0.0.1:7002 myself,slave YYY  0 1511389116543 4 connected

I've waited for several minutes, but the slaves do not want to become master. If I force a slave to become master via cluster failover takeover, it is more than happy to do so (and if I restart the old master, it becomes a slave), but this never happens automatically.

I've tried adjusting cluster-node-timeout, but it does not help.

Am I doing something wrong? Redis version is 3.2.11.

2 Answers

3
votes

The issue is that a Redis Cluster needs at least 3 masters for automatic failover to work. It is the master nodes that watch each other and detect the failure, so with a single master in the cluster there is no process running that is able to detect that your one master is down. The minimum of three exists because, when any node goes down, a majority of the masters in the cluster must agree on it; with 3 masters, more than half of them are still around to reach a majority view after one fails.

The Redis Cluster tutorial mentions this in the following section: https://redis.io/topics/cluster-tutorial#creating-and-using-a-redis-cluster

"Note that the minimal cluster that works as expected requires to contain at least three master nodes."
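
The majority rule described above can be sketched as a small calculation. This is only an illustration of the arithmetic, not Redis code; the function name `can_authorize_failover` is hypothetical, and it assumes that every failed master is simply unreachable and cannot vote.

```python
def can_authorize_failover(total_masters: int, failed_masters: int) -> bool:
    """Return True if the surviving masters still form a strict majority.

    Hypothetical helper: it models only the "more than half of the masters
    must agree" rule, nothing else about the cluster.
    """
    reachable = total_masters - failed_masters
    return reachable > total_masters // 2


# One master, and it is the one that failed: no master is left to vote.
print(can_authorize_failover(1, 1))  # False

# Three masters, one failed: two out of three is still a majority.
print(can_authorize_failover(3, 1))  # True
```

With the single-master setup from the question, the first case applies, which matches the slaves never promoting themselves on their own.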
0
votes

Please note that even with 3 masters, automatic failover is not guaranteed if failures happen in the following sequence (M = master, S = slave):

Node-1: M1 S3

Node-2: M2 S1

Node-3: M3 S2

Now if Node-3 fails, its slave S3 on Node-1 is automatically promoted to master. All is well, and after Node-3 recovers the status is as follows:

Node-1: M1 M3 <----- Note that Node-1 now hosts 2 masters, since S3 became M3 in the previous step.

Node-2: M2 S1

Node-3: S3 S2 <----- Note that the redis-server came back up as a slave (it was M3 before).

Now you might think the cluster will continue to handle failures easily, since this setup still has 3 masters. However, if Node-1 fails, the cluster goes DOWN because the quorum is not satisfied, and it never comes back up unless we make some manual adjustments.
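
The two layouts above can be checked with a small sketch. It is an illustration only: the dictionaries mirror the Node-1 to Node-3 layout from this answer, the function name `majority_survives` is hypothetical, and the model assumes that a host failure takes down every Redis instance on that host at once.

```python
def majority_survives(placement: dict, failed_host: str) -> bool:
    """Return True if a strict majority of masters survives losing one host.

    `placement` maps a host name to the roles running on it, where a role
    starting with "M" is a master and one starting with "S" is a slave.
    """
    total = sum(1 for roles in placement.values()
                for role in roles if role.startswith("M"))
    lost = sum(1 for role in placement[failed_host] if role.startswith("M"))
    return (total - lost) > total // 2


# Initial layout: each host carries exactly one master.
initial = {
    "Node-1": ["M1", "S3"],
    "Node-2": ["M2", "S1"],
    "Node-3": ["M3", "S2"],
}

# Layout after Node-3 failed and recovered: Node-1 now carries two masters.
after_recovery = {
    "Node-1": ["M1", "M3"],
    "Node-2": ["M2", "S1"],
    "Node-3": ["S3", "S2"],
}

print(majority_survives(initial, "Node-3"))         # True
print(majority_survives(after_recovery, "Node-1"))  # False
```

In the second case only M2 is left out of 3 masters, so there is no majority to promote S1 or S3.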

Hope this helps.