I'm having trouble bringing a former master node back under Redis Sentinel. Slaves are promoted properly when the master is lost, but when the old master comes back up it is never demoted to slave. If I restart Sentinel, however, the node is demoted right away. Is my configuration bad, or am I missing something basic?
EDIT: Cross-posted at https://groups.google.com/forum/#!topic/redis-db/4AnGNssqYTw
I set up a few VMs as follows, all with Redis 3.1.999:
192.168.0.101 - Redis Slave
192.168.0.102 - Redis Slave
192.168.0.103 - Redis Master
192.168.0.201 - Sentinel
192.168.0.202 - Sentinel
My Sentinel configuration, for both sentinels:
loglevel verbose
logfile "/tmp/sentinel.log"
sentinel monitor redisA01 192.168.0.101 6379 2
sentinel down-after-milliseconds redisA01 30000
sentinel failover-timeout redisA01 120000
I stop Redis on the master node; as expected, Sentinel detects the outage and promotes a slave to master.
3425:X 08 Sep 23:47:43.839 # +sdown master redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:43.896 # +odown master redisA01 192.168.0.103 6379 #quorum 2/2
3425:X 08 Sep 23:47:43.896 # +new-epoch 53
3425:X 08 Sep 23:47:43.896 # +try-failover master redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:43.898 # +vote-for-leader 71de0d8f6250e436e1f76800cbe8cbae56c1be7c 53
3425:X 08 Sep 23:47:43.901 # 192.168.0.201:26379 voted for 71de0d8f6250e436e1f76800cbe8cbae56c1be7c 53
3425:X 08 Sep 23:47:43.975 # +elected-leader master redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:43.976 # +failover-state-select-slave master redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:44.077 # +selected-slave slave 192.168.0.102:6379 192.168.0.102 6379 @ redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:44.078 * +failover-state-send-slaveof-noone slave 192.168.0.102:6379 192.168.0.102 6379 @ redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:44.977 * +failover-state-wait-promotion slave 192.168.0.102:6379 192.168.0.102 6379 @ redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:44.980 - -role-change slave 192.168.0.102:6379 192.168.0.102 6379 @ redisA01 192.168.0.103 6379 new reported role is master
3425:X 08 Sep 23:47:44.981 # +promoted-slave slave 192.168.0.102:6379 192.168.0.102 6379 @ redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:44.981 # +failover-state-reconf-slaves master redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:45.068 * +slave-reconf-sent slave 192.168.0.101:6379 192.168.0.101 6379 @ redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:46.031 * +slave-reconf-inprog slave 192.168.0.101:6379 192.168.0.101 6379 @ redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:46.032 * +slave-reconf-done slave 192.168.0.101:6379 192.168.0.101 6379 @ redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:46.101 # -odown master redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:46.101 # +failover-end master redisA01 192.168.0.103 6379
3425:X 08 Sep 23:47:46.102 # +switch-master redisA01 192.168.0.103 6379 192.168.0.102 6379
3425:X 08 Sep 23:47:46.103 * +slave slave 192.168.0.101:6379 192.168.0.101 6379 @ redisA01 192.168.0.102 6379
3425:X 08 Sep 23:47:46.103 * +slave slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379
I wait a few minutes and restart Redis on the former master node. Unexpectedly (to me), the node is not demoted to slave; Sentinel only logs that it is reachable again.
3425:X 08 Sep 23:48:16.105 # +sdown slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379
3425:X 08 Sep 23:50:09.131 # -sdown slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379
After waiting a few more minutes, I restart one of the sentinels. Immediately it detects the dangling former master node and demotes it.
3425:signal-handler (1441758237) Received SIGTERM scheduling shutdown...
...
3670:X 09 Sep 00:23:57.687 # Sentinel ID is 71de0d8f6250e436e1f76800cbe8cbae56c1be7c
3670:X 09 Sep 00:23:57.687 # +monitor master redisA01 192.168.0.102 6379 quorum 2
3670:X 09 Sep 00:23:57.690 - -role-change slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379 new reported role is master
3670:X 09 Sep 00:23:58.708 - Accepted 192.168.0.201:49731
3670:X 09 Sep 00:24:07.778 * +convert-to-slave slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379
3670:X 09 Sep 00:24:17.801 - +role-change slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379 new reported role is slave
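To put a number on the delay, here is a rough Python sketch that pulls event timestamps out of the log lines above and measures how long the former master stayed un-demoted after it came back. This is not part of Redis or Sentinel: the helper names `parse_events` and `gap_seconds` and the regex are my own, based only on the line format visible in these excerpts. The log lines carry no year, so 2015 is an assumption inferred from the epoch value in the SIGTERM line, and the month parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# Assumed line format, taken from the excerpts above:
#   PID:X DD Mon HH:MM:SS.mmm <level char> <+event or -event> <details>
LINE_RE = re.compile(
    r"^(?P<pid>\d+):X "
    r"(?P<ts>\d{2} \w{3} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[#*\-.]) "
    r"(?P<event>[+-][\w-]+) "
    r"(?P<details>.*)$"
)


def parse_events(lines, year=2015):
    """Yield (timestamp, event, details) for lines that look like Sentinel events.

    Lines that are not +event/-event records (e.g. "Accepted ...") are skipped.
    The year is an assumption; the log itself does not include one.
    """
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        ts = datetime.strptime(
            "{} {}".format(year, m.group("ts")), "%Y %d %b %H:%M:%S.%f"
        )
        yield ts, m.group("event"), m.group("details")


def gap_seconds(lines, node, start_event, end_event):
    """Seconds from the first start_event to the next end_event for a slave node.

    Returns None if the pair of events is not found for that node.
    """
    start = None
    for ts, event, details in parse_events(lines):
        if not details.startswith("slave {} ".format(node)):
            continue
        if start is None and event == start_event:
            start = ts
        elif start is not None and event == end_event:
            return (ts - start).total_seconds()
    return None


# Three lines copied from the log excerpts in this post.
SAMPLE = [
    "3425:X 08 Sep 23:50:09.131 # -sdown slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379",
    "3670:X 09 Sep 00:23:58.708 - Accepted 192.168.0.201:49731",
    "3670:X 09 Sep 00:24:07.778 * +convert-to-slave slave 192.168.0.103:6379 192.168.0.103 6379 @ redisA01 192.168.0.102 6379",
]

if __name__ == "__main__":
    gap = gap_seconds(SAMPLE, "192.168.0.103:6379", "-sdown", "+convert-to-slave")
    print("seconds between -sdown and +convert-to-slave:", gap)
```

On the lines quoted here, the gap between `-sdown` (node reachable again) and `+convert-to-slave` comes out to roughly 34 minutes, and the conversion only happens after the Sentinel restart.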