I wrote a monitoring program to watch the health of my Redis Sentinel HA cluster, and it flagged that one slave, node 10.10.10.30, is missing. After some debugging it turned out that slaves whose sdown flag is true are filtered out of the report.
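
To illustrate, the filtering boils down to something like this sketch (simplified; it assumes the redis-py client, and that the master name mymaster matches my Sentinel configuration):

import redis

# Ask any Sentinel for everything it knows about the slaves of mymaster.
# redis-py parses each reply entry into a dict keyed by the same field
# names shown in the sentinel slaves output further down.
sentinel = redis.Redis(host="10.10.10.8", port=26379)
slaves = sentinel.sentinel_slaves("mymaster")

# Slaves whose flags contain s_down are subjectively down according to
# this Sentinel, so the monitor drops them from the healthy set.
healthy = [s for s in slaves if "s_down" not in s["flags"]]

for s in healthy:
    print(s["ip"], s["port"], s["flags"])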

My system consists of three nodes: one master and two slaves. Each node also has a Sentinel deployed on it.

On the master, if I connect with redis-cli, the following is reported:

127.0.0.1:6379> info replication
# Replication
role:master
connected_slaves:2
slave0:ip=10.10.10.8,port=6379,state=online,offset=1409435252945,lag=1
slave1:ip=10.10.10.30,port=6379,state=online,offset=1409436519147,lag=1
master_repl_offset:1409439031250
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1409437982675
repl_backlog_histlen:1048576

All of my Redis servers, as well as the Sentinels on each machine, are up and running.

If I execute redis-cli -p 26379 on any of my machines and run sentinel slaves mymaster, I get a report listing the same number of slaves as I have configured and running. However, node 10.10.10.30 is reported like this:

2)  1) "name"
    2) "10.10.10.30:6379"
    3) "ip"
    4) "10.10.10.30"
    5) "port"
    6) "6379"
    7) "runid"
    8) ""
    9) "flags"
   10) "s_down,slave,disconnected"
   11) "pending-commands"
   12) "0"
   13) "last-ping-sent"
   14) "936737"
   15) "last-ok-ping-reply"
   16) "936737"
   17) "last-ping-reply"
   18) "936737"
   19) "s-down-time"
   20) "931725"
   21) "down-after-milliseconds"
   22) "5000"
   23) "info-refresh"
   24) "1589412820130"
   25) "role-reported"
   26) "slave"
   27) "role-reported-time"
   28) "936737"
   29) "master-link-down-time"
   30) "0"
   31) "master-link-status"
   32) "err"
   33) "master-host"
   34) "?"
   35) "master-port"
   36) "0"
   37) "slave-priority"
   38) "100"
   39) "slave-repl-offset"
   40) "0"

I don't understand how to get that node out of the sdown state. All Redis servers and Sentinel deployments use ports 6379 and 26379 respectively, and the ports are reachable.
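
For completeness, this is the kind of check I mean by reachable (a minimal sketch, again assuming redis-py, run from one of the other nodes):

import redis

# PING both the Redis port and the Sentinel port on the flagged node.
# A ConnectionError here would mean the port is not actually reachable.
for port in (6379, 26379):
    client = redis.Redis(host="10.10.10.30", port=port, socket_timeout=2)
    print(port, client.ping())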

1 Answer

I compared redis.conf and sentinel.conf on the affected node with those of the slaves that had no issue. The difference was the bind address: I changed bind 127.0.0.1 to bind 0.0.0.0, restarted Redis, and the sdown state went away.
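
In other words, the fix was this change in redis.conf on 10.10.10.30 (and the matching line in sentinel.conf):

# before: Redis only listened on the loopback interface
bind 127.0.0.1

# after: Redis accepts connections on all interfaces
bind 0.0.0.0

With bind 127.0.0.1 the slave could still open its own outbound connection to the master, which is why the master reported it as online, but the Sentinels on the other machines could never connect to it at 10.10.10.30:6379, so they flagged it s_down and disconnected.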