This is the current setup:
One Redis server on the intranet and two Node.js servers in the DMZ, each running 8 PM2 instances. There is a firewall between the Node servers and the Redis server. We want to know why PM2 keeps restarting roughly every hour, and why the number of connections on the Redis server keeps growing. We looked at the pm2 logs and some of the errors we found are:
Error while connecting to structures DB 180102/072103.855, [log,auth,error] data: {"screen":"browse","error":{"code":"UNCERTAIN_STATE","command":"SELECT","args":[0],"origin":{"code":"ETIMEDOUT","errno":"ETIMEDOUT","syscall":"read"}}}
171220/111905.037, [log,connection,client,error] message: read ECONNRESET, stack: Error: read ECONNRESET at exports._errnoException (util.js:1020:11) at TCP.onread (net.js:568:26)
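The UNCERTAIN_STATE code above appears to come from node_redis (it is what that library reports for commands that were in flight when the connection dropped). For illustration, here is a minimal sketch of that kind of client setup; node_redis is assumed and the host is a placeholder, this is not our exact code. The point is that without an 'error' listener, a read ECONNRESET or ETIMEDOUT becomes an unhandled 'error' event, which crashes the process and makes PM2 restart it:

// Minimal sketch (node_redis 2.x era, Node 6). Host/port are placeholders.
const redis = require('redis');

const client = redis.createClient({
  host: '10.54.22.1', // placeholder: the Redis server behind the firewall
  port: 6379,
  retry_strategy: (options) => {
    // Reconnect with a capped backoff instead of giving up.
    return Math.min(options.attempt * 100, 3000);
  }
});

// Without this listener, a dropped connection emits an unhandled
// 'error' event, the process throws and dies, and PM2 restarts it.
client.on('error', (err) => {
  console.error('redis error:', err.message);
});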
This is the Redis conf:
# General
daemonize yes
pidfile "/var/run/redis/6379.pid"
dir "/apps/redis/6379"
port 6379
bind 0.0.0.0
timeout 0
tcp-keepalive 0
tcp-backlog 511
loglevel notice
logfile "/apps/log/redis/reids.log"
syslog-enabled yes
syslog-ident "redis_6379"
syslog-facility user
databases 16
# Snapshotting
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
# Replication
slave-serve-stale-data yes
slave-read-only yes
repl-disable-tcp-nodelay no
slave-priority 100
min-slaves-max-lag 10
# Security
# Limits
maxclients 10000
maxmemory-policy noeviction
# Append Only Mode
appendonly no
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# Lua
lua-time-limit 5000
# Slow Log
slowlog-log-slower-than 10000
slowlog-max-len 128
# Event Notification
notify-keyspace-events ""
# Advanced
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes
# Generated by CONFIG REWRITE
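Two of the settings above may matter given the firewall in between: timeout 0 means Redis never closes idle client connections on its own, and tcp-keepalive 0 disables TCP keepalive probes, so a connection the firewall silently drops would stay ESTABLISHED on the Redis side indefinitely while the client opens a new one. Per-connection idle times can be inspected with the standard CLIENT LIST command (the awk field positions assume the Redis 3.2 output layout):

redis-cli CLIENT LIST | awk '{ print $2, $5, $6 }'   # addr= age= idle= for each client

Connections with very large idle= values that no longer correspond to anything on the app servers would support that theory.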
This is the erratic behaviour observed in pm2 and netstat:
...
...
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:hacl-gs ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:13039 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:23784 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:32191 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:18506 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:30892 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:26435 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:29383 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28424 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:18174 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:3364 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28449 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:20975 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:23668 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:21041 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:23206 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:14224 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:29325 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28642 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:aap ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:14575 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:11651 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:5982 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:10445 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:22765 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:12192 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:8926 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:32150 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:10471 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:29948 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:25831 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:28012 ESTABLISHED
tcp 0 0 vidmarredispdb4.te:6379 10.54.22.226:5119 ESTABLISHED
[apps@vidmarredispdb4 redis]$ netstat | grep 6379 | grep 226 | wc -l
168
[apps@vidmarredispdb4 redis]$ netstat | grep 6379 | grep 227 | wc -l
40
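In case the raw counts above are too coarse, the same connections can be broken down by TCP state with standard netstat/awk (the state is the sixth column of netstat -tan output); we would expect mostly ESTABLISHED given the listing above:

netstat -tan | grep ':6379' | awk '{ print $6 }' | sort | uniq -c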
The pm2 restart counts and uptimes are erratic and differ per instance, as if the processes are crashing independently of each other:
[apps@vidmarnjspapp1 ~]$ pm2 list
┌─────────────┬────┬─────────┬───────┬────────┬─────────┬────────┬─────┬───────────┬──────────┐
│ App name │ id │ mode │ pid │ status │ restart │ uptime │ cpu │ mem │ watching │
├─────────────┼────┼─────────┼───────┼────────┼─────────┼────────┼─────┼───────────┼──────────┤
│ Pm2Http9615 │ 0 │ fork │ 23300 │ online │ 1 │ 39h │ 0% │ 25.1 MB │ disabled │
│ video-api │ 1 │ cluster │ 13434 │ online │ 3 │ 2h │ 0% │ 67.7 MB │ disabled │
│ video-api │ 2 │ cluster │ 23306 │ online │ 1 │ 39h │ 0% │ 77.8 MB │ disabled │
│ video-api │ 3 │ cluster │ 23312 │ online │ 1 │ 39h │ 0% │ 71.5 MB │ disabled │
│ video-api │ 4 │ cluster │ 6073 │ online │ 4 │ 20h │ 0% │ 68.1 MB │ disabled │
│ video-api │ 5 │ cluster │ 18850 │ online │ 2 │ 24h │ 0% │ 74.8 MB │ disabled │
│ video-api │ 6 │ cluster │ 23342 │ online │ 1 │ 39h │ 0% │ 74.2 MB │ disabled │
│ video-api │ 7 │ cluster │ 27569 │ online │ 2 │ 28h │ 0% │ 71.5 MB │ disabled │
│ video-api │ 8 │ cluster │ 23391 │ online │ 1 │ 39h │ 0% │ 78.2 MB │ disabled │
└─────────────┴────┴─────────┴───────┴────────┴─────────┴────────┴─────┴───────────┴──────────┘
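To capture exactly what kills a worker before PM2 respawns it, one option we are considering is a temporary last-chance handler at the top of the app (a sketch, not code that is already deployed):

// Log the fatal error with its stack, then exit so PM2 restarts the worker.
process.on('uncaughtException', (err) => {
  console.error('worker dying:', err && err.stack ? err.stack : err);
  process.exit(1);
});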
(tcpdump captures were also taken but are not included here.)
Versions: Node 6.11.2, Redis 3.2.9, PM2 2.4.6
Any help in understanding this erratic behaviour regarding the number of connections would be highly appreciated.