One reason this can happen is that firewalls are sometimes configured to find and kill idle connections. The Linux kernel has default TCP "keepalive" settings that it can use to refresh long-running connections. The default values for these settings can be seen using sysctl:
$ sudo sysctl -a | grep keepalive
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200
In an effort to combat this problem, DataStax recommends adjusting these values in production deployments:
$ sudo sysctl -w \
net.ipv4.tcp_keepalive_time=60 \
net.ipv4.tcp_keepalive_probes=3 \
net.ipv4.tcp_keepalive_intvl=10
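To see what these numbers mean in practice, here is a rough sketch of the arithmetic. The kernel sends the first probe after tcp_keepalive_time seconds of idleness, then retries every tcp_keepalive_intvl seconds, giving up after tcp_keepalive_probes unanswered probes. The helper name keepalive_timeout is made up for illustration and is not part of any tool:

```shell
#!/bin/sh
# Hypothetical helper: worst-case seconds before the kernel declares an
# idle, unresponsive peer dead.
#   idle time before first probe + (number of probes * interval between probes)
keepalive_timeout() {
    time=$1; probes=$2; intvl=$3
    echo $(( time + probes * intvl ))
}

keepalive_timeout 7200 9 75   # kernel defaults shown above: prints 7875
keepalive_timeout 60 3 10     # recommended values: prints 90
```

With the defaults, the first probe does not go out until the connection has been idle for two hours, so a firewall with a shorter idle timeout will already have dropped it. With the adjusted values the connection sees traffic every 60 seconds.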
You can also add each of those values to your system's equivalent of the /etc/sysctl.conf file (minus the backslashes) and apply them via sysctl as well:
$ sudo sysctl -p /etc/sysctl.conf
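For reference, the entries in that file would look roughly like the following. This is only a sketch assuming the usual key = value layout of sysctl.conf; the comment line is illustrative, and your distribution may keep such settings in a different file:

```
# TCP keepalive tuning so that idle connections are not dropped by firewalls
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 10
```

Settings written to the file persist across reboots, whereas values set with sysctl -w alone last only until the next restart.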
tail the system.log files on each, and see why they are going unresponsive. – Aaron