7
votes

I have a client that connect()s to a server, and when the connection is idle it times out after a couple of hours. I added setsockopt(socket, SOL_SOCKET, SO_KEEPALIVE...) with 1 sec, but it didn't make a difference. Any clues on why keepalive wouldn't work? Would it make a difference if I used SOL_TCP instead of SOL_SOCKET? This is on Linux.

2
Define 'times out after a couple of hours'. What is the exact symptom of that? – user207421
errno 110 - Connection timed out. I did a tcpdump and don't see keepalive messages. – excalibur
Setting the TCP_USER_TIMEOUT option may solve the problem. – Homaei

2 Answers

13
votes
int val = 1;
setsockopt(socket, SOL_SOCKET, SO_KEEPALIVE, &val, sizeof val);

This just enables keepalives; the value 1 is an on/off flag, not a time in seconds. You will get the default timers for keepalive probes, which you can view with the command:

sysctl net.ipv4.tcp_keepalive_time

On Linux the default is normally 7200 seconds (two hours).
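To see why the default matches the "couple of hours" symptom, here is a rough sketch of the arithmetic. The function name is hypothetical, and the formula is the usual approximation: a dead peer is detected after the idle time plus one probe interval per unanswered probe.

```c
/* Hypothetical helper, not part of any library: approximate number of
 * seconds before an idle connection to a dead peer is dropped.
 *   idle   - seconds of idleness before the first probe (tcp_keepalive_time)
 *   intvl  - seconds between probes                     (tcp_keepalive_intvl)
 *   probes - unanswered probes before giving up         (tcp_keepalive_probes)
 */
static int keepalive_detect_seconds(int idle, int intvl, int probes)
{
    return idle + intvl * probes;
}
```

With the usual Linux defaults (7200, 75, 9) this gives 7200 + 75 * 9 = 7875 seconds, a little over two hours. With an idle time of 60 and five probes five seconds apart it gives 60 + 5 * 5 = 85 seconds.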

If you want to change the default timers, you could use this:

/* 60 s idle, then up to 5 probes, 5 s apart */
struct KeepConfig cfg = { .keepidle = 60, .keepcnt = 5, .keepintvl = 5 };
set_tcp_keepalive_cfg(fd, &cfg);

With the helper functions here:

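#include <sys/socket.h>   /* setsockopt, SOL_SOCKET, SO_KEEPALIVE */
#include <netinet/in.h>   /* IPPROTO_TCP */
#include <netinet/tcp.h>  /* TCP_KEEPIDLE, TCP_KEEPCNT, TCP_KEEPINTVL */

struct KeepConfig {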
    /** The time (in seconds) the connection needs to remain 
     * idle before TCP starts sending keepalive probes (TCP_KEEPIDLE socket option)
     */
    int keepidle;
    /** The maximum number of keepalive probes TCP should 
     * send before dropping the connection. (TCP_KEEPCNT socket option)
     */
    int keepcnt;

    /** The time (in seconds) between individual keepalive probes.
     *  (TCP_KEEPINTVL socket option)
     */
    int keepintvl;
};

/**
* Enable TCP keepalive on the socket
* @param sockfd socket file descriptor
* @return 0 on success, -1 on failure
*/
int set_tcp_keepalive(int sockfd)
{
    int optval = 1;

    return setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &optval, sizeof(optval));
}

/** Set the keepalive options on the socket
* This also enables TCP keepalive on the socket
*
* @param sockfd socket file descriptor
* @param cfg keepalive timer settings to apply
* @return 0 on success, -1 on failure
*/
int set_tcp_keepalive_cfg(int sockfd, const struct KeepConfig *cfg)
{
    int rc;

    //first turn on keepalive
    rc = set_tcp_keepalive(sockfd);
    if (rc != 0) {
        return rc;
    }

    //set the keepalive options
    rc = setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &cfg->keepcnt, sizeof cfg->keepcnt);
    if (rc != 0) {
        return rc;
    }

    rc = setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &cfg->keepidle, sizeof cfg->keepidle);
    if (rc != 0) {
        return rc;
    }

    rc = setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &cfg->keepintvl, sizeof cfg->keepintvl);
    if (rc != 0) {
        return rc;
    }

    return 0;
}
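If you want to confirm that the settings actually took effect, one option is to read them back with getsockopt(). The following is a minimal sketch for Linux; both function names are hypothetical, and it uses a fresh, unconnected socket purely as a self-check.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/* Hypothetical helper: read back one int-valued socket option, or -1 on error. */
static int get_int_opt(int fd, int level, int name)
{
    int val = 0;
    socklen_t len = sizeof val;

    if (getsockopt(fd, level, name, &val, &len) != 0)
        return -1;
    return val;
}

/* Hypothetical self-check: set the keepalive options on a new TCP socket
 * and verify that the kernel reports the same values.
 * Returns 1 if everything round-trips, 0 otherwise. */
static int keepalive_roundtrip_ok(int idle, int cnt, int intvl)
{
    int on = 1;
    int ok;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return 0;

    ok = setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == 0
      && setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof idle) == 0
      && setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof cnt) == 0
      && setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof intvl) == 0
      && get_int_opt(fd, SOL_SOCKET, SO_KEEPALIVE) > 0   /* nonzero = enabled */
      && get_int_opt(fd, IPPROTO_TCP, TCP_KEEPIDLE) == idle
      && get_int_opt(fd, IPPROTO_TCP, TCP_KEEPCNT) == cnt
      && get_int_opt(fd, IPPROTO_TCP, TCP_KEEPINTVL) == intvl;

    close(fd);
    return ok;
}
```

Calling keepalive_roundtrip_ok(60, 5, 5) should return 1 on a Linux system. On a real connection you would call get_int_opt() on the connected socket instead, and a tcpdump on an idle link should then show probes after the idle time.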

2
votes

Despite its name, keep-alive is not about keeping a connection alive. It is about exchanging packets periodically to make sure there is still a network path between the peers. As a side effect, it kills idle connections that would otherwise have survived an extended network outage.

Because of this behavior, keep-alive should not be used unless there is a good reason, such as telnet or SSH connections, where it is reasonable to kill the session when the client becomes unreachable.

Most probably it is the server that is closing the connection after n hours regardless of keepalive usage, due to some connection handling policy.