0
votes

My goal is to establish a continuous and robust TCP connection between one server and exactly one client. If one side fails, the other one should wait until it recovers.

I wrote the following code based on this question (that only asks for continuous, but not robust TCP connections and does not handle keepalive issues), this post and my own experience.

I have two questions:

  1. How can I make the keepalive work? If the server dies, the client only notices after trying to send(), which also worked without the KEEPALIVE option because it results in a connection reset. Is there some way for the socket to raise an interrupt when a connection is dead, or some keepalive function that I can check on a regular basis?

  2. Is this a robust way of handling a continuous TCP connection? Having a stable, continuous TCP connection seems to be a standard problem; however, I couldn't find tutorials covering it in detail. There must be some best practice.

Note, I could handle keep alive messages on my own at the application level. However, as TCP already implements this at transport level, it is better to rely on this service provided by the lower level.

The server:

from socket import *
serverPort = 12000

while True:
    # 1. Configure server socket
    serverSocket = socket(AF_INET, SOCK_STREAM)
    serverSocket.setsockopt(SOL_SOCKET, SO_REUSEADDR, 1)
    serverSocket.bind(('127.0.0.1', serverPort))
    serverSocket.listen(1)
    print("waiting for client connecting...")
    connectionSocket, addr = serverSocket.accept()
    connectionSocket.setsockopt(SOL_SOCKET, SO_KEEPALIVE,1)
    print(connectionSocket.getsockopt(SOL_SOCKET,SO_KEEPALIVE))
    print("...connected.")
    serverSocket.close() # Destroy the server socket; we don't need it anymore since we are not accepting any connections beyond this point.

    # 2. communication routine
    while True:
        try:
            sentence = connectionSocket.recv(512).decode()
        except ConnectionResetError as e:
            print("Client connection closed")
            break
        if(len(sentence)==0): # close if client closed connection
            break 
        else:
            print("recv: "+str(sentence))

    # 3. proper closure
    connectionSocket.shutdown(SHUT_RDWR)
    connectionSocket.close()
    print("connection closed.")

The client:

from socket import *
import time

while True:
    # 1. configure socket dest.
    serverName = '127.0.0.1'
    serverPort = 12000
    clientSocket = socket(AF_INET, SOCK_STREAM)
    try:
        clientSocket.setsockopt(SOL_SOCKET, SO_KEEPALIVE,1)
        clientSocket.connect((serverName, serverPort))
        print(clientSocket.getsockopt(SOL_SOCKET,SO_KEEPALIVE))
    except ConnectionRefusedError as e:
        print("Server refused connection. retrying")
        time.sleep(1)
        continue

    # 2. communication routine
    while(1):
        sentence = input('input sentence: ')
        if(sentence == "close"):
            break
        try:
            clientSocket.send(sentence.encode())
        except ConnectionResetError as e:
            print("Server connection closed")
            break

    # 3. proper closure
    clientSocket.shutdown(SHUT_RDWR)
    clientSocket.close()

I tried to keep this example as minimal as possible, but given the requirement of robustness, it is relatively long.

I also tried some socket options such as TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT.

Thank you!

2 Answers

1
votes

I will try to answer both questions.

  1. ... Is there some way that the socket sends an interrupt for a connection that is dead ...

    I know of none. TCP keepalive (SO_KEEPALIVE) only tries to maintain the connection. It is very useful if any equipment on the network path has an idle timeout, because it prevents that timeout from aborting the connection. But if the connection drops for any reason other than that timeout, keepalive cannot do anything. The rationale is that there is no need to restore a dropped inactive connection before something has to be exchanged.

  2. Is this a robust way of handling a continuous TCP connection?

    Not really.

    The robust way is to be prepared that the connection fails for any reason at any moment. So you should be prepared to face an error when sending a message (your code is) and if that happens try to re-open the connection and send the message again (your current code does not). Something like:

    def connect(...):
        # establish and return a connection
        ...
        return clientSocket
    
    clientSocket = connect(...)
    while True:
        ...
        while True:
            try:
                clientSocket.send(message)
                break
            except OSError:
                clientSocket = connect(...)
        ...
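
A slightly more complete version of the outline above, as a minimal sketch rather than a definitive implementation. The names `send_with_reconnect` and `retry_delay` are made up for illustration, and the sketch assumes that resending a whole message after a failure is acceptable for your protocol (the peer may then see a message twice, or a partial message followed by a full one).

```python
import socket
import time


def connect(host, port, retry_delay=1.0):
    """Block until a connection to (host, port) succeeds, then return it."""
    while True:
        try:
            sock = socket.create_connection((host, port), timeout=5.0)
        except OSError:
            # Server not reachable yet: wait a little and try again.
            time.sleep(retry_delay)
            continue
        sock.settimeout(None)  # back to blocking mode for normal use
        return sock


def send_with_reconnect(sock, message, host, port, retry_delay=1.0):
    """Send message; on any socket error reconnect and resend.

    Returns the socket that was finally used, which may be a new one.
    """
    while True:
        try:
            sock.sendall(message)
            return sock
        except OSError:
            sock.close()
            sock = connect(host, port, retry_delay)
```

The caller keeps the returned socket, for example `clientSocket = send_with_reconnect(clientSocket, data, host, port)`. Note that a successful `sendall()` only means the data was handed to the local TCP stack, not that the peer processed it.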
    

Unrelated: your graceful shutdown is incorrect. The initiator (the side calling shutdown) should not close the socket immediately, but start a read loop and only close once everything has been received and processed.
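
A minimal sketch of such a shutdown, under some assumptions: `graceful_close` is a hypothetical helper, it uses `SHUT_WR` (a half-close) instead of the `SHUT_RDWR` from the question so that reading is still possible, and it gives up after an arbitrary timeout so a silent peer cannot block it forever.

```python
import socket


def graceful_close(sock, bufsize=4096, timeout=5.0):
    """Half-close, drain what the peer still sends, then close.

    Returns the bytes received while draining.
    """
    received = bytearray()
    try:
        sock.shutdown(socket.SHUT_WR)  # tell the peer we are done sending
        sock.settimeout(timeout)
        while True:
            chunk = sock.recv(bufsize)
            if not chunk:              # empty read: the peer closed its side too
                break
            received += chunk
    except OSError:
        pass  # peer already gone, connection reset, or drain timed out
    finally:
        sock.close()
    return bytes(received)
```

In real code the drained bytes would be handed to the normal message handling instead of just being returned.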

0
votes

How can I make the keepalive work? If the server dies, the client only notices after trying to send(), which also worked without the KEEPALIVE option because it results in a connection reset.

Keepalive is more useful on the server or reading side, and it is a tricky beast. The socket won't notify you at all unless you read or write. You can query its state (even though I'm not sure this is possible with standard Python), but this still doesn't solve the problem of notification. You need to check the state periodically anyway.
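
For what it's worth, here is a rough sketch of how the querying could look with the standard `socket` module. The helper names `enable_keepalive` and `pending_error` and the timer values are only illustrative. The `TCP_KEEP*` options are the ones mentioned in the question; they are not available on every platform, so the sketch guards them with `hasattr`. As far as I know, once the keepalive probes go unanswered the OS records an error on the socket, which the next `recv()`/`send()` raises and which `SO_ERROR` reports, but you still have to ask for it.

```python
import socket


def enable_keepalive(sock, idle=10, interval=5, count=3):
    """Turn on TCP keepalive and, where supported, tune its timers."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Platform-dependent options: only set the ones this platform exposes.
    if hasattr(socket, "TCP_KEEPIDLE"):   # idle seconds before the first probe
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):  # seconds between probes
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):    # unanswered probes before giving up
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)


def pending_error(sock):
    """Return the pending error code on the socket (0 if none).

    Reading SO_ERROR also clears the pending error.
    """
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
```

With the values above, a dead peer would be noticed after roughly `idle + interval * count` seconds of silence, and only when `pending_error()` or a read/write is actually called.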

Is there some way that the socket sends an interrupt for a connection that is dead or some keepalive function that I can check on a regular basis?

Have you ever heard about the Two Generals' Problem? There is no reliable way to detect whether one side is dead or not. We can however be close enough with pings and timeouts.

Note, I could handle keep alive messages on my own at the application level. However, as TCP already implements this at transport level, it is better to rely on this service provided by the lower level.

No, it is not better. If, for any reason, there is a proxy between the server and the client, no TCP feature will help you, because by design these features only control a single connection, while with a proxy you have at least two. You should not think about your connection in terms of the underlying transport (TCP). Instead, create your own protocol with a ping command that the server (or the client, or both) sends periodically, combined with timeouts. This way you know the peer was alive within the last ping interval.

Is this a robust way of handling a continuous TCP connection? Having a stable, continuous TCP connection seems to be a standard problem, however, I couldn't find tutorials covering this in detail. There must be some best-practice.
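
To make the idea concrete, here is a minimal sketch of pings plus timeouts. Everything in it is an assumption for illustration: the newline-delimited wire format, the `PING` message, and the function names `send_pings` and `receive_until_silent`. One side sends a ping at a fixed interval; the other side treats any received data as a sign of life and declares the peer dead when nothing arrives within the timeout.

```python
import socket
import threading
import time

PING = b"PING\n"  # hypothetical wire format: newline-terminated messages


def send_pings(sock, interval, stop_event):
    """Send a ping every `interval` seconds until stopped or a send fails."""
    while not stop_event.wait(interval):
        try:
            sock.sendall(PING)
        except OSError:
            break


def receive_until_silent(sock, timeout, on_message):
    """Pass non-ping lines to on_message until the peer is considered dead.

    Returns "timeout", "error" or "closed" to say why the loop ended.
    """
    sock.settimeout(timeout)  # each recv() waits at most `timeout` seconds
    buffer = b""
    while True:
        try:
            chunk = sock.recv(4096)
        except socket.timeout:
            return "timeout"  # nothing heard for a whole timeout period
        except OSError:
            return "error"
        if not chunk:
            return "closed"   # orderly close by the peer
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line + b"\n" != PING:
                on_message(line)
```

The timeout should be a few ping intervals long so that a single delayed ping does not count as a dead peer. When `receive_until_silent` returns, the caller would close the socket and go back to waiting for (or opening) a new connection.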

You won't find tutorials covering this, because that problem has no solution. Most people simulate "I'm still alive" with the combination of pings and timeouts.