I want a TCP server that waits for clients to connect, and as soon as they do, sends them some data continuously. I also want the server to notice if a client disappears suddenly, without a trace, and to remove them from the list of open sockets.

My code looks like this:

#!/usr/bin/env python3

import select, socket 

# Listen Port
LISTEN_PORT = 1234

# Create socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Setup the socket
server.setblocking(0)
server.bind(('0.0.0.0', LISTEN_PORT))
server.listen(5)

# Make socket reusable
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

# Setup TCP Keepalive
server.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 1)
server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 3)
server.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 5)

# Tell user we are listening
print("Listening on port %s" % LISTEN_PORT)

inputs = [server]
outputs = []

while True:
    # Detecting clients that disappeared does NOT work when we ARE
    # watching if any sockets are writable
    #readable, writable, exceptional = select.select(inputs, outputs, inputs)

    # Detecting clients that disappeared works when we aren't watching
    # if any sockets are writable
    readable, writable, exceptional = select.select(inputs, [], inputs)

    for s in readable:
        if s is server:
            connection, client_address = s.accept()

            print("New client connected: %s" % (client_address,))

            connection.setblocking(0)

            inputs.append(connection)
            outputs.append(connection)
        else:
            try:
                data = s.recv(1024)
            except TimeoutError:
                print("Client dropped out")

                inputs.remove(s)

                if s in outputs:
                    outputs.remove(s)
                    continue

            if data:
                print("Data from %s: %s" % (s.getpeername(), data.decode('ascii').rstrip()))
            else:
                print("%s disconnected" % (s.getpeername(),))

    for s in writable:
        s.send(b".")

As you can see, I'm using TCP Keepalive to allow me to see if a client has disappeared. The problem I'm seeing is this:

  • when I'm NOT having select() watch for writable sockets and the client disappears, select() stops blocking once the TCP keepalive timeout expires, and the socket shows up in the readable list, so I can remove the vanished client from inputs and outputs (which is good)
  • when I AM having select() watch for writable sockets, when the client disappears, select() will NOT stop blocking after the TCP Keepalive timeout expires, and the client socket never ends up in the readable or writable list, so it never gets removed

I'm using telnet from a different machine as a client. To replicate a client disappearing, I'm using iptables to block the client from talking to the server while the client is connected.

Anyone know what's going on?

As you are sending continuously, you don't need TCP keepalive at all. You will eventually get a connection reset if a client disappears. And you don't need to select for writability before writing. Just write, and only use select() if that produces EAGAIN/EWOULDBLOCK. NB: using iptables does not simulate a vanished client. - user207421
1. Even a simple, single-threaded Python app that waits for a single client to connect and writes to it non-stop doesn't notice that a client disappeared after 5 minutes (I just tried). 2. iptables makes sure that a client has no chance of signalling the server that it disconnected. It's as good as it gets short of pulling the network cable. 3. If I don't use select() to check for writability, I end up only checking for new connections, which looks like this, in which case select.select() will block, preventing writes. Am I missing something? - John
I didn't say anything about within five minutes, but it will certainly happen. You can write any time you like. You don't need select() 'permission', except in the case I mentioned. - user207421

1 Answer

As the comments to your question have mentioned, TCP keepalive (the SO_KEEPALIVE option and its TCP_KEEP* tuning knobs) won't make any difference for your use case. Keepalive is a mechanism for notifying a program that the peer on the other side of an otherwise idle TCP connection has gone away. Since you are regularly sending data on the TCP connection(s), the keepalive probes are never sent (or needed): the unacknowledged data you send is already enough, by itself, for the TCP stack to recognize, as soon as it is able to, that the remote client has gone away.

That said, I modified/simplified your example server code to get it to work (as correctly as possible) on my machine (a Mac, FWIW). What I did was:

  1. Moved the socket.setsockopt(SO_REUSEADDR) to before the bind() line, so that bind() won't fail after you kill and then restart the program.

  2. Changed the select() call to watch for writable-sockets.

  3. Added exception-handling around the send() calls.

  4. Moved the remove-socket-from-lists code into a separate RemoveSocketFromLists() function, to avoid redundant code

Note that the expected behavior for TCP is that if you quit a client gently (e.g. by control-C'ing it, or killing it via Task Manager, or otherwise causing it to exit in such a way that its host TCP stack is still able to communicate with the server to tell the server that the client is dead) then the server should recognize the dead client more or less immediately.
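To illustrate what "recognizing the dead client" looks like at the API level when the client exits cleanly: recv() on the server side returns an empty bytes object. The following is only a minimal sketch; socket.socketpair() stands in for a real client/server TCP connection, and peer_closed() is a hypothetical helper name, not something from the code in the question.

```python
import socket

def peer_closed(sock):
    """Return True if recv() on sock reports an orderly shutdown.

    An orderly close by the peer shows up as recv() returning b"".
    """
    data = sock.recv(1024)
    return data == b""

# socketpair() gives two connected sockets; treat one as the "server" end
# and the other as the "client" end (an assumption made for this demo).
server_side, client_side = socket.socketpair()
client_side.close()               # the "client" exits gently
print(peer_closed(server_side))   # the "server" sees the close right away
server_side.close()
```

In a select()-based server this means a socket that appears in the readable list and then yields b"" from recv() should be closed and dropped from the lists.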

If, on the other hand, the client's network connectivity is disconnected suddenly (e.g. because someone yanked out the client computer's Ethernet or power cable) then it may take the server program several minutes to detect that the client has gone away, and that's expected behavior, since there's no way for the server to tell (in this situation) whether the client is dead or not. (i.e. it doesn't want to kill a viable TCP connection simply because a router dropped a few TCP packets, causing a temporary interruption in communications to a still-alive client)

If you want to try to drop the clients quickly in that scenario, you could try requiring the clients to send() a bit of dummy-data to the server every second or so. The server could keep track of the timestamp of when it last received any data from each client, and force-close any clients that it hasn't received any data from in "too long" (for whatever your idea of too long is). This would more or less work, although it risks false-positives (i.e. dropping clients that are still alive, just slow or suffering from packet-loss) if you set your timeout-threshold too low.
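The timestamp bookkeeping described above could be sketched roughly as follows. This is an illustration under assumptions, not a drop-in implementation: HEARTBEAT_TIMEOUT, last_seen, and find_stale_clients() are hypothetical names, and the 5-second threshold is arbitrary.

```python
import time

# Hypothetical threshold: how long a client may stay silent before we drop it.
HEARTBEAT_TIMEOUT = 5.0

def find_stale_clients(last_seen, now=None, timeout=HEARTBEAT_TIMEOUT):
    """Return the clients whose last received data is older than timeout.

    last_seen maps a client (e.g. its socket) to the time data last arrived.
    """
    if now is None:
        now = time.monotonic()
    return [client for client, stamp in last_seen.items()
            if now - stamp > timeout]

# Small demonstration with made-up timestamps instead of real sockets.
last_seen = {"client-a": 100.0, "client-b": 103.5}
print(find_stale_clients(last_seen, now=106.0))  # prints ['client-a']
```

In the server loop, one would set last_seen[s] = time.monotonic() whenever recv() returns data for s, pass a timeout to select.select() so the loop wakes up periodically even when nothing is readable, and then close and remove every socket returned by find_stale_clients().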

#!/usr/bin/env python3

import select, socket

# Listen Port
LISTEN_PORT = 1234

# Create socket
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Make socket reusable
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

# Setup the socket
server.setblocking(0)
server.bind(('0.0.0.0', LISTEN_PORT))
server.listen(5)

# Tell user we are listening
print("Listening on port %s" % LISTEN_PORT)

inputs  = [server]
outputs = []

# Removes the specified socket from every list in the list-of-lists
def RemoveSocketFromLists(s, listOfLists):
   for nextList in listOfLists:
      if s in nextList:
         nextList.remove(s)

while True:
   # Block until a socket is readable (new connection or incoming data)
   # or a client socket is ready to accept more outgoing data
   readable, writable, exceptional = select.select(inputs, outputs, [])

   for s in readable:
      if s is server:
         connection, client_address = s.accept()

         print("New client connected: %s" % (client_address,))
         connection.setblocking(0)
         inputs.append(connection)
         outputs.append(connection)
      else:
         try:
            data = s.recv(1024)
         except BlockingIOError:
            continue  # Spurious wakeup; nothing to read yet
         except OSError:
            data = b""  # Treat any other socket error as a disconnect

         if data:
            print("Data from %s: %s" % (s.getpeername(), data.decode('ascii', 'replace').rstrip()))
         else:
            # recv() returns an empty bytes object when the peer has closed
            print("recv() reports that %s disconnected" % s)
            RemoveSocketFromLists(s, [inputs, outputs, writable])
            s.close()

   for s in writable:
      try:
         s.send(b".")
      except BlockingIOError:
         pass  # Send buffer is full; try again on the next pass
      except OSError:
         print("send() reports that %s disconnected" % s)
         RemoveSocketFromLists(s, [inputs, outputs])
         s.close()