2
votes

As far as I know, the only reason to wait for an ACK is that the transmit window has been exhausted, or maybe slow start. But then this fragment of a Wireshark dump, taken over a pre-existing TCP socket, doesn't make sense to me:

[Image: Wireshark capture showing packets 38 to 40]

Here, between packets 38 and 40, the server (45.55.162.253) waits a full RTT before it continues sending. I changed the RTT through Netem to be sure that the delay is always equal to the RTT, and as you can see, there is no application data flowing from client to server that the server might need "to continue working". But there is a very conspicuous ACK packet going from the client (packet 39) without any payload. The advertised window is a lot larger than [SEQ/ACK analysis]/[Bytes in flight], which is 1230.

My question is: is there something in TCP that is triggering this wait for an ACK between packets 38 and 40 by the server?

1 Answer

3
votes

TCP limits its transmission rate according to two separate mechanisms:

  1. Flow Control, which is there to make sure that the sender doesn't overwhelm the other party with data. This is where the receive window comes in. Since the receive windows advertised by the client in your screenshot are large, this isn't what pauses the transfer in your case.

  2. Congestion Control, which tries to make sure that the network isn't overwhelmed. Slow Start, which you've mentioned, is part of this mechanism. It is usually explained through TCP Tahoe and TCP Reno, which are the variants most commonly taught in networking courses although rarely used in practice, but newer algorithms also begin a connection with a slow-start phase.
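
To make the interaction of the two mechanisms concrete, here is a minimal sketch of how a sender could work out how many more bytes it may transmit. The function name and the numbers are illustrative assumptions: only the 1230 bytes in flight comes from the question, while the congestion window values and the 65535-byte receive window are made up for the example.

```python
def usable_window(cwnd, rwnd, bytes_in_flight):
    """Bytes the sender may still put on the wire right now.

    cwnd: congestion window (sender-side estimate, never sent on the wire)
    rwnd: receive window advertised by the peer
    bytes_in_flight: bytes sent but not yet acknowledged
    """
    # The smaller of the two windows is the binding limit.
    effective_window = min(cwnd, rwnd)
    return max(0, effective_window - bytes_in_flight)


# Hypothetical case: the congestion window is exactly used up,
# so the sender must wait for an ACK despite the large receive window.
print(usable_window(cwnd=1230, rwnd=65535, bytes_in_flight=1230))   # 0

# Hypothetical case: a larger congestion window leaves room to keep sending.
print(usable_window(cwnd=14600, rwnd=65535, bytes_in_flight=1230))  # 13370
```

Note that Wireshark can show the advertised receive window and the bytes in flight, but not the congestion window, because that value exists only inside the sender's TCP stack.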

Since we know that flow control is not what's pausing the connection, we can assume that the culprit is the congestion control algorithm. To figure out the exact cause, however, you'd need to dive into the implementation details of the TCP variant your OS uses. For Windows, it seems to be something called Compound TCP. With recent Linux kernels, it's something called TCP CUBIC, described in this whitepaper.
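
If the sending machine runs Linux, one way to check which algorithm is in use is sketched below. It assumes a Linux host and a Python build that exposes `socket.TCP_CONGESTION`; on other systems both helper functions (whose names are made up for this example) simply return `None`. Keep in mind that the algorithm that matters is the one on the sender, which in the question is the server.

```python
import socket
from pathlib import Path


def system_default_congestion_control():
    """Return the system-wide default algorithm name on Linux, else None."""
    path = Path("/proc/sys/net/ipv4/tcp_congestion_control")
    try:
        return path.read_text().strip()
    except OSError:
        # Not Linux, or /proc is not available.
        return None


def socket_congestion_control(sock):
    """Return the algorithm name used by one TCP socket, else None."""
    option = getattr(socket, "TCP_CONGESTION", None)
    if option is None:
        # This platform does not expose the option.
        return None
    try:
        # The kernel returns a NUL-padded name of at most 16 bytes.
        raw = sock.getsockopt(socket.IPPROTO_TCP, option, 16)
    except OSError:
        return None
    return raw.split(b"\x00", 1)[0].decode("ascii")


if __name__ == "__main__":
    print("system default:", system_default_congestion_control())
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("this socket:", socket_congestion_control(s))
```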

The important thing to note, however, is that both mechanisms operate during the entire lifetime of the connection, not just at its start. It seems that your sender paused after sending its biggest packet so far (at least among the ones shown in the screenshot), so it is possible that this packet consumed what was left of its congestion window, and although the flow control window was still large, the sender was bound by the former.
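
The stall described above can be reproduced with a toy model. This is a deliberately simplified sketch, not how any real TCP stack is implemented: it assumes every segment fits in the window on its own, that a single ACK acknowledges everything in flight after one RTT, and that the congestion window grows by the number of bytes acknowledged (slow-start style). The segment sizes and window values are hypothetical.

```python
def simulate_sender(segments, cwnd, rwnd):
    """Return a list of (rtt_round, segment_size) showing when each segment is sent."""
    in_flight = 0
    rtt_round = 0
    log = []
    for size in segments:
        if in_flight + size > min(cwnd, rwnd):
            # Window exhausted: wait one RTT for the ACK to arrive.
            rtt_round += 1
            # Slow-start style growth: one byte of window per byte acknowledged.
            cwnd += in_flight
            in_flight = 0
        in_flight += size
        log.append((rtt_round, size))
    return log


# The first three segments add up to 1230 bytes and fill the congestion window,
# so the fourth one has to wait a full RTT even though rwnd is far from full.
for rtt_round, size in simulate_sender([300, 400, 530, 600], cwnd=1230, rwnd=65535):
    print(f"RTT round {rtt_round}: sent {size} bytes")
```

In this run the first three segments go out in round 0 and the last one in round 1, which mirrors the pattern in the capture: a burst of data, a pause of one RTT, a bare ACK from the client, and then more data.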