1
votes

My experiment showed that I can write to a non-blocking socket right after the connect() call, before the TCP connection is established, and the written data is correctly received by the peer once the connection completes (asynchronously). Is this guaranteed on Linux / FreeBSD? I mean, will write() return > 0 while the connection is still in progress? Or was I just lucky, and the TCP connection happened to be established between the connect() and write() calls?

The experiment code:

int fd = socket(PF_INET, SOCK_STREAM, 0);
fcntl(fd, F_SETFL, O_NONBLOCK);

struct sockaddr_in addr;
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_port = htons(_ip_port.port);
addr.sin_addr.s_addr = htonl(_ip_port.ipv4);

int res = connect(fd, (struct sockaddr*)&addr, sizeof(addr));

// HERE: res == -1, errno == 115 (EINPROGRESS)

int r = ::write(fd, "TEST", 4);

// HERE: r == 4

P.S. I handle multiple listening and connecting sockets (incoming and outgoing connections) in a single thread and manage them with epoll. Usually, when I want to create a new outgoing connection, I call non-blocking connect(), wait for EPOLLOUT (the epoll event), and then write() my data. But I noticed that I can start writing before EPOLLOUT and still get the expected result. Can I trust this approach, or should I stick with my old-fashioned approach?

P.P.S. I repeated my experiment with a remote host at 170 ms latency and got a different result: the write() (right after connect()) returned -1 with errno == EAGAIN. So yes, my first experiment was not a fair test (it connected to fast localhost), but I still think "write() right after connect()" can be used: if write() returns -1 with EAGAIN, I wait for EPOLLOUT and retry the write. But I agree, this is a dirty and pointless approach.
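For illustration, the "try early, fall back on EAGAIN" idea from the P.P.S. could look roughly like the sketch below. This is only a sketch under assumptions: `connect_and_write_early` and `fail_close` are made-up names, it uses poll() instead of epoll so that it also builds on FreeBSD, and treating ENOTCONN like EAGAIN is a guess about how non-Linux systems report an unfinished connect.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close fd but keep the errno that caused the failure. */
static ssize_t fail_close(int fd)
{
    int saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: non-blocking connect, then an optimistic write().
 * If the early write is refused because the handshake is still running,
 * wait for POLLOUT, check SO_ERROR and retry once.
 * Returns the number of bytes written and stores the socket in *out_fd,
 * or returns -1 with errno set (the socket is closed in that case). */
ssize_t connect_and_write_early(const struct sockaddr_in *addr,
                                const void *buf, size_t len,
                                int timeout_ms, int *out_fd)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (fcntl(fd, F_SETFL, O_NONBLOCK) < 0)
        return fail_close(fd);
    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) < 0 &&
        errno != EINPROGRESS)
        return fail_close(fd);

    /* Optimistic attempt: only succeeds if the handshake already finished. */
    ssize_t n = write(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK ||
                  errno == ENOTCONN /* assumption for non-Linux systems */)) {
        struct pollfd pfd;
        pfd.fd = fd;
        pfd.events = POLLOUT;
        pfd.revents = 0;

        int ready;
        do {
            ready = poll(&pfd, 1, timeout_ms);
        } while (ready < 0 && errno == EINTR);
        if (ready == 0)
            errno = ETIMEDOUT;
        if (ready <= 0)
            return fail_close(fd);

        /* Writable can also mean "connect failed", so read the pending error. */
        int err = 0;
        socklen_t elen = sizeof(err);
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &elen) < 0)
            return fail_close(fd);
        if (err != 0) {
            errno = err;
            return fail_close(fd);
        }
        n = write(fd, buf, len);
    }
    if (n < 0)
        return fail_close(fd);

    *out_fd = fd;
    return n;
}
```

Note that the early write() buys nothing over a slow link: it just fails and the code ends up on the wait-for-writable path anyway.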

2
IIRC for non-blocking sockets, connect() behaves differently. You should wait (via select or poll) for the socket to become writable. - wildplasser
Obviously the connect completed between the calls. You got lucky. Don't write code like this. There's really no point in doing a connect in non-blocking mode unless you have multiple sockets. Connect in blocking mode and then switch. - user207421
NB There is nothing 'old-fashioned' about your approach using poll(). It is merely correct. Correctness is not a function of time. - user207421
Re your edit, the approach you now mention isn't the same one you're asking about. - user207421

2 Answers

3
votes

Can I write() to a socket just after connect() call, but before TCP connection established?

Sure, you can. It's just likely to fail.

Per the POSIX specification of write():

[ECONNRESET]

A write was attempted on a socket that is not connected.

Per the Linux man page for write():

EDESTADDRREQ

fd refers to a datagram socket for which a peer address has not been set using connect(2).

If the TCP connect has not completed, your write() call will fail.

0
votes

At least on Linux, the socket is marked as not writable until the [SYN, ACK] is received from the peer. This means the system will not send any application data over the network until the [SYN, ACK] is received.

If the socket is in non-blocking mode, you must use select/poll/epoll to wait until it becomes writable (otherwise write calls will fail with EAGAIN and no data will be enqueued). When the socket becomes writable, the kernel has usually already sent an empty [ACK] message to the peer before the application has had time to write the first data, which results in some unnecessary overhead due to the API design.
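A minimal sketch of that wait-for-writable step with epoll (so Linux only) might look like the following. The names `connect_wait_epollout` and `fail_close` are hypothetical, and a real event loop would register the socket with its existing epoll instance instead of creating a throwaway one. The important detail is that a writable socket does not by itself mean the connect succeeded, so SO_ERROR is checked before the socket is used.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close fd but keep the errno that caused the failure. */
static int fail_close(int fd)
{
    int saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: start a non-blocking connect and wait for EPOLLOUT.
 * Returns a connected, non-blocking fd, or -1 with errno set. */
int connect_wait_epollout(const struct sockaddr_in *addr, int timeout_ms)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (fcntl(fd, F_SETFL, O_NONBLOCK) < 0)
        return fail_close(fd);

    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) == 0)
        return fd;                      /* handshake completed immediately */
    if (errno != EINPROGRESS)
        return fail_close(fd);

    /* Throwaway epoll instance, only to keep the sketch self-contained. */
    int ep = epoll_create1(0);
    if (ep < 0)
        return fail_close(fd);

    struct epoll_event ev = {0};
    ev.events = EPOLLOUT;
    ev.data.fd = fd;

    int n = -1;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) == 0) {
        do {
            n = epoll_wait(ep, &ev, 1, timeout_ms);
        } while (n < 0 && errno == EINTR);
        if (n == 0)
            errno = ETIMEDOUT;
    }
    int saved = errno;
    close(ep);
    errno = saved;
    if (n <= 0)
        return fail_close(fd);

    /* The event fires on success and on failure; SO_ERROR tells which. */
    int err = 0;
    socklen_t len = sizeof(err);
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return fail_close(fd);
    if (err != 0) {
        errno = err;
        return fail_close(fd);
    }
    return fd;                          /* now safe to write() */
}
```

Only after this function returns a valid descriptor should the first write() be issued.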

What appears to work is this: after calling connect() on a non-blocking socket and getting EINPROGRESS, set the socket to blocking mode and then start writing data. The kernel then first waits internally until the [SYN, ACK] is received from the peer and sends the application data and the initial ACK in a single packet, which avoids that empty [ACK]. Note that the write call will block until the [SYN, ACK] is received and will return -1 with errno set to e.g. ECONNREFUSED or ETIMEDOUT if the connection fails. This approach does not work in WSL 1 (Windows Subsystem for Linux), however, where the write just fails with EPIPE immediately (no SIGPIPE though).
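As a rough sketch of that trick (not a guaranteed recipe): the function below starts the connect in non-blocking mode, clears O_NONBLOCK while the handshake is still in flight, and then issues a blocking write(). The names `connect_then_blocking_write` and `fail_close` are made up for this example, and the behaviour of the write is assumed to be as observed on Linux above; other systems may behave differently.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: close fd but keep the errno that caused the failure. */
static ssize_t fail_close(int fd)
{
    int saved = errno;
    close(fd);
    errno = saved;
    return -1;
}

/* Hypothetical helper: non-blocking connect, then a blocking first write.
 * Returns the number of bytes written and stores the (now blocking) socket
 * in *out_fd, or returns -1 with errno set (the socket is closed). */
ssize_t connect_then_blocking_write(const struct sockaddr_in *addr,
                                    const void *buf, size_t len, int *out_fd)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return fail_close(fd);

    /* Kick off the handshake; EINPROGRESS is the expected result. */
    if (connect(fd, (const struct sockaddr *)addr, sizeof(*addr)) < 0 &&
        errno != EINPROGRESS)
        return fail_close(fd);

    /* Switch back to blocking mode while the handshake is still running. */
    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) < 0)
        return fail_close(fd);

    /* Assumed Linux behaviour: this blocks until the connection is
     * established, or fails with e.g. ECONNREFUSED or ETIMEDOUT. */
    ssize_t n = write(fd, buf, len);
    if (n < 0)
        return fail_close(fd);

    *out_fd = fd;
    return n;
}
```

Since the first write blocks for up to a full connect timeout, this stalls a single-threaded event loop; it mostly makes sense where that is acceptable, and the socket can be switched back to non-blocking mode afterwards.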

In any case, not much can be done to eliminate this initial round trip, due to the design of TCP. However, if the TCP Fast Open (TFO) feature is supported by both endpoints, and you can accept its security issues, this round trip can be eliminated. See https://lwn.net/Articles/508865/ for more info.