I open a UDP socket (SOCK_DGRAM) and bind it. After sending traffic, the socket is closed. The same code is then used to create a new socket and bind it to the same address and port. That bind fails with errno 98 (EADDRINUSE, address already in use). For TCP this would make sense to me, since the old socket can sit in TIME_WAIT, but for UDP? Why does this happen?
Note that this only happens when the server is busy sending at high rates (e.g. 10 Gb/s). The socket is wrapped in a C++ class that provides RAII. Code below:
Socket::Socket(uint32_t srcIp, uint16_t port)
{
    fd = socket(AF_INET, SOCK_DGRAM, 0);

    // srcIp and port are assumed to already be in network byte order.
    sockaddr_in addr = {};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = srcIp;
    addr.sin_port = port;

    socklen_t len = sizeof(addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), len) != 0) {
        // throw error (errno comes back as 98 / EADDRINUSE here)
    }
}
Socket::~Socket()
{
    // RAII cleanup: close the descriptor (return value not checked here).
    close(fd);
}
Interestingly, after the socket is destroyed and recreated, the socket(AF_INET, SOCK_DGRAM, 0) call returns the same file descriptor value. I take that to mean the OS has already recycled the FD and processed the close. Yet it still rejects the bind. It seems strange to me that bind would behave this way for UDP, since it is connectionless.
I don't want to use SO_REUSEADDR, because I don't want to silently bind to a port that is genuinely in use; I want to know whether I already have a socket listening on that port. If SO_REUSEADDR really is the only way, then how can I tell that the socket has been closed and what state the old UDP socket is in (can a UDP socket even enter something like TIME_WAIT?)? This is a race condition that resolves so quickly that I can't query netstat to see what state the zombie socket is in; by the time I look, it has already disappeared.
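One thing I'm considering as a diagnostic rather than a fix: snapshotting /proc/net/udp at the exact moment bind fails, instead of racing netstat by hand. A minimal sketch of that idea (Linux-specific; dumpUdpTable is my own hypothetical helper name):

#include <cerrno>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: dump the kernel's UDP socket table (the same data
// netstat reads) the instant bind fails, so the zombie socket's state can
// be captured before it disappears.
static void dumpUdpTable()
{
    std::ifstream udp("/proc/net/udp");
    std::string line;
    while (std::getline(udp, line)) {
        std::cerr << line << '\n';
    }
}

// Called from the bind-failure branch in the constructor, e.g.:
//   if (bind(fd, reinterpret_cast<sockaddr*>(&addr), len) != 0) {
//       if (errno == EADDRINUSE) dumpUdpTable();
//       // throw error
//   }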
I did set a gdb breakpoint on the destructor and another on the '// throw error' line. I see the destructor get hit, so I know close was called; the next breakpoint to fire is the one in the constructor, where bind has failed.
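If it helps, I can also log the result of close() directly in the destructor instead of relying on gdb, just to rule out a failed close. Something like this sketch (a variant of the destructor above, error handling simplified):

#include <cerrno>
#include <cstring>
#include <iostream>
#include <unistd.h>

Socket::~Socket()
{
    // Log whether the kernel accepted the close; note this only proves the
    // descriptor was released, not that the port itself is free again.
    if (close(fd) != 0) {
        std::cerr << "close(" << fd << ") failed: " << std::strerror(errno) << '\n';
    }
}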