3
votes

I am developing a program using libcurl. The program creates a thread, which in turn makes an HTTP request using libcurl. Sometimes the program crashes with the error:

unexpected error 9 on netlink descriptor

After that I turned off AsynchDNS in curl, but the problem remains. As I understand it, the assertion is triggered inside getaddrinfo. Is some kind of initialization needed to use getaddrinfo in multi-threaded applications? Or is getaddrinfo not thread-safe in general?

GDB stack trace

libcurl version:

curl 7.67.0 (x86_64-pc-linux-gnu) libcurl/7.67.0 OpenSSL/1.1.0g zlib/1.2.11 libidn2/2.0.4
Release-Date: 2019-11-06
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: HTTPS-proxy IDN IPv6 Largefile libz NTLM NTLM_WB SSL TLS-SRP UnixSockets

glibc version:

ldd (Ubuntu GLIBC 2.27-3ubuntu1) 2.27
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.

1
Don't post a screenshot of all that text. Copy and paste the text into the question. - 1201ProgramAlarm
getaddrinfo() is guaranteed to be thread-safe. In fact, its thread safety is one of the reasons (amongst many) why it is preferred over the gethostby...() functions, which are not guaranteed to be thread-safe. - Remy Lebeau
I wanted to copy, but I only had a screenshot. In a couple of days, the program will crash again, and I will be able to copy the full call stack. - kirill-782
glib and glibc are two different things, btw. - Shawn

1 Answer

1
votes

This is a file descriptor race in the application. The typical scenario for error 9 (EBADF) looks like this:

  1. Thread A closes a file descriptor.
  2. Thread B calls getaddrinfo and opens a Netlink socket. It happens to receive the same descriptor value.
  3. Due to a bug, thread A closes the same file descriptor again. Normally the second close would just fail with EBADF and be harmless, but because the descriptor number has been reused in the meantime, it closes the Netlink socket created by glibc.
  4. Thread B attempts to use the Netlink socket descriptor and receives the EBADF error.

The key to fixing such bugs is figuring out where exactly the double-close happens.