I have a problem where one or more threads deadlock each other, and I don't know what is going on. The debugger either cannot break at all (thread 1), breaks but cannot produce a backtrace (threads 2 and 5), or shows the backtrace (thread 3).

This is what the Debug view in Eclipse shows; native gdb shows the same.

I learned that this happens because libc implements this in assembler and gdb cannot walk the stack correctly. Sometimes (I don't know when) I can do a few steps in the assembly, and then I see the backtrace.

I just tried an x64 build of the program, and there it works.

See my sample code:

#include <time.h>

int main()
{
    while(1)
    {
        struct timespec ts;
        ts.tv_sec = 1;
        ts.tv_nsec = 0;

        clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, 0);
    }
    return 1;
}

gdb output 32 bit:

vagrant@PC41388-spvm-4650:/tmp$ gdb main32
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at: http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main32...(no debugging symbols found)...done.
(gdb) r
Starting program: /tmp/main32
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
^C
Program received signal SIGINT, Interrupt.
0x55579cd9 in ?? ()
(gdb) bt
#0 0x55579cd9 in ?? ()
#1 0x555b0af3 in __libc_start_main (main=0x80484dd, argc=1, argv=0xffffcee4, init=0x8048520 <__libc_csu_init>, fini=0x8048590 <__libc_csu_fini>, rtld_fini=0x55564160 <_dl_fini>, stack_end=0xffffcedc) at libc-start.c:287
#2 0x08048401 in _start ()
(gdb)

gdb output 64 bit:

vagrant@PC41388-spvm-4650:/tmp$ gdb main64
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see: http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at: http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from main64...(no debugging symbols found)...done.
(gdb) r
Starting program: /tmp/main64
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
b ^C
Program received signal SIGINT, Interrupt.
0x00002aaaaafe092a in __clock_nanosleep (clock_id=1, flags=0, req=0x7fffffffdc10, rem=0x2aaaaafe092a <__clock_nanosleep+58>) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:41
41 ../sysdeps/unix/sysv/linux/clock_nanosleep.c: No such file or directory.
(gdb) bt
#0 0x00002aaaaafe092a in __clock_nanosleep (clock_id=1, flags=0, req=0x7fffffffdc10, rem=0x2aaaaafe092a <__clock_nanosleep+58>) at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:41
#1 0x0000000000400630 in main ()
(gdb)

set architecture i386 does not help either.

Update: info proc mappings shows that the 32-bit program is stopped in [vvar], whereas the 64-bit program is stopped in libc. This would explain why gdb can't produce the backtrace.
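To illustrate what that mapping lookup amounts to, here is a minimal sketch in C that resolves an address against /proc/self/maps from inside the process. The helper name mapping_name_for is hypothetical (it is not part of gdb or libc), and the sketch assumes a Linux system where /proc is mounted; it is only meant to show the idea, not to replace info proc mappings.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of gdb or libc): find the line of
 * /proc/self/maps whose address range contains `addr` and copy its
 * mapping name (a file path, "[stack]", "[vdso]", ... or an empty
 * string for anonymous mappings) into `name`.
 * Returns 0 on success, -1 if no mapping contains the address or the
 * file cannot be read. */
int mapping_name_for(const void *addr, char *name, size_t name_len)
{
    char line[4352];
    unsigned long target = (unsigned long)addr;
    int found = -1;
    FILE *f;

    if (name == NULL || name_len == 0)
        return -1;
    f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    while (found != 0 && fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        int path_off = 0;

        line[strcspn(line, "\n")] = '\0';
        /* Line format: start-end perms offset dev inode [pathname] */
        if (sscanf(line, "%lx-%lx %*s %*s %*s %*s %n",
                   &start, &end, &path_off) < 2)
            continue;
        if (target < start || target >= end)
            continue;
        /* path_off is the offset of the pathname column, if any. */
        snprintf(name, name_len, "%s", path_off > 0 ? line + path_off : "");
        found = 0;
    }
    fclose(f);
    return found;
}
```

Passing the address of a local variable should resolve to the stack mapping of the calling thread, while an address that lies in no mapping (such as NULL) makes the function return -1. The names the kernel reports for its own special mappings vary between kernel versions, so the result for such addresses is not something this sketch can promise.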

So my question is: is there a different version of libc where this works? I am using Ubuntu 14.04.

Unless you are debugging system libraries, I don't really get why you would want to step through them? - tofro
I don't want to step through them, but I want to see the call stack. 0x55579cd9 does not tell me whether I am in a sem_wait, a clock_nanosleep, or a pthread_join, for example. - kuga
Yes, debugging a deadlock is horrible. But don't try to debug syscalls; debug your thread logic instead. The error is there. Maybe first reduce the number of threads to just 2. If that works, add threads one at a time until it fails. Then start reasoning about what could happen (or print debug info to the console or a file). - Paul Ogilvie
The problem here is that the deadlock is timing-dependent. The program works fine when I step through it, and probably also when I insert prints. - kuga

1 Answer

I updated to a newer gdb version (currently the latest, 7.12.1). This fixed the problem.

Note that gdb:i386 did not work on Lubuntu x64 either, whereas it worked fine under 32-bit Lubuntu. Also note that both main32 and libc are binary-identical on the 64-bit and 32-bit Lubuntu systems.