I have an application running on Solaris 8 (SunOS 5.8 Generic_108528-27 sun4u sparc SUNW,Sun-Fire-880). It ran fine for several days and then recently crashed. A watchdog module restarts the application whenever it crashes, but after each restart it runs for a while and crashes again. After examining the core dumps, I found that it crashes inside system function calls such as poll, write and send. I examined the contents of the variables passed to those functions and they look valid. I have no idea how to troubleshoot this. Can anyone give some guidance on how to proceed? Thanks in advance.
Below is one of the core dumps, from a crash in poll:
bash$ gdb applx applx.core
GDB is free software and you are welcome to distribute copies of it
under certain conditions; type "show copying" to see the conditions.
There is absolutely no warranty for GDB; type "show warranty" for details.
GDB 4.16 (sparc-sun-solaris2.5), Copyright 1996 Free Software Foundation, Inc...
warning: exec file is newer than core file.
Core was generated by `applx -h'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libsocket.so.1...done.
Reading symbols from /usr/lib/libnsl.so.1...done.
Reading symbols from /usr/lib/libgen.so.1...done.
Reading symbols from /usr/lib/libc.so.1...done.
Reading symbols from /usr/lib/libdl.so.1...done.
Reading symbols from /usr/lib/libmp.so.2...done.
Reading symbols from /usr/platform/SUNW,Sun-Fire-880/lib/libc_psr.so.1...done.
#0 0xff219ec4 in _libc_poll ()
(gdb) bt
#0 0xff219ec4 in _libc_poll ()
#1 0xff1cccac in _select ()
#2 0x1cf08 in loop () at /home/ian123/applx/src/task.c:1450
#3 0x1e0d4 in state_start (local=0) at /home/ian123/applx/src/state.c:1047
#4 0x1a0f4 in main (argc=537600, argv=0x83400)
at /home/ian123/applx/src/main.c:578
(gdb) up
#1 0xff1cccac in _select ()
(gdb) up
#2 0x1cf08 in loop () at /home/ian123/applx/src/task.c:1450
1450 r = select(maxfd, rfdsp, wfdsp, efdsp, tvp);
(gdb) p maxfd
$1 = 23
(gdb) p rfdsp
$2 = (fd_set *) 0xb8020
(gdb) p wfdsp
$3 = (fd_set *) 0x0
(gdb) p efdsp
$4 = (fd_set *) 0x0
(gdb) p tvp
$5 = (struct timeval *) 0xb81a0
(gdb) p *rfdsp
$6 = {fds_bits = {7610424, 0 }}
(gdb) p *tvp
$7 = {tv_sec = 0, tv_usec = 380002}
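
As a sanity check on the select() arguments, the first word of fds_bits printed above (7610424, which is 0x742038) can be decoded into descriptor numbers. This is only a sketch: it assumes a 32-bit process where each fds_bits word holds 32 descriptors and descriptor fd maps to bit (fd % 32) of word (fd / 32). The helper name decode_fd_word is made up for illustration and is not part of any system header.

```c
#include <stddef.h>

/* Assumption: 32 descriptors per fds_bits word (32-bit process),
 * with descriptor fd stored as bit (fd % 32) of word (fd / 32). */
#define BITS_PER_WORD 32

/* Hypothetical helper: lists the descriptors set in one fds_bits word.
 * Stores at most out_cap descriptors in out[] and returns the total
 * number of bits that were set in the word. */
size_t decode_fd_word(unsigned long word, size_t word_index,
                      int *out, size_t out_cap)
{
    size_t n = 0;
    int bit;

    for (bit = 0; bit < BITS_PER_WORD; bit++) {
        if ((word >> bit) & 1UL) {
            if (n < out_cap)
                out[n] = (int)(word_index * BITS_PER_WORD) + bit;
            n++;
        }
    }
    return n;
}
```

Under that assumption, decode_fd_word(7610424UL, 0, fds, 32) reports eight descriptors: 3, 4, 5, 13, 18, 20, 21 and 22. The highest one is 22, which agrees with maxfd = 23 (highest descriptor plus one), so the read set and the nfds argument at least agree with each other.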