I have a program that spawns other processes with execve:
    s32 ret = execve( argv[0], argv.data(), (char* const*) req.posixEnv() );
Then later in a loop I call waitpid to watch for when the process terminates:
    while( 1 )
    {
        readOutputFromChildProcess( pid );
        int status;
        s32 retPid = waitpid( pid, &status, WNOHANG );
        if ( retPid < 0 )
        {
            if ( errno == ECHILD )
            {
                // I don't expect to ever get this error - but I do. Why?
                printf( "Process gone before previous wait. Return status lost.\n" );
                assert(0);
            } else {
                // other real errors handled here.
                handleError();
                break;
            }
        }
        if ( retPid == 0 )
        {
            waitSomeTime();
            continue;
        }
        processValidResults( status );
        break;
    }
I have greatly simplified the code. My understanding is that once you spawn a process, the process table entry remains until the caller calls "waitpid" and gets a return value greater than zero, and a valid return status.
But what seems to happen in some cases is that the process terminates on its own, and when I call waitpid, it returns -1 with errno set to ECHILD.
ECHILD means that at the time I called waitpid there was no child process of mine in the process table with that id. So either my pid was invalid (I've checked carefully, and it is valid), or waitpid had already been called after this process finished, in which case I am unable to get the return code from this process.
The program is multi-threaded. I've also checked that I'm not calling waitpid too early: the error only shows up after several "waits".
Is there any other way a process table entry gets cleaned up without calling waitpid? How can I make sure that I always get the return code?
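For reference, one way an entry can disappear without anyone calling waitpid: on Linux and other XSI-conforming systems, if SIGCHLD is set to SIG_IGN (or SA_NOCLDWAIT is set), terminated children are reaped automatically and waitpid fails with ECHILD. Below is a minimal sketch that reproduces this. The helper name `spawnAndReap` and its parameters are hypothetical and not part of my real program, and whether this is what actually happens in my case is only an assumption.

```cpp
#include <cerrno>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child that exits immediately with `code`,
// then block in waitpid() for it.
// Returns the child's exit code on success, -errno if fork/waitpid fails,
// or -1 if the child did not exit normally.
int spawnAndReap( int code, bool ignoreSigchld )
{
    struct sigaction sa {};
    struct sigaction old {};
    sigemptyset( &sa.sa_mask );
    // SIG_IGN asks the kernel to discard child exit statuses automatically;
    // SIG_DFL keeps the zombie around until someone waits for it.
    sa.sa_handler = ignoreSigchld ? SIG_IGN : SIG_DFL;
    if ( sigaction( SIGCHLD, &sa, &old ) != 0 )
        return -errno;

    int result;
    pid_t pid = fork();
    if ( pid == 0 )
        _exit( code );              // child: terminate right away
    if ( pid < 0 )
    {
        result = -errno;            // fork failed
    }
    else
    {
        int status = 0;
        pid_t r;
        do {
            r = waitpid( pid, &status, 0 );
        } while ( r < 0 && errno == EINTR );

        if ( r < 0 )
            result = -errno;        // e.g. ECHILD when SIGCHLD is ignored
        else if ( WIFEXITED( status ) )
            result = WEXITSTATUS( status );
        else
            result = -1;            // killed by a signal, not a normal exit
    }

    sigaction( SIGCHLD, &old, nullptr );   // restore the previous disposition
    return result;
}
```

With the default disposition, `spawnAndReap( 7, false )` should give back 7. With SIGCHLD ignored, I would expect `spawnAndReap( 7, true )` to return `-ECHILD`, which is the same symptom I am seeing.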
@Explicitly ignoring SIGCHLD:
Ok, so I understand that explicitly ignoring it will cause waitpid() to fail. I don't explicitly ignore it, but I do set some signal handlers to trap crashes in another place like so:
    void kxHandleCrashes()
    {
        struct sigaction sa;
        sa.sa_flags = SA_SIGINFO;
        sa.sa_sigaction = abortHandler;
        sigemptyset( &sa.sa_mask );
        sigaction( SIGABRT, &sa, NULL );
        sigaction( SIGSEGV, &sa, NULL );
        sigaction( SIGBUS, &sa, NULL );
        sigaction( SIGILL, &sa, NULL );
        sigaction( SIGFPE, &sa, NULL );
        sigaction( SIGPIPE, &sa, NULL );
        // Should I add a line like this:
        // sigaction( SIGCHLD, &sa, NULL );
    }
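Rather than installing the crash handler for SIGCHLD (which would presumably run abortHandler every time a child exits), it may be more useful to check what the current SIGCHLD disposition actually is, since some other code or library could have changed it. A small sketch of such a check follows. The function name `childrenAreAutoReaped` is hypothetical, and it only reads the disposition without modifying it.

```cpp
#include <signal.h>

// Hypothetical diagnostic: returns true if the current SIGCHLD settings
// cause terminated children to be reaped automatically, in which case
// waitpid() cannot return their exit status.
bool childrenAreAutoReaped()
{
    struct sigaction cur {};
    // Passing nullptr as the new action only queries the current one.
    if ( sigaction( SIGCHLD, nullptr, &cur ) != 0 )
        return false;

    // SA_NOCLDWAIT: children never become zombies.
    if ( cur.sa_flags & SA_NOCLDWAIT )
        return true;

    // sa_handler is only meaningful when SA_SIGINFO is not set.
    return !( cur.sa_flags & SA_SIGINFO ) && cur.sa_handler == SIG_IGN;
}
```

Calling this right before the fork/execve and again when waitpid fails with ECHILD should show whether the disposition was changed somewhere in between.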
SIGCHLD? - cnicutar