
What I need: My Perl program spawns a few external programs, then monitors them and restarts any that fail for any reason. The Perl program cannot block waiting on the processes it started. I do not care about the STDOUT or STDERR of the spawned programs and would like to discard their output completely.

My problems so far: First, I don't know much beyond the basics of the process abstraction and process management. Different scenarios in which my Perl program or its children terminate leave 'zombies' (programs that were once children of my Perl program are adopted by init).

Using open to create the external child programs:

# please note I have simplified the shell commands for this example
$SIG{CHLD}='IGNORE';
# separate handles, so the second open does not redeclare the first
open(my $sc_ph, "-|", "./sc_serv --options") or die $!;
open(my $vlc_ph, "-|", "(ffmpeg --opt1 --opt2) | vlc --many-options --etc") or die $!;

If you look at the output of pstree -p 6947 below, you can see that my Perl program created two child processes, sc_serv(6948) and sh(6949). The shell sh(6949) was created because ffmpeg(6950) is piped into vlc(6952) (at least I believe this is why a shell was needed; I am not 100% sure).

perl(6947)─┬─sc_serv(6948)─┬─{sc_serv}(6964)
           │               ├─{sc_serv}(6965)
           │               └─{sc_serv}(6966)
           └─sh(6949)─┬─ffmpeg(6950)
                      └─vlc(6952)─┬─{vlc}(6980)
                                  ├─{vlc}(6983)
                                  └─{vlc}(6985)

Now if I kill perl(6947) the entire family tree will be properly terminated and cleaned up, though killing sh(6949) will leave its children to init creating a bit of a mess.

Please note that if I do not set $SIG{CHLD} to IGNORE, killing sh(6949) will leave its children to init and will also make sh(6949) a defunct process. I do not understand why this is, so:

Question 1. Why do I have to set $SIG{CHLD} to IGNORE in this situation?

Question 2. How do I make sure that if I need to kill sh(6949) that all of its children will be killed as well?


I have also tried using fork along with system or exec. The outcome of fork has been very similar to the outcome of using open. Here is a sub using system I have tried.

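sub start_vlc {

  my $pid = fork();
  die "fork failed: $!" unless defined $pid;
  if( $pid == 0 ) {
    # child: drop its output, then run the pipeline through the shell
    close STDOUT;
    close STDERR;
    system "(ffmpeg --opt1 --opt2) | vlc --many-options --etc";
    # exit here, otherwise the child falls through into the parent's code
    exit( $? >> 8 );
  }
  else {
    return $pid;
  }

}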

This sub ends up with almost the same problems as above, but now the direct children are cloned Perl processes. Killing perl(25991) leaves all of its children to init, creating the same mess as above. Killing perl(25982), the parent of all of the processes, does not kill all of its children properly either. It kills some but not all, creating more of a mess than my attempts with open.

perl(25982)─┬─perl(25989)───sc_serv(25990)─┬─{sc_serv}(25992)
            │                              ├─{sc_serv}(25993)
            │                              └─{sc_serv}(25994)
            └─perl(25991)───sh(25995)─┬─ffmpeg(25996)
                                      └─vlc(25997)─┬─{vlc}(26042)
                                                   ├─{vlc}(26044)
                                                   └─{vlc}(26046)

Question 3. How can I use fork along with system or exec to spawn external programs and have control over their termination without leaving orphans to init?

Please be as detailed as you would like. I am not opposed to trying modules, but I would rather learn how to do this without them to better my understanding of process control with Perl.

references :

run process in background without being adopted by init, Perl

How can i get process id of UNIX command i am triggering in a Perl script?

Have you tried threads? – amphetamachine
@amphetamachine, No, I have not tried threads. I have come across them and will look into them more, but as of now I wouldn't really know how to use them. Any basic examples of how to achieve my goals using threads? – BryanK
fork and open will work similarly, because they're doing the same thing: 'open' is basically doing a fork-exec. – Sobrique
@Sobrique, yes, I know this, but there are some differences in the way you must handle cleaning up children, which is what I am having trouble understanding. – BryanK
I do not understand your entire question, but maybe you need to use setsid so you can control your children using process groups. – Mark Setchell

1 Answer


Question 1. Why do I have to set $SIG{CHLD} to IGNORE in this situation?

Setting it to IGNORE tells the system that the parent is not going to call wait or waitpid, so children that exit are cleaned up immediately instead of hanging around to be reaped. It means no zombies. Importantly, the setting is also inherited by children when you fork(), so your sh process inherits it.

With IGNORE set, your sh disappears from the process table as soon as it is killed, and its children are reparented to init. If you don't set it, the sh stays as a zombie until the parent reaps it with wait() and collects the return code. You may want this, because your parent can then detect the exit condition and do any cleanup it needs to.
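
A minimal sketch of that second option, assuming SIGCHLD is left at its default: the parent polls with a non-blocking waitpid, so it never has to stop and wait for a child. The helper name reap_children and the stand-in child `sh -c 'exit 3'` are illustrative assumptions, not part of the original program.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # provides WNOHANG

# Hypothetical helper: collect every child that has already exited,
# without blocking. Returns a hash of pid => exit code.
sub reap_children {
    my %exited;
    while ( ( my $pid = waitpid( -1, WNOHANG ) ) > 0 ) {
        $exited{$pid} = $? >> 8;
    }
    return %exited;
}

# Stand-in child for the demonstration: a shell that exits with code 3.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ( $pid == 0 ) {
    exec 'sh', '-c', 'exit 3' or die "exec failed: $!";
}

# The parent stays free to do other work and polls from time to time.
my %status;
until (%status) {
    %status = reap_children();
    select( undef, undef, undef, 0.1 );    # pause 100 ms between polls
}
print "child $pid exited with code $status{$pid}\n";
```

In a monitoring loop, the PIDs returned by the helper are the ones that would need restarting.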

Question 2. How do I make sure that if I need to kill sh(6949) that all of its children will be killed as well?

Either: kill a negative process ID, which sends the same signal to the whole process group (this only singles out the sh subtree if sh leads its own group, for example after setsid or setpgrp in the child). Or: use a signal handler to trap a signal, such as SIGHUP or SIGUSR1, and have that handler propagate the signal to the appropriate child processes. Potentially a combination of both. (See: Best way to kill all child processes)
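
The negative-PID approach could be sketched as follows. This assumes each spawned command is put in its own process group first; spawn_group is a hypothetical helper name, and the `sleep 30 | sleep 30` pipeline stands in for the real ffmpeg/vlc command.

```perl
use strict;
use warnings;

# Hypothetical helper: start a shell command in its own process group.
sub spawn_group {
    my ($cmd) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        setpgrp( 0, 0 );    # child becomes leader of a new group (pgid == pid)
        exec 'sh', '-c', $cmd or die "exec failed: $!";
    }
    # Set the group from the parent as well, so it exists before we signal it.
    # This may fail harmlessly if the child has already called exec.
    setpgrp( $pid, $pid );
    return $pid;
}

# Stand-in pipeline: the shell and both sleeps share one process group.
my $pid = spawn_group('sleep 30 | sleep 30');

# A negative PID addresses the whole process group, not just the shell.
my $signalled = kill 'TERM', -$pid;

waitpid( $pid, 0 );         # reap the shell so it does not linger as a zombie
my $term_signal = $? & 127; # low 7 bits hold the terminating signal number
print "group $pid ended by signal $term_signal\n";
```

Because the group ID equals the shell's PID, the monitor only has to remember one number per spawned pipeline.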

Question 3. How can I use fork along with system or exec to spawn external programs and have control over their termination without leaving orphans to init?

What are you trying to avoid? All that 'reparenting' a process means is that init will automatically clean it up when it exits and goes zombie. (You can avoid the zombie stage by setting $SIG{'CHLD'} to 'IGNORE'.)
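
For the fork-plus-exec part of the question, here is a rough sketch under a few assumptions: the child redirects its output to /dev/null rather than closing it, and exec replaces the forked perl, so the PID returned to the parent is the external program itself with no intermediate perl process. The helper name spawn_quiet and the stand-in command are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: fork, silence the child's output, then exec.
sub spawn_quiet {
    my @cmd = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ( $pid == 0 ) {
        # Point both streams at /dev/null so descriptors 1 and 2 stay valid.
        open STDOUT, '>', '/dev/null' or die "cannot redirect STDOUT: $!";
        open STDERR, '>', '/dev/null' or die "cannot redirect STDERR: $!";
        exec @cmd or exit 127;    # only reached if exec itself fails
    }
    return $pid;
}

# Stand-in command: its output is discarded, its exit code is still visible.
my $pid = spawn_quiet( 'sh', '-c', 'echo discarded; exit 0' );
waitpid( $pid, 0 );
my $exit_code = $? >> 8;
print "child $pid finished with exit code $exit_code\n";
```

The blocking waitpid here is only to keep the example short; a monitor that must not block would poll with WNOHANG instead.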

As someone mentioned in the comments, it may be worth looking at threads rather than fork. It's a different model of IPC: probably less efficient, but it makes certain types of programming a lot easier and clearer, specifically any that use shared-memory operating models.