I've done an extensive amount of searching and experimenting trying to solve this on my own, and I am coming up short of an answer.

Background:

I am writing a script using Tcl version 8.4 and the Expect package extension version 5.45. I am learning Expect and happened across the "Background Processing" chapter (chapter 17 of Exploring Expect). On pages 374-375 it describes the "fork" command, which turned out to provide exactly the functionality I was trying to achieve. I need to run the first part of the script, where user input may or may not be required. From that point I want the terminal to return control to the shell (as if the script process had been backgrounded) while the script continues on to start up another tool and await its exit before exiting itself. Once that is complete, the script process should die normally.

The problem:

At first glance, everything seems to work with my usage of Expect's fork command. I get a parent process and a child process. The parent process exits as instructed and terminal control returns to the shell. The child process continues on to start up the tool it is supposed to. The problem is that once that tool exits, the child process should also exit, but it doesn't. The process remains with an "S" (sleep) status. FYI, I was originally using the Expect command "disconnect" for the child, but I realized that I did not in fact want the child to be disconnected completely. In my experimenting, disconnect did not seem to have any effect on whether the child process remained after exiting.

The experiment:

I tried to boil this down to the absolute basics. So this is not my script, but a simplified test case that is exhibiting identical behavior. I have to be missing something. This can't be the normal behavior. I need help figuring out exactly what I am doing wrong that is causing this behavior.

The simplified script:

puts "Tcl version   : [info tclversion]"
puts "Expect version: [exp_version]"


while {1} {
   # If forking fails, retry every 10 seconds until it succeeds.
   if {[catch fork child_pid] == 0} {
      break
   }
   sleep 10
}

sleep 10

# Kills the parent process to return terminal control to shell
if {$child_pid != 0} {
   puts "[pid] Parent process exiting..."
   exit
}

# Redefine exit procedure for child so it kills the process for sure on exit
# I have no idea why exit doesn't work for a child process, but this seems to ensure it goes away on exit.
#exit -onexit {
#   puts "[pid] Killing PID..."
#   exec kill [pid]
#}

puts "[pid] Child process sleeping for 10 seconds..."
sleep 10

puts "[pid] Child process waking up and exiting..."
exit

Run output:

:> expect temp_fork
Tcl version   : 8.5
Expect version: 5.45
31483 Child process sleeping for 10 seconds...
31472 Parent process exiting...

:> 31483 Child process waking up and exiting...

In another shell I ran ps u at three points to show what was going on with the processes. The first was during the initial 10-second sleep, when both the parent and child processes are alive. The second was after the parent process had exited, while the child process was sleeping for 10 seconds. The third was after the child process had supposedly exited, yet the process still appears to be alive and sleeping for some reason.

1st:

:> ps u | grep expect
user   31472  0.2  0.0   6196  2524 pts/2    Sl+  19:33   0:00 /tools/oss/packages/x86_64-rhel5/expect/default/bin/expect temp_fork
user   31483  0.0  0.0   6196  1456 pts/2    S+   19:33   0:00 /tools/oss/packages/x86_64-rhel5/expect/default/bin/expect temp_fork

2nd:

:> ps u | grep expect
user   31483  0.0  0.0   6196  1576 pts/2    S    19:33   0:00 /tools/oss/packages/x86_64-rhel5/expect/default/bin/expect temp_fork

3rd:

:> ps u | grep expect
user   31483  0.0  0.0   6196  1700 pts/2    S    19:33   0:00 /tools/oss/packages/x86_64-rhel5/expect/default/bin/expect temp_fork
While continuing to try to solve this, I came across this: RHEL5 bug. I thought this had to be the reason, but after having a system admin verify that the Tcl version I am using does not have threading enabled, I am once again stuck with my script remaining as a sleeping process when it should exit normally. - fnJeff
You will also notice in the original post that I included a commented-out section of code redefining the exit handler to exec "kill [pid]" instead of using exit. This does seem to make it work, but it seems like a pretty dirty way of doing it. I just don't see how this behavior would be considered normal, or why no one else has run into it before. That is why I still think I must be doing something wrong. - fnJeff
I also found an ancient forum post describing something very nearly identical, but alas with no solution. - fnJeff
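For anyone repeating the threading check mentioned in the first comment, here is a minimal sketch, assuming it is run with the same expect binary as the script (the file name check_threads is only a placeholder). As far as I know, Tcl sets tcl_platform(threaded) only in thread-enabled builds, so the test below treats a missing entry as "not threaded".

```tcl
# Report whether the Tcl interpreter embedded in this expect binary
# was built with thread support.
# Assumption: tcl_platform(threaded) exists only in thread-enabled builds.
if {[info exists ::tcl_platform(threaded)] && $::tcl_platform(threaded)} {
   puts "Tcl [info patchlevel]: threads enabled"
} else {
   puts "Tcl [info patchlevel]: threads not enabled"
}
```

Running it as "expect check_threads" rather than through a separate tclsh means the result reflects the interpreter that Expect actually uses.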

1 Answer


You should exit from your child process somewhere in the program after the child's job is done. I don't see you doing that anywhere. I only see exit called in the parent process.

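while {1} {
   # If forking fails, retry every 10 seconds until it succeeds.
   if {[catch fork child_pid] == 0} {
      # fork returns 0 in the child and the child's PID in the parent.
      if {$child_pid == 0} {
         # do your job with child process
         exit
      }
      # Parent: fork succeeded, so stop retrying.
      break
   }
   sleep 10
}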