0
votes

I'm facing a peculiar situation.

I have an MPI program that creates 16 MPI processes, launched as mpirun -np 16 a.out

Now I want all 16 processes to run for a fixed time, say 60 seconds, after which they should all report their results to a common process (say the one with rank 0).

So I will be doing a gather from process with rank 0 after 60 seconds. Now, how do I ensure that all processes stop after 60 seconds?

Pseudocode:

/*All processes (except 0) are doing the following:*/
while (1) {
  MPI_Send (to process 0)
  MPI_Recv (from process 0)
}

/*Process 0 roughly does the following:*/
while(1) {
  MPI_Recv (from any other process)
  Process the request
  MPI_Send (back to clients)
}

/* After 60 seconds, stop all processes and gather results at Process 0. */
1. Catch a SIGALRM signal after 60 secs.
2. Do a dummy MPI_Irecv(any source) to ensure that any client blocked in MPI_Send() is woken up.
3. Now do an MPI_Send to all clients with a special value in the buffer telling them to terminate.
4. MPI_Gather from all clients.

Process 0 acts like a server and the rest are clients.

I tried using signal handling (SIGALRM) but the documentation says that signal handling is unsafe with MPI.

If signals cannot be used, then how do we handle this?

2
I have been through the above link. I'm catching the SIGALRM signal on process 0 after 60 secs. But the program randomly deadlocks (sometimes it works). I'm not sure what's actually happening. - paratrooper
You can broadcast the date of the conference in advance using MPI_Bcast() and then let each processor look after its own timetable. Just like in real life... Oops, I forgot I had an MPI_Reduce() today! - francis
The link tells you "Using signals in your MPI application in general is not safe", yet you still ask why you are having trouble when using signals? - Zulan
I know that signals aren't safe, but I'm not sure why using signals isn't safe. And if we can't use signals, then how do we handle this scenario? - paratrooper

2 Answers

1
votes

I believe it was demonstrated by Leslie Lamport that there is no absolute time in a distributed system. Analogous to special relativity, each process has a relative time from its own standpoint. That said, if you want to stop approximately 60 seconds after the program start (from an external observer standpoint), only one process should monitor the clock and decide to stop.

Considering what you said, process 0 seems to be the ideal candidate to do so. Since you can't use SIGALRM (and I don't believe any other asynchronous method is really appropriate for a tightly synchronous MPI application like yours), my suggestion is to check the system time right after each MPI_Recv in process 0. Subtract the time the process started from the current time; if the difference is greater than 60 seconds, process 0 tells all other processes to stop via MPI_Send.
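
A minimal sketch of that elapsed-time check, assuming POSIX gettimeofday is available. The helper names elapsed_seconds and deadline_reached are made up for illustration and are not part of MPI:

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/time.h>

/* Seconds elapsed between two gettimeofday() samples. */
double elapsed_seconds(const struct timeval *start, const struct timeval *now)
{
    return (double)(now->tv_sec - start->tv_sec)
         + (double)(now->tv_usec - start->tv_usec) / 1e6;
}

/* True once at least `limit` seconds separate `start` from the current time. */
bool deadline_reached(const struct timeval *start, double limit)
{
    struct timeval now;
    gettimeofday(&now, NULL);
    return elapsed_seconds(start, &now) >= limit;
}
```

Process 0 would sample start once before entering its loop and call deadline_reached(&start, 60.0) after each MPI_Recv. Since gettimeofday follows the wall clock, which can be adjusted while the program runs, MPI_Wtime() is a possible alternative inside an MPI program.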

EDIT: Now that I understand that process 0 responds to each request individually, the procedure should be a little different.

After each MPI_Recv on process 0, check whether 60 seconds have elapsed since the beginning of execution. If so, tell the current client process to quit, then break out of the loop and do something like this:

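/* One client has already been told to quit, so 14 of the 15 clients remain. */
for (int i = 2; i < 16; ++i) {
    MPI_Status s;
    /* Wait for the next pending request from any remaining client. */
    MPI_Recv(buf, count, datatype, MPI_ANY_SOURCE, tag, comm, &s);

    /* Answer it with the quit message instead of a normal reply. */
    MPI_Send(message_to_quit, count, datatype, s.MPI_SOURCE, tag, comm);
}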

This way, process 0 waits for a request from every other process and tells it to quit before quitting itself.

0
votes

The suggestion given by Ivella worked, except that I had to make one more change.

In process 0:

Once the 60 seconds have expired (computed using gettimeofday), break out of the while loop and do the following:

  1. Enter another loop for about 5 seconds in which process 0 continuously probes with MPI_Iprobe to check whether any client is waiting in MPI_Send.

  2. If MPI_Iprobe sets the flag to true, issue an MPI_Recv on process 0 so that the waiting client comes out of MPI_Send and waits for a response in MPI_Recv.

  3. At this point, send a special character to each client announcing termination.

  4. Now all processes execute MPI_Reduce with the target set to process 0, after which they all terminate.

Pseudocode after 60 seconds expiry:

timeout = 5 secs

while (time elapsed in this loop < timeout) {

   MPI_Iprobe(MPI_ANY_SOURCE, tag, comm, &flag, &status)

   /* Receive any pending request so that the waiting client is unblocked from MPI_Send. */
   if (flag != 0) {
       MPI_Recv(buf, count, datatype, status.MPI_SOURCE, tag, comm, &status);
   }
}

Now issue an MPI_Send to each client announcing termination, followed by MPI_Reduce (or MPI_Gather), and exit.