5
votes

The problem is to search for a password in a large file (about 10 GB) using MPI. I split the file among the processes in chunks of (total number of bytes in file / P), where P is the number of processes, and each process runs my search logic over its own chunk in a loop, in parallel. I want to stop the other processes as soon as one process finds a solution.
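To make the chunking concrete, here is a minimal sketch in C of how each rank's byte range could be computed. The helper name `chunk_range` is hypothetical (it is not from the original post), and the sketch assumes that any leftover bytes are handed out one each to the lowest ranks so the whole file is covered.

```c
#include <stdint.h>

/* Hypothetical helper: computes the half-open byte range [*start, *end)
   that `rank` should scan when `file_size` bytes are split across
   `nprocs` processes. */
static void chunk_range(int64_t file_size, int nprocs, int rank,
                        int64_t *start, int64_t *end)
{
    int64_t base  = file_size / nprocs;  /* bytes every rank gets        */
    int64_t extra = file_size % nprocs;  /* leftover bytes to distribute */

    /* The first `extra` ranks each take one additional byte. */
    *start = rank * base + (rank < extra ? rank : extra);
    *end   = *start + base + (rank < extra ? 1 : 0);
}
```

If the file holds one candidate per line, a candidate may straddle a chunk boundary, so each rank would probably need to read slightly past its `end` to finish its last line.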

To stop all the other processes, I am trying the following two approaches.

  1. The first approach is to call MPI_Abort() from a process as soon as it finds the solution.
  2. The second approach is to use a flag that is set when any process finds the solution. The flag is then sent to all the other processes using the non-blocking send/recv/Iprobe functions, and each process checks it inside its loop with if(flag == 1) break;

My first question is: which of the two approaches is better, and why?
My second question: when I used the second approach, I got the following messages after the processes had completed their execution successfully:

*** An error occurred in MPI_Finalize
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (goodbye)
[abc:19150] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!

*** An error occurred in MPI_Finalize
*** after MPI was finalized
*** MPI_ERRORS_ARE_FATAL (goodbye)
[abc:19151] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!

1
The first approach is far from elegant, but it can be practical as a quick & dirty solution. To debug your second approach, you would have to provide a more detailed problem description, e.g. some code, which MPI implementation you use, and your setup. - Zulan
@Zulan OK sir, from a performance point of view, which one is better, and why do the error messages appear with the second approach? - Gopal
@jonathandursi Sir, please give me some clue about this!!! - Gopal
The first approach will generally have good performance (no overhead during the search); the second approach can also be implemented with insignificant overhead (though it might not be that straightforward). For the error, you need to provide context. - Zulan
@Zulan "For the error, you need to provide context": I am not able to stop the error. What does context mean? - Gopal

1 Answer

2
votes

MPI_Abort is intended for abnormal job termination. The standard says:

int MPI_Abort(MPI_Comm comm, int errorcode)

This routine makes a "best attempt" to abort all tasks in the group of comm. This function does not require that the invoking environment take any action with the error code.

So it really should only be used to bail out of an MPI job as a last resort, not as a normal exit flag.

For the second problem, check whether any process is somehow calling MPI_Finalize twice. Also, once MPI_Finalize has been called, no other MPI functions can be used, apart from a few query routines such as MPI_Initialized and MPI_Finalized.