2
votes

The documentation for both MPICH and Open MPI mentions that only a minimal amount of work should be done before MPI_Init or after MPI_Finalize:

The MPI standard does not say what a program can do before an MPI_INIT or after an MPI_FINALIZE. In the MPICH implementation, you should do as little as possible.

What is the reason behind this?

To me it seems perfectly reasonable for processes to do a significant amount of calculations before starting the communication with each other.


2 Answers

3
votes

I believe it was worded like that to allow MPI implementations that spawn their ranks within MPI_Init. That means not all ranks are technically guaranteed to exist before MPI_Init. If you had opened file descriptors or done other things with side effects on the process state before that point, it would become a huge mess.

AFAIK, no major current MPI implementation does that. Nevertheless, an MPI implementation might rely on this requirement for other tricks.

EDIT: I found no evidence of this and only remember it from way back, so I'm not sure about it. I can't seem to find the wording you quoted from MPICH in the MPI standard itself. However, the MPI standard does regulate which MPI functions you may call before MPI_Init:

The only MPI functions that may be invoked before the MPI initialization routines are called are MPI_GET_VERSION, MPI_GET_LIBRARY_VERSION, MPI_INITIALIZED, MPI_FINALIZED, and any function with the prefix MPI_T_.

2
votes

The MPICH documentation for MPI_Init gives some hints:

The MPI standard does not say what a program can do before an MPI_INIT or after an MPI_FINALIZE. In the MPICH implementation, you should do as little as possible. In particular, avoid anything that changes the external state of the program, such as opening files, reading standard input or writing to standard output.

BTW, I would not expect MPI_Init itself to do any communication. That would happen later.

And the mpich/init.c implementation is free software; you can study its source code and understand that it is initializing some timers, some threads, etc... (and that should indeed happen really early).

To me it seems perfectly reasonable for processes to do a significant amount of calculations before starting the communication with each other.

Of course, but these calculations should happen after MPI_Init (and before the first MPI_Send etc.).

On some supercomputers, MPI might use dedicated hardware (like InfiniBand, Fibre Channel, etc.), and there might be hardware or operating system reasons to initialize it very early. So it makes sense to call MPI_Init very early. BTW, MPI_Init is also given pointers to the arguments of main (argc and argv), and I guess that it may modify them before your main processes them further. So the call to MPI_Init should probably be the first statement of your main.