
The question is whether this procedure is correct.

Let's say I want to split all p processes in MPI_COMM_WORLD into several non-overlapping communicators. I do this by determining the m groups on one master rank. The master rank determines which processes need to communicate with each other and which don't (basically by analyzing a graph).

In the end the master rank has determined m groups of process ranks from which I want to build m communicators. Let's say I send these m rank groups to all processes.

As far as I have found out, all processes in MPI_COMM_WORLD need to construct all of these groups with the MPI_Group_* routines (even if they don't belong to the group, which seems wasteful) and afterwards create each communicator collectively with MPI_Comm_create, roughly as in the sketch below.
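Something like this is what I mean (m, group_sizes[g] and group_ranks[g] are just placeholders for the rank lists received from the master):

    /* Every process builds all m groups and takes part in all m
       collective MPI_Comm_create calls over MPI_COMM_WORLD. */
    MPI_Group world_group;
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);

    MPI_Comm my_comm = MPI_COMM_NULL;
    for (int g = 0; g < m; ++g) {
        MPI_Group grp;
        MPI_Group_incl(world_group, group_sizes[g], group_ranks[g], &grp);

        MPI_Comm comm;
        MPI_Comm_create(MPI_COMM_WORLD, grp, &comm); /* collective */
        MPI_Group_free(&grp);

        /* Processes that are not in the group get MPI_COMM_NULL back. */
        if (comm != MPI_COMM_NULL)
            my_comm = comm;
    }
    MPI_Group_free(&world_group);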

The question now is: what does a process do with all the communicators to which it does not belong and which it does not need, but which it had to help create by calling the collective MPI_Comm_create m times?

Can we simply ignore all the MPI_Comm handles and only store the one to which this process belongs, without ever calling MPI_Comm_free? How should I free the communicators if I later want to rebuild another set of them? Should a process only free the communicator to which it belongs?

Could somebody explain whether this procedure is correct or not? I am a bit unsure about these collective creation routines...

Thanks a lot!


1 Answer


It sounds like you might be doing things the hard way.

The best way to split a large communicator into many non-overlapping subcommunicators is to use MPI_COMM_SPLIT. The prototype looks like this:

int MPI_Comm_split(MPI_Comm comm, int color, int key, MPI_Comm *newcomm)

Every process that will be in the same subcommunicator contributes the same color. The key is then used to order the ranks within the subcommunicator; you can use the original rank if you want to maintain the ordering.

The only thing you'll need to do before you call MPI_COMM_SPLIT is to broadcast the group numbers from the master process to everyone else using MPI_BCAST.
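Here is a minimal, self-contained sketch of that pattern; the color assignment computed on the master is just a stand-in for your real graph analysis:

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int world_rank, world_size;
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
        MPI_Comm_size(MPI_COMM_WORLD, &world_size);

        /* Master computes one group number ("color") per process.
           The values here are placeholders for the real analysis. */
        int *colors = malloc(world_size * sizeof(int));
        if (world_rank == 0) {
            for (int i = 0; i < world_size; ++i)
                colors[i] = i % 2;
        }

        /* Broadcast the whole color array so each process can look up its own. */
        MPI_Bcast(colors, world_size, MPI_INT, 0, MPI_COMM_WORLD);
        int color = colors[world_rank];
        free(colors);

        /* One collective call creates all subcommunicators at once; each
           process gets back only the one it belongs to. Using the world
           rank as the key preserves the original ordering. */
        MPI_Comm sub_comm;
        MPI_Comm_split(MPI_COMM_WORLD, color, world_rank, &sub_comm);

        int sub_rank, sub_size;
        MPI_Comm_rank(sub_comm, &sub_rank);
        MPI_Comm_size(sub_comm, &sub_size);
        printf("world rank %d -> color %d, sub rank %d of %d\n",
               world_rank, color, sub_rank, sub_size);

        MPI_Comm_free(&sub_comm);
        MPI_Finalize();
        return 0;
    }

Note that with this approach each process only ever holds the communicator it actually belongs to, so there is nothing left over to free except that one handle.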