I'm working with MPI programs on an SMP supercomputer. I would like to identify which processes share the same node, for example by setting an integer key that is equal in all processes on the same node and differs from one node to another. The goal is then to use this key to split a communicator into sub-communicators, each gathering only the processes on the same node.

So the function would look like

int identify_node(MPI_Comm* comm); // returns a key characterizing a node

Assuming a simple distribution of processes such as 0,1,2,3 on node_1, 4,5,6,7 on node_2, etc., this is a matter of a simple formula, but I would like to achieve the same result with no assumption about the distribution.

I have two ideas for doing this with MPI_Get_processor_name: compute a hash of the name and assume that no two names produce the same hash (I don't like this, because if two names ever do collide, the problem will be difficult to track down), or use some kind of agreement algorithm across processes (which one? I don't know yet).

How would you do that (efficiently if possible)?

Matthieu

1 Answer

You're right that an assumption on the distribution would be unwise, since rank reordering is actually an up-and-coming technique for improving performance at the cost of that regularity.

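A good hashing algorithm applied to the name returned by MPI_Get_processor_name should be fairly safe, since a collision between two node names is unlikely. If you want to double-check, you can gather the actual names within each sub-communicator using MPI_Gatherv and compare them directly; any mismatch means two different nodes ended up with the same key.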
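If comparing names directly is acceptable anyway, the hash can be skipped entirely. The following is a minimal sketch, not a definitive implementation: it assumes every rank has already collected all processor names into one buffer of fixed-width, NUL-terminated slots (for instance with MPI_Allgather of MPI_CHAR on zero-initialized buffers of MPI_MAX_PROCESSOR_NAME bytes). The helper name node_key_from_names and its parameters are hypothetical, not part of MPI. It returns the lowest rank that reports the same name, so the key is equal on one node and different across nodes with no possibility of collision.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not an MPI function).
 * names  : nprocs consecutive slots of `stride` bytes, each holding a
 *          NUL-terminated processor name (assumed to be gathered beforehand).
 * Returns the lowest rank whose name equals the name of `rank`. */
int node_key_from_names(const char *names, int nprocs, int stride, int rank)
{
    assert(names != NULL && stride > 0);
    assert(rank >= 0 && rank < nprocs);

    const char *mine = names + (size_t)rank * (size_t)stride;

    /* Scan only the lower ranks: the first match is the node's key. */
    for (int r = 0; r < rank; ++r) {
        const char *other = names + (size_t)r * (size_t)stride;
        if (strncmp(other, mine, (size_t)stride) == 0)
            return r;
    }
    return rank; /* no lower rank shares this name */
}
```

The returned value is non-negative, so it can be passed as the color argument of MPI_Comm_split, with the original rank as the key argument to preserve ordering inside each node. The price is that every rank stores all names and does up to one comparison per lower rank, which may matter at very large process counts.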

It seems this question addresses the same concerns.