I am trying to send a message from one MPI process to all other processes and also receive a message from each of those processes. It is basically an all-to-all communication where every process sends a message to every other process (except itself) and receives a message from every other process.
The following example code snippet shows what I am trying to achieve. The problem with MPI_Send is that for small messages it behaves as if it were non-blocking, but for larger messages (on my machine, BUFFER_SIZE 16400) it blocks. I am aware that this is how MPI_Send behaves. As a workaround, I replaced the code below with a combined blocking send and receive, MPI_Sendrecv. The call looks like this:
MPI_Sendrecv(intSendPack, BUFFER_SIZE, MPI_INT, processId, MPI_TAG, intReceivePack, BUFFER_SIZE, MPI_INT, processId, MPI_TAG, MPI_COMM_WORLD, MPI_STATUS_IGNORE)
I make this call inside a loop over every rank of MPI_COMM_WORLD, and this approach gives me what I am trying to achieve (all-to-all communication). However, it takes a lot of time, which I want to cut down with a more time-efficient approach. I have tried MPI scatter and gather to perform the all-to-all communication, but one issue is that the buffer size (16400) may differ between iterations of the MPI_all_to_all function call in the actual implementation. Also, I am using MPI_TAG to differentiate the calls in different iterations, which I cannot do with the scatter and gather functions.
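#include <mpi.h>
#define BUFFER_SIZE 16400

void MPI_all_to_all(int MPI_TAG)
{
    int size;
    int rank;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int* intSendPack = new int[BUFFER_SIZE]();
    int* intReceivePack = new int[BUFFER_SIZE]();
    // Every rank first sends to all other ranks ...
    for (int prId = 0; prId < size; prId++) {
        if (prId != rank) {
            MPI_Send(intSendPack, BUFFER_SIZE, MPI_INT, prId, MPI_TAG,
                     MPI_COMM_WORLD);
        }
    }
    // ... and only then receives from them (each receive overwrites the last).
    for (int sId = 0; sId < size; sId++) {
        if (sId != rank) {
            MPI_Recv(intReceivePack, BUFFER_SIZE, MPI_INT, sId, MPI_TAG,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
    }
    delete[] intSendPack;
    delete[] intReceivePack;
}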
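For the MPI_Sendrecv workaround mentioned above, one common way to keep all ranks busy at the same time is a shifted (ring) schedule: in step s, each rank sends to (rank + s) % size and receives from (rank - s + size) % size. The sketch below only computes that partner list; the helper name exchange_schedule is hypothetical and not part of MPI or of the original code.

```cpp
#include <utility>
#include <vector>

// Hypothetical helper (not an MPI function): for one rank, list the
// (sendTo, recvFrom) partners of each of the size-1 steps of a shifted
// (ring) exchange schedule.
std::vector<std::pair<int, int>> exchange_schedule(int rank, int size)
{
    std::vector<std::pair<int, int>> steps;
    for (int s = 1; s < size; ++s) {
        int sendTo   = (rank + s) % size;         // partner this rank sends to
        int recvFrom = (rank - s + size) % size;  // partner this rank receives from
        steps.emplace_back(sendTo, recvFrom);
    }
    return steps;
}
```

Each pair would supply the destination and source arguments of one MPI_Sendrecv call. In any given step, the rank that is sent to expects its message from exactly the sender, so the calls match up without ranks waiting on one another in rank order.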
I want to know whether there is a more efficient communication model for performing this all-to-all communication. I am not tied to MPI_Send; if some other approach gives me what I am trying to achieve, I am happy with that. Any help or suggestion is much appreciated.
? – AlexG
MPI_Send is blocking. I suspect what you're referring to is buffering, but not sure. You shouldn't use point-to-point communication instead of global communication because it's most likely going to be less efficient (except if you're planning on something special/dedicated but keep in mind it's tricky). Keeping in mind the referenced post, what is it actually that you want to do? Just send around and overwrite arrays? Or something else? – atru
MPI_Bcast()) but your snippet keeps overwriting the receive buffer. Can you please update your code to clarify what you are trying to achieve? It seems MPI_Allgather() is a fit here. – Gilles Gouaillardet