I am looking for an MPI function/method that delivers multiple data blocks from one process to all others. Similar to MPI_Bcast, but with multiple blocks at the same time?
I have a fragmented data block on the root rank:
#define BLOCKS 5
#define BLOCKSIZE 10000
char *datablock[BLOCKS];
int i;
for (i=0; i<BLOCKS; i++) datablock[i] = (char*)malloc(BLOCKSIZE*sizeof(char));
This is just an example, but it should be clear that the BLOCKS are not necessarily adjacent. I want this data delivered to all other ranks (where I have already prepared the necessary memory to store it).
I noticed that there are methods like MPI_Gatherv and MPI_Scatterv which gather or scatter fragmented data using a displacement array, but the problem is that scatter sends each fragment to a different rank. I need to send all fragments to all other ranks, something like an MPI_Bcast with displacement information, i.e. an MPI_Bcastv.
One solution would be multiple MPI_Bcast calls (one for each block), but I am not sure if this is the best way to do it.
UPDATE: I will try the MPI_Ibcast method; here is what I think should work:
int rank; // rank id
int i;
int blocksize = 10000;
int blocknum = 200;
char **datablock = NULL;
char *recvblock = NULL;
MPI_Request *request;
request = (MPI_Request *)malloc(blocknum*sizeof(MPI_Request));
if(rank == 0) {
    // this is just an example; in practice those blocks are created on the fly as soon as the last block is filled
    datablock = (char**)malloc(blocknum*sizeof(char*));
    for (i=0; i<blocknum; i++) datablock[i] = (char*)malloc(blocksize*sizeof(char));
    for (i=0; i<blocknum; i++)
        MPI_Ibcast(datablock[i], blocksize, MPI_CHAR, 0, MPI_COMM_WORLD, &request[i]);
} else {
    // for this example the other ranks already know how many blocks rank 0 has created; in practice this information is broadcast via MPI before the MPI_Ibcast calls
    recvblock = (char*)malloc(blocksize*blocknum*sizeof(char));
    for (i=0; i<blocknum; i++)
        MPI_Ibcast(recvblock+i*blocksize, blocksize, MPI_CHAR, 0, MPI_COMM_WORLD, &request[i]);
}
MPI_Waitall(blocknum, request, MPI_STATUSES_IGNORE);
So, I have added an MPI_Waitall at the end, but I am not sure I am using it correctly: it needs a count, an array of requests, and an array of statuses!?
The reason I have a different MPI_Ibcast call for the root and the other ranks is that the send buffer is not identical to the receive buffer.
Another question: does each MPI_Ibcast in the for loops need its own request, or can I reuse a single MPI_Request variable as I have done in the example above?
UPDATE2: I have updated the example; I now use an MPI_Request pointer, which I initialize with a malloc call right after the definition. This seems pretty odd, I guess, but it is just an example, and in practice the number of requests needed is only known at runtime. I am especially worried whether I can use sizeof(MPI_Request) here, or whether that is problematic because MPI_Request is not a standard data type?
Apart from that, is the example correct? Is it a good solution if I want to use MPI_Ibcast?