MPI_Scatter structs in C

Question

Is there a clean way to use MPI_Scatter to scatter structs that doesn't involve pack or unpack?

Let's say I have a struct like this:

struct Foo
{
   int a;
   int* b;
   int* c;
};

Here b and c are "arrays" of integers, which can be allocated as follows:

 struct Foo f1;
 f1.a = 0;
 f1.b = malloc(sizeof(int) * 10);
 f1.c = malloc(sizeof(int) * 15);

And I have an array of Foo instances, where each instance has a different size for b and c.

I can define a new MPI type for each of these instances and use MPI_Send to send them, but obviously that's not very smart.

So my question is, does MPI have any built-in support for this?

Yes, it appears that both MPI and SO have support for this. See stackoverflow.com/questions/9864510/… — Craig Estey

Gilles · Accepted Answer · 2015-10-21T05:03:02

Unfortunately, there is no simple way of transmitting your data through MPI, although there are less simple ways.

The heart of the problem here is that the data you want to transmit, namely the structure containing data and pointers to other data, isn't self-contained: the pointers inside the structure merely reference part of the data you want to transfer, they don't contain it. Therefore, simply creating an MPI structured type with MPI_Type_create_struct() wouldn't let you transfer all the data your structure logically contains, only the data it physically contains.
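To make the point concrete, here is a minimal sketch in plain C (no MPI involved). The helper name shallow_copy_shares_b is made up for this illustration. It copies only the bytes stored inside a Foo, which is roughly what a byte-for-byte transfer of the struct would move, and reports whether the copy's b pointer still refers to the source's heap block.

```c
#include <assert.h>
#include <string.h>

typedef struct Foo {
    int a;
    int *b;
    int *c;
} Foo;

/* Hypothetical helper: copy only the bytes that live inside the struct.
   Returns 1 if dst->b still points at the source's array, meaning that
   no array element was copied, only the pointer value. */
int shallow_copy_shares_b(const Foo *src, Foo *dst) {
    assert(src != NULL && dst != NULL);
    memcpy(dst, src, sizeof(Foo));
    return dst->b == src->b;
}
```

Inside one process the copied pointer is still usable. On another rank the same address would refer to nothing meaningful, which is why the pointed-to arrays have to be sent separately.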

However, you can still do the trick in a few MPI communications, which you can wrap in a function for your convenience. But for it to be workable, you'll have to make sure of a few things:

Try to avoid calling malloc all over the place; the fewer allocations, the better. Instead, if the context of your code allows it, allocate one single large array holding all the b and/or c fields of your struct Foo, and make each b (and c) pointer point to its share of this large array. This is both simpler from an MPI communication perspective and better performance-wise for your code.
Create a hollow MPI structured type that contains only the non-dynamic data of the structure, like a in your case.
Then transmit your data in two phases: first the structures, then the various large arrays containing the individual dynamic fields of the structures.
Finally, on the receiving end, adjust (if needed) the pointers to point to the newly received data.
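The first and last points above can be sketched as a small helper in plain C. The name foo_attach_storage and its parameters are assumptions made for this illustration, not part of MPI. It allocates one buffer that holds every b array followed by every c array, then points each struct at its slice using running offsets.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Foo {
    int a;
    int *b;
    int *c;
} Foo;

/* Hypothetical helper: allocate one flat buffer for all b arrays followed
   by all c arrays, and point array[i].b / array[i].c at their slices.
   Assumes the total length is positive. Stores the total number of ints
   in *total and returns the buffer (the caller frees it), or NULL if the
   allocation fails. */
int *foo_attach_storage(Foo *array, int len, const int *lenB,
                        const int *lenC, int *total) {
    assert(array != NULL && lenB != NULL && lenC != NULL && total != NULL);
    assert(len > 0);

    int allB = 0, allC = 0;
    for (int i = 0; i < len; i++) {
        allB += lenB[i];
        allC += lenC[i];
    }

    int *storage = malloc((size_t)(allB + allC) * sizeof(int));
    if (storage == NULL) {
        return NULL;
    }

    /* Running offsets: the b slices start at 0, the c slices after all b's */
    int offB = 0, offC = allB;
    for (int i = 0; i < len; i++) {
        array[i].b = storage + offB;
        array[i].c = storage + offC;
        offB += lenB[i];
        offC += lenC[i];
    }

    *total = allB + allC;
    return storage;
}
```

Calling the same helper with the same lengths on the sender and on the receiver gives both sides an identical layout, so the whole buffer can travel as a single message of *total elements of type MPI_INT, and the receiver's pointers are already valid for its own buffer.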

Here is a full example of how this would work:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Foo {
    int a;
    int *b;
    int *c;
} Foo;

int main( int argc, char *argv[] ) {
    MPI_Init( &argc, &argv );
    int rank, size;
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    MPI_Comm_size( MPI_COMM_WORLD, &size );

    int len = 3;
    Foo *array = malloc( len * sizeof( Foo ) );
    // let's assume for simplicity that each process already knows the sizes of the individual arrays (it would need to be transmitted otherwise)
    int lenB[] = { 1, 2, 3 };
    int lenC[] = { 5, 6, 7 };
    // now we create the data for the arrays
    int lenAllBs = 0, lenAllCs = 0;
    for ( int i = 0; i < len; i++ ) {
        lenAllBs += lenB[i];
        lenAllCs += lenC[i];
    }
    int *BandC = malloc( ( lenAllBs + lenAllCs ) * sizeof( int ) );
    // And we adjust the pointers
    array[0].b = BandC;
    array[0].c = BandC + lenAllBs;
    for ( int i = 1; i < len; i++ ) {
        // each slice starts right after the previous one, so advance by the previous length
        array[i].b = array[i-1].b + lenB[i-1];
        array[i].c = array[i-1].c + lenC[i-1];
    }

    // Now we create the MPI structured type for Foo. Here a resized MPI_INT will suffice, since a is the only non-pointer member
    MPI_Datatype mpiFoo;
    MPI_Type_create_resized( MPI_INT, 0, sizeof( Foo ), &mpiFoo );
    MPI_Type_commit( &mpiFoo );

    // Ok, the preparation phase was long, but here comes the actual transfer
    if ( rank == 0 ) {
        // Only rank 0 has some meaningful data
        for ( int i = 0; i < len; i++ ) {
            array[i].a = i;
            for ( int j = 0; j < lenB[i]; j++ ) {
                array[i].b[j] = 10 * i + j;
            }
            for ( int j = 0; j < lenC[i]; j++ ) {
                array[i].c[j] = 100 * i + j;
            }
        }
        // Sending it to rank size-1
        // First the structure shells
        MPI_Send( array, len, mpiFoo, size - 1, 0, MPI_COMM_WORLD );
        // Then the pointed data
        MPI_Send( BandC, lenAllBs + lenAllCs, MPI_INT, size - 1, 0, MPI_COMM_WORLD );
    }
    if ( rank == size - 1 ) {
        // Receiving from 0
        // First the structure shells
        MPI_Recv( array, len, mpiFoo, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
        // Then the actual data
        MPI_Recv( BandC, lenAllBs + lenAllCs, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
        // And printing some
        printf( "array[1].a = %d, array[2].b[1] = %d, array[0].c[4]=%d\n", array[1].a, array[2].b[1], array[0].c[4] );
    }

    MPI_Type_free( &mpiFoo );
    free( BandC );
    free( array );

    MPI_Finalize();
    return 0;
}

Compiled with mpicc -std=c99 dyn_struct.c -o dyn_struct, it gives me:

$ mpirun -n 2 ./dyn_struct
array[1].a = 1, array[2].b[1] = 21, array[0].c[4]=4

So as you can see, it's doable, and not too complex once the structure has been created properly. And if the individual sizes of the member arrays aren't known before the transmission, they will have to be transmitted before the actual data, and the receiving buffers and structures will have to be set up accordingly prior to receiving.
