How to transmit vectors of different types in MPI using one broadcast

Question

I have some vectors that I want to transmit between processes using MPI in C++. I have been using broadcast to exchange the data one vector at a time, but it doesn't seem to scale up well with the number of processes. I was told that this is because I am using too many broadcast calls and it causes too much lag as the number of processes increases. Therefore, it was recommended to me to pack all the data in one structure, and send them all in one go.

My problem is that my vectors are of type int and double (2 vectors of each type), and their length is not known at compile time (it's equal to the number of processes). Therefore, I cannot create a struct to hold all the data and then broadcast it using a custom MPI structure (MPI_Type_create_struct) as described here, because that requires hardcoding the sizes of the C arrays at compile time. I know how to create the custom MPI structure, but I can't figure out how to pack the data into a struct in order to send it.

So, to sum up, is there a way to broadcast 4 vectors of different types (2 int and 2 double) between processes using only one broadcast?

MPI structured datatypes are generic and don't really need to correspond to C/C++ structs. — Hristo Iliev
That's true, but then how can I use that to actually send the data? As an example, I could use : MPI_Send(&custom_object_to_send, 1, mpi_custom_type, dest, tag, MPI_COMM_WORLD); but I need to create custom_object_to_send and assign values to it so that I can send it, and, as far as I can tell, this needs to be buildable in the language. — Nikos Kazazakis
No, you don't. You simply create an MPI structure type and specify the absolute address of the beginning of each array/vector as the offset, then specify MPI_BOTTOM as the data buffer address in the call to MPI_Send. Just use your favourite search engine to search for "MPI_Type_create_struct" "MPI_BOTTOM". — Hristo Iliev

Torbjörn Torbjörn · Accepted Answer · 2016-07-25T19:25:21

Note: Numbers such as "4.1.2" refer to sections of the MPI standard v3.0.

This should be possible with MPI_Type_create_struct (4.1.2) and without an actual C-struct.

Use MPI_Type_vector (4.1.2) to create two data types: one for a vector of int and one for a vector of double, each describing the array you get from vector<>::data(). You can assume that vector<> stores its data contiguously.
You can pass the sizes to MPI_Type_vector at runtime.

Now use MPI_Type_create_struct to combine two instances of each of the two vector types into a single datatype, which you then can easily use for communication.

With MPI_Get_address (4.1.5) you can get the MPI-friendly address of a variable at runtime. You'll probably need that when constructing the derived datatype for the virtual struct. (Read the advice to users in the MPI standard on why you should not use & for this.)

For more flexibility, omit the intermediate vector data types: compute the addresses and lengths of the vector<>::data() arrays yourself and feed them directly into MPI_Type_create_struct.

Disclaimer: IIRC, I used this approach successfully some time ago to send multiple non-contiguous blocks of contiguous data in single messages.
