
I'm seeing an MPI_ERR_TRUNCATE error with boost::mpi when performing multiple isend/irecv transfers with the same tag using serialized data. These are not concurrent transfers, i.e. no threading is involved. There is just more than one transfer outstanding at the same time. Here's a short test program that exhibits the failure:

#include <iostream>
#include <string>
#include <vector>
#include <boost/mpi.hpp>
#include <boost/serialization/string.hpp>

static const size_t N = 2;

int main() {
   boost::mpi::environment env;
   boost::mpi::communicator world;

#if 1
   // Serialized types fail.
   typedef std::string DataType;
#define SEND_VALUE "how now brown cow"
#else
   // Native MPI types succeed.
   typedef int DataType;
#define SEND_VALUE 42
#endif

   DataType out(SEND_VALUE);
   std::vector<DataType> in(N);
   std::vector<boost::mpi::request> sends;
   std::vector<boost::mpi::request> recvs;
   sends.reserve(N);
   recvs.reserve(N);

   std::cout << "Multiple transfers with different tags\n";
   sends.clear();
   recvs.clear();
   for (size_t i = 0; i < N; ++i) {
      sends.push_back(world.isend(0, i, out));
      recvs.push_back(world.irecv(0, i, in[i]));
   }
   boost::mpi::wait_all(sends.begin(), sends.end());
   boost::mpi::wait_all(recvs.begin(), recvs.end());

   std::cout << "Multiple transfers with same tags\n";
   sends.clear();
   recvs.clear();
   for (size_t i = 0; i < N; ++i) {
      sends.push_back(world.isend(0, 0, out));
      recvs.push_back(world.irecv(0, 0, in[i]));
   }
   boost::mpi::wait_all(sends.begin(), sends.end());
   boost::mpi::wait_all(recvs.begin(), recvs.end());

   return 0;
}

In this program I first do 2 transfers on different tags, which works fine. Then I attempt 2 transfers on the same tag, which fails with:

libc++abi.dylib: terminating with uncaught exception of type boost::exception_detail::clone_impl >: MPI_Unpack: MPI_ERR_TRUNCATE: message truncated

If I use a native MPI data type so that serialization is not invoked, things seem to work. I get the same error with MacPorts Boost 1.55 and OpenMPI 1.7.3, and with Debian Boost 1.49 and OpenMPI 1.4.5. I also tried multiple transfers with the same tag directly through the MPI C API, and that appeared to work, though of course that only lets me transfer native MPI data types.
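
The raw C API check was roughly along these lines (a minimal self-send sketch; the exact buffers and the lack of error checking are just for illustration):

#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
   MPI_Init(&argc, &argv);

   // Two outstanding transfers to self, both with tag 0, using a native MPI type.
   int out = 42;
   int in[2] = {0, 0};
   MPI_Request reqs[4];

   MPI_Isend(&out, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &reqs[0]);
   MPI_Irecv(&in[0], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &reqs[1]);
   MPI_Isend(&out, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &reqs[2]);
   MPI_Irecv(&in[1], 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &reqs[3]);

   MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
   std::printf("in = %d, %d\n", in[0], in[1]);  // both transfers complete without error

   MPI_Finalize();
   return 0;
}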

My question is: is having multiple outstanding transfers on the same tag a valid operation with boost::mpi, and if so, is the bug in my program or in boost::mpi?

2 Answers

4 votes

As of the current version of Boost, 1.55, boost::mpi does not guarantee that messages are non-overtaking. This is in contrast to the underlying MPI API, which does:

Order: Messages are non-overtaking: If a sender sends two messages in succession to the same destination, and both match the same receive, then this operation cannot receive the second message if the first one is still pending. If a receiver posts two receives in succession, and both match the same message, then the second receive operation cannot be satisfied by this message, if the first one is still pending. This requirement facilitates matching of sends to receives. It guarantees that message-passing code is deterministic, if processes are single-threaded and the wildcard MPI_ANY_SOURCE is not used in receives.

The reason boost::mpi does not guarantee non-overtaking is that serialized data types are transferred as two MPI messages, one for the size and one for the payload, and the irecv for the payload message cannot be posted until the size message has been examined.
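
To make the failure mode concrete, here is a rough raw-MPI sketch of that two-message scheme (the datatypes and message layout are illustrative, not boost::mpi's actual internals). Both logical transfers put a size message and a payload message on tag 0, while the receiver can only post the two size receives up front; the second size receive then matches the first transfer's payload message and MPI reports a truncation:

#include <mpi.h>
#include <string>

int main(int argc, char** argv) {
   MPI_Init(&argc, &argv);

   const std::string payload = "how now brown cow";
   unsigned long size = payload.size();  // stand-in for the serialized "size" message

   // Sender side: two logical transfers, each split into a size message and a
   // payload message, so four MPI messages in all, every one carrying tag 0.
   MPI_Request sends[4];
   MPI_Isend(&size, 1, MPI_UNSIGNED_LONG, 0, 0, MPI_COMM_WORLD, &sends[0]);   // A: size
   MPI_Isend(const_cast<char*>(payload.data()), (int)payload.size(), MPI_CHAR,
             0, 0, MPI_COMM_WORLD, &sends[1]);                                // A: payload
   MPI_Isend(&size, 1, MPI_UNSIGNED_LONG, 0, 0, MPI_COMM_WORLD, &sends[2]);   // B: size
   MPI_Isend(const_cast<char*>(payload.data()), (int)payload.size(), MPI_CHAR,
             0, 0, MPI_COMM_WORLD, &sends[3]);                                // B: payload

   // Receiver side: the payload receives cannot be posted until the sizes are
   // known, so only the two small size receives are outstanding.
   unsigned long sizeA = 0, sizeB = 0;
   MPI_Request recvs[2];
   MPI_Irecv(&sizeA, 1, MPI_UNSIGNED_LONG, 0, 0, MPI_COMM_WORLD, &recvs[0]);
   MPI_Irecv(&sizeB, 1, MPI_UNSIGNED_LONG, 0, 0, MPI_COMM_WORLD, &recvs[1]);

   // Messages on tag 0 arrive in order: A-size, A-payload, B-size, B-payload.
   // recvs[0] matches A-size, but recvs[1] matches A-payload, delivering the
   // 17-byte string into a buffer sized for one unsigned long. This wait is
   // expected to fail with MPI_ERR_TRUNCATE, which is the point of the sketch.
   MPI_Waitall(2, recvs, MPI_STATUSES_IGNORE);

   MPI_Waitall(4, sends, MPI_STATUSES_IGNORE);
   MPI_Finalize();
   return 0;
}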

A proposal to guarantee non-overtaking in boost::mpi is being considered. Further discussion can be found on the boost::mpi mailing list beginning here.

0 votes

The problem could be that you're waiting for all of your sends to complete and only then for all of your receives. MPI expects your sends and receives to match up in time as well as in number: you can't complete all of your send calls without the matching receive calls also making progress.

The way MPI usually handles sending a message is that when you call send, the call returns as soon as the message has been handled by the library. That could mean the message has been copied to an internal buffer, or that it was actually transferred to the remote process and received. Either way, the message has to go somewhere. If you don't have a receive buffer already posted, the message has to be buffered internally. Eventually the implementation runs out of those buffers and starts to do bad things (like return errors to the user), which is probably what you're seeing here.

The solution is to pre-post your receive buffers. In your case, you can just push all of your isend and irecv requests into the same vector and let MPI handle everything. That gives MPI access to all of the receive buffers, so your messages have somewhere to go.
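
Applied to the same-tag case from the question, that restructuring would look something like the following (a minimal sketch of the suggested change only, reusing the question's setup):

#include <string>
#include <vector>
#include <boost/mpi.hpp>
#include <boost/serialization/string.hpp>

static const size_t N = 2;

int main() {
   boost::mpi::environment env;
   boost::mpi::communicator world;

   std::string out("how now brown cow");
   std::vector<std::string> in(N);

   // Every request, send and receive alike, goes into one container, with the
   // receives posted up front, and they are all completed in a single wait_all.
   std::vector<boost::mpi::request> reqs;
   reqs.reserve(2 * N);
   for (size_t i = 0; i < N; ++i) {
      reqs.push_back(world.irecv(0, 0, in[i]));
      reqs.push_back(world.isend(0, 0, out));
   }
   boost::mpi::wait_all(reqs.begin(), reqs.end());

   return 0;
}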