I'm running mpi4py 2.0.0 built against OpenMPI 1.10.1 on an Ubuntu 14.04.3 system with Python 2.7.10. For some reason, attempting to send messages larger than 64 KiB (65536 bytes) causes the send/recv to hang; however, I can successfully send large messages on other Ubuntu 14 systems using the exact same software and OpenMPI/mpi4py packages. I can also send large messages from C programs that use OpenMPI. This suggests that something in the environment is adversely affecting the MPI communication performed by mpi4py. Any ideas as to what could be interfering with mpi4py?
Here is an example of the code that works on one system and hangs on the other when N is set to 65537 or greater.
import os
import sys

from mpi4py import MPI
import numpy as np

N = 65537


def worker():
    comm = MPI.Comm.Get_parent()
    size = comm.Get_size()
    rank = comm.Get_rank()
    buf = np.empty(N, np.byte)
    comm.Recv(buf=buf)


if __name__ == '__main__':
    script_file_name = os.path.basename(__file__)
    if MPI.Comm.Get_parent() != MPI.COMM_NULL:
        worker()
    else:
        comm = MPI.COMM_SELF.Spawn(sys.executable,
                                   args=[script_file_name],
                                   maxprocs=1)
        comm.Send(np.random.randint(0, 256, N).astype(np.byte), 0)
I also tried replacing the pickled send/recv with a non-pickled Send/Recv using explicitly specified fixed-length buffers, but that didn't have any effect on the problem.
Curiously, the problem doesn't appear to affect transmissions between peer processes using the same communicator.
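Since the same packages behave differently on the two machines, one way to look for the environmental difference would be to dump the MPI-related settings on each machine and diff the results. The following is only a minimal sketch, assuming the relevant settings are visible as environment variables: Open MPI reads MCA parameters from variables named `OMPI_MCA_<param>`, and the other variable names are included on the assumption that they affect which libraries and executables get picked up. The helper names `collect_mpi_env` and `format_env` are hypothetical.

```python
import os

# Assumption: Open MPI settings of interest are exported with this prefix
# (MCA parameters are read from variables named OMPI_MCA_<param>).
OMPI_PREFIX = "OMPI_"

# Assumption: these variables may influence which MPI library, Python
# modules, and executables are found on each machine.
EXACT_NAMES = {"PATH", "LD_LIBRARY_PATH", "PYTHONPATH"}


def collect_mpi_env(environ=None):
    """Return the variables that start with OMPI_PREFIX or are in EXACT_NAMES."""
    if environ is None:
        environ = os.environ
    return {name: value for name, value in environ.items()
            if name.startswith(OMPI_PREFIX) or name in EXACT_NAMES}


def format_env(env):
    """Render one NAME=value pair per line, sorted by name, for easy diffing."""
    return "\n".join("{}={}".format(name, env[name]) for name in sorted(env))


if __name__ == "__main__":
    print(format_env(collect_mpi_env()))
```

Running this on both the working and the hanging machine and diffing the two outputs would show whether, for example, an `OMPI_MCA_*` override is set on only one of them. Settings that come from files such as `$HOME/.openmpi/mca-params.conf` are not covered by this sketch and would need to be compared separately.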