How to deploy an MPI program?

Question

MPI requires that I deploy my MPI program to each machine. Currently I keep the program on NFS, but this approach has two issues: NFS has latency problems, and NFS is not suitable for a large cluster. I know I could use some Linux shell commands to sync my program to each node, but that does not seem very convenient, especially when I change the program frequently. Is there an easier way to do that?

If you need the program to run from each node's local filesystem, then you have to copy it to each node. Otherwise you can run the program and put an MPI_Barrier at the start so that all nodes sync up before they start doing their work. — rath
How did you reach the conclusion that "NFS is not suitable for large cluster"? We run a cluster of almost 1500 nodes and its main filesystem comes from an NFS-shared NetApp filer. — Hristo Iliev
@rath MPI_Init "is a" barrier at the beginning of the program. All ranks must enter MPI_Init before any ranks can leave. Depending on the implementation, exactly how far into MPI_Init ranks must get will vary. Once a rank leaves MPI_Init it can "immediately" begin using the full MPI infrastructure including MPI_COMM_WORLD, and collectives. Depending on the state of the other ranks, there may be some synchronization issues. — Stan Graves

bazza · Accepted Answer · 2013-12-06T06:17:33

There's nothing wrong with NFS or any other network file system in large clusters. If it is struggling, that just means your file server isn't sized for the job. If you replace NFS with anything like ssh, ftp, scripts, or whatever and change nothing else, I don't think that'll make any significant difference. Also, if the loading time is a significant and bothersome component of the overall runtime, then why use MPI in the first place?

OK, enough of playing devil's advocate. One thing you can do is to have nodes load your program onto other nodes in a binary tree type arrangement. You'll need a script that copies the executable to two other nodes along with a copy of the script, starts that script running asynchronously on those nodes, and then runs the executable locally. The result would be a chain reaction of copying and running spreading across the network. The only difficult bit is in choosing which nodes to copy to so that each one is visited just once. It will be a lot faster.
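
One way to handle the "visited just once" part is to number the nodes 0..N-1 and treat them as an implicit binary tree, where node i copies to nodes 2i+1 and 2i+2. The Python below is only a sketch of that idea, not a tested deployment tool. The host names, the `/tmp` destination, and the helper names `tree_children` and `plan_commands` are all made up for illustration, and it only builds the `scp`/`ssh` command lists rather than running them. It also leaves out how each node learns the host list, which the real script would need.

```python
import os


def tree_children(index, node_count):
    """Indices of the (at most two) nodes that node `index` copies to.

    Nodes are numbered 0..node_count-1 and laid out as an implicit binary
    tree, so every node except node 0 has exactly one parent.
    """
    return [c for c in (2 * index + 1, 2 * index + 2) if c < node_count]


def plan_commands(hosts, index, executable, script, dest_dir="/tmp"):
    """Build the commands node `index` would run; nothing is executed here.

    `dest_dir` is an assumed location on each node's local filesystem.
    """
    commands = []
    for child in tree_children(index, len(hosts)):
        target = hosts[child]
        # Copy the executable and this script to the child node.
        commands.append(["scp", executable, script, f"{target}:{dest_dir}/"])
        # Start the script on the child in the background (ssh -f), passing
        # the child's own index so it can work out its own children.
        remote_script = f"{dest_dir}/{os.path.basename(script)}"
        commands.append(["ssh", "-f", target, "python3", remote_script, str(child)])
    return commands


if __name__ == "__main__":
    hosts = [f"node{i:02d}" for i in range(7)]  # hypothetical host names
    for i, host in enumerate(hosts):
        print(host, "->", [hosts[c] for c in tree_children(i, len(hosts))])
```

With this numbering the copy reaches all N nodes in roughly log2(N) rounds, and each command list could be handed to `subprocess.run` once the paths and the ssh setup match your cluster.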
