I'm having trouble running an OpenMPI program using only two nodes (one of the nodes is the same machine that is executing the mpiexec command and the other node is a separate machine).
I'll call the machine that is running mpiexec, master, and the other node slave.
On both master and slave, I've installed OpemMPI in my home directory under ~/mpi
I have a file called ~/machines.txt on master.
Ideally, ~/machines.txt should contain:
master
slave
However, when I run the following on master:
mpiexec -n 2 --hostfile ~/machines.txt hostname
OUTPUT, I get the following error:
bash: orted: command not found
But if ~/maschines.txt only contains the name of the node that the command is running on, it works. ~/machines.txt:
master
Command:
mpiexec -n 2 --hostfile ~/machines.txt hostname
OUTPUT:
master
master
I've tried running the same command on slave, and changed the machines.txt file to contain only slave, and it worked too. I've made sure that my .bashrc file contains the proper paths for OpenMPI.
What am I doing wrong? In short, there is only a problem when I try to execute a program on a remote machine, but I can run mpiexec perfectly fine on the machine that is executing the command. This makes me believe that it's not a path issue. Am I missing a step in connecting both machines? I have passwordless ssh login capability from master to slave.
~/mpi
, then I am guessing you have added~/mpi
to yourPATH
inside.bashrc
or something. Do not assume that.bashrc
is loaded on each machine that MPI is run. – merlin2011