First of all, the BSUB sentinels have to be preceded by a single # sign, otherwise they are skipped over as regular comments.
The correct way to start a hybrid job with older LSF versions is to pass the span resource request and to request nodes exclusively. To start a job with 8 MPI processes and 4 OpenMP threads each, you should use the following:
#BSUB -n 8
#BSUB -x
#BSUB -R "span[ptile=2]"
The parameters are as follows:
-n 8
- requests 8 slots for MPI processes
-x
- requests nodes exclusively
-R "span[ptile=2]"
- instructs LSF to place two of the job's slots (MPI processes) on each node
You should request nodes exclusively, otherwise LSF will schedule other jobs to the same nodes since only two slots per node will be used.
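As a quick sanity check of the layout these directives produce, the arithmetic can be sketched in shell. This is only an illustration: the variable names are made up here, and the figure of 4 threads per process assumes nodes with two sockets of 4 cores each, which may not match your cluster.

```shell
# Values taken from the #BSUB directives above.
slots=8      # -n 8: total number of MPI processes
ptile=2      # span[ptile=2]: MPI processes placed on each node
threads=4    # OpenMP threads per MPI process (assumption: 4 cores per socket)

nodes=$(( slots / ptile ))              # nodes the job spans
cores_per_node=$(( ptile * threads ))   # cores kept busy on each node
total_cores=$(( slots * threads ))      # cores used by the whole job

echo "nodes=$nodes cores_per_node=$cores_per_node total_cores=$total_cores"
# prints: nodes=4 cores_per_node=8 total_cores=32
```

LSF only accounts for the two requested slots per node, although the job keeps 8 cores busy on each of them, which is why the exclusive request is needed.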
Then you have to set the OMP_NUM_THREADS environment variable to 4 (the number of cores per socket), tell the MPI library to pass the variable to the MPI processes, and make the library limit each MPI process to its own CPU socket. This is unfortunately very implementation-specific, e.g.:
Open MPI 1.6.x or older:
export OMP_NUM_THREADS=4
mpiexec -x OMP_NUM_THREADS --bind-to-socket --bysocket ./program.exe
Open MPI 1.7.x or newer:
export OMP_NUM_THREADS=4
mpiexec -x OMP_NUM_THREADS --bind-to socket --map-by socket ./program.exe
Intel MPI (not sure about this one as I don't use IMPI very often):
mpiexec -genv OMP_NUM_THREADS 4 -genv I_MPI_PIN 1 \
-genv I_MPI_PIN_DOMAIN socket -genv I_MPI_PIN_ORDER scatter \
./program.exe
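Putting the pieces together, a complete job script might look like the sketch below. It uses the Open MPI 1.7.x or newer launch line; swap in the variant that matches your MPI library. The `launch` variable and the dry-run fallback are additions for illustration only, so the script can be checked on a machine without `mpiexec`.

```shell
#!/bin/bash
#BSUB -n 8
#BSUB -x
#BSUB -R "span[ptile=2]"

# One OpenMP thread per core of a socket (assumption: 4 cores per socket).
export OMP_NUM_THREADS=4

# Launch line for Open MPI 1.7.x or newer: export the variable to the ranks,
# bind each rank to a socket and distribute ranks across sockets.
launch="mpiexec -x OMP_NUM_THREADS --bind-to socket --map-by socket ./program.exe"

if command -v mpiexec >/dev/null 2>&1 && [ -x ./program.exe ]; then
    $launch
else
    # No MPI launcher or no binary here: only show what would be run.
    echo "would run: $launch"
fi
```

Saved as, say, job.sh (a hypothetical name), it would be submitted with `bsub < job.sh`, so that LSF reads the #BSUB lines from the script.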