I am very new to LSF. I have 4 nodes with 2 sockets per node, and each node has 8 cores. I have developed a hybrid MPI+OpenMP code. I currently submit the job as follows, which makes each core run one MPI task, so I lose the benefit of OpenMP.

##BSUB -n 64

I want to submit the job so that each socket, rather than each core, runs one MPI task, leaving the cores within the socket free for OpenMP threads. How can I write the job submission script to take full advantage of the hybrid parallelism in my code?

1 Answer

First of all, the BSUB sentinels have to be preceded by a single # sign; otherwise they are skipped as regular comments.

The correct way to start a hybrid job with older LSF versions is to pass the span resource request and request nodes exclusively. To start a job with 8 MPI processes (one per socket) and 4 OpenMP threads each, you should use the following:

#BSUB -n 8
#BSUB -x
#BSUB -R "span[ptile=2]"

The parameters are as follows:

  • -n 8 - requests 8 slots for MPI processes
  • -x - requests nodes exclusively
  • -R "span[ptile=2]" - instructs LSF to span the job over two slots per node

You should request nodes exclusively because the job occupies only two slots per node; otherwise LSF would schedule other jobs onto the remaining slots of the same nodes.
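The numbers in this request follow from the hardware described in the question (4 nodes, 2 sockets per node, 8 cores per node). A minimal sketch of the arithmetic, with variable names chosen here purely for illustration:

```shell
#!/bin/sh
# Hardware layout taken from the question; adjust for your own cluster.
NODES=4
SOCKETS_PER_NODE=2
CORES_PER_NODE=8

# One MPI process per socket.
MPI_RANKS=$((NODES * SOCKETS_PER_NODE))                  # value for "#BSUB -n"
RANKS_PER_NODE=$SOCKETS_PER_NODE                         # value for "span[ptile=...]"
THREADS_PER_RANK=$((CORES_PER_NODE / SOCKETS_PER_NODE))  # OpenMP threads per MPI process

echo "-n $MPI_RANKS, ptile=$RANKS_PER_NODE, threads per process=$THREADS_PER_RANK"
```

For the layout above this gives 8 MPI processes, 2 per node, with 4 threads each.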

Then you have to set the OMP_NUM_THREADS environment variable to 4 (the number of cores per socket), tell the MPI library to pass the variable to the MPI processes, and make the library limit each MPI process to its own CPU socket. This is unfortunately very implementation-specific, e.g.:

Open MPI 1.6.x or older:

export OMP_NUM_THREADS=4
mpiexec -x OMP_NUM_THREADS --bind-to-socket --bysocket ./program.exe

Open MPI 1.7.x or newer:

export OMP_NUM_THREADS=4
mpiexec -x OMP_NUM_THREADS --bind-to socket --map-by socket ./program.exe

Intel MPI (not sure about this one as I don't use IMPI very often):

mpiexec -genv OMP_NUM_THREADS 4 -genv I_MPI_PIN 1 \
        -genv I_MPI_PIN_DOMAIN socket -genv I_MPI_PIN_ORDER scatter \
        ./program.exe
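
Putting the pieces together, a complete job script for Open MPI 1.7.x or newer might look like the sketch below. It assumes the node layout from the question and that mpiexec picks up the host list from LSF; the file name job.sh used afterwards is only a placeholder.

```shell
#!/bin/sh
#BSUB -n 8
#BSUB -x
#BSUB -R "span[ptile=2]"

# 4 cores per socket on the nodes described in the question
export OMP_NUM_THREADS=4

# One MPI process per socket; each process is bound to its own socket
mpiexec -x OMP_NUM_THREADS --bind-to socket --map-by socket ./program.exe
```

The script is submitted with `bsub < job.sh`, so that LSF reads the #BSUB lines from standard input.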