I need to add parallel computing to some computationally demanding C++ code. I have read that a combination of MPI and OpenMP can achieve what I need: MPI distributes tasks between processors, and OpenMP distributes tasks between threads on individual processors.
I typed lscpu (see below) to check the processor details of my office PC but I am not sure how to interpret it. The key points appear to be the following:
- 12 CPU(s)
- 1 Socket
- 6 Core(s) per socket
- 2 Thread(s) per core
So how do I interpret this in terms of possibilities for parallelization? Specifically, how do MPI and OpenMP correspond to the items in this list? Is MPI used to distribute across the 12 CPUs, and then OpenMP across the 2 threads? But then what about cores and sockets?
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
Stepping: 10
CPU MHz: 4409.872
CPU max MHz: 4700,0000
CPU min MHz: 800,0000
BogoMIPS: 7392.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K
NUMA node0 CPU(s): 0-11
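
For reference, this is how I currently understand the arithmetic behind the figures above, written as a minimal C++ sketch. The struct and function names are my own inventions for illustration and do not come from MPI, OpenMP, or any other library. The only standard call used is std::thread::hardware_concurrency(), which may return 0 if the value cannot be determined.

```cpp
#include <cassert>
#include <thread>

// Topology figures as reported by lscpu. This struct and its field names
// are hypothetical, chosen only for this illustration.
struct CpuTopology {
    unsigned sockets;
    unsigned cores_per_socket;
    unsigned threads_per_core;
};

// Physical cores = sockets * cores per socket.
inline unsigned physical_cores(const CpuTopology& t) {
    assert(t.sockets > 0 && t.cores_per_socket > 0);
    return t.sockets * t.cores_per_socket;
}

// Logical CPUs (the "CPU(s)" line in lscpu) =
// sockets * cores per socket * threads per core.
inline unsigned logical_cpus(const CpuTopology& t) {
    assert(t.threads_per_core > 0);
    return physical_cores(t) * t.threads_per_core;
}

// Number of hardware threads the C++ runtime reports.
// The standard allows this to be 0 when the value is not computable.
inline unsigned runtime_hardware_threads() {
    return std::thread::hardware_concurrency();
}
```

With the values from my machine, CpuTopology{1, 6, 2}, physical_cores gives 6 and logical_cpus gives 12, which matches the "CPU(s): 12" line. I would expect runtime_hardware_threads() to report 12 here as well, but I have not relied on that.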