6
votes

I want to bind the threads in my code to each physical core. With GCC I have successfully done this using sched_setaffinity, so I no longer have to set export OMP_PROC_BIND=true. I want to do the same thing in Windows with MSVC. Windows and Linux use different thread topologies: Linux scatters the threads while Windows uses a compact form. In other words, in Linux with four cores and eight hyper-threads I only need to bind the threads to the first four processing units, while in Windows I bind them to every other processing unit.

I have successfully done this using SetProcessAffinityMask. When I right-click the process in Windows Task Manager and click "Set Affinity", I can see that every other CPU is set (0, 2, 4, 6 on my system with eight hyper-threads). The problem is that the efficiency of my code is unstable from run to run. Sometimes it's nearly constant, but most of the time it varies widely. I changed the priority to high but it makes no difference. In Linux the efficiency is stable. Maybe Windows is still migrating the threads? Is there something else I need to do to bind the threads in Windows?

Here is the code I'm using

#ifdef _WIN32   
HANDLE process;
DWORD_PTR processAffinityMask = 0;
//Windows uses a compact thread topology.  Set mask to every other thread
for(int i=0; i<ncores; i++) processAffinityMask |= (DWORD_PTR)1<<(2*i);
//processAffinityMask = 0x55;
process = GetCurrentProcess();
SetProcessAffinityMask(process, processAffinityMask);
#else
cpu_set_t  mask;
CPU_ZERO(&mask);
for(int i=0; i<ncores; i++) CPU_SET(i, &mask);      
sched_setaffinity(0, sizeof(mask), &mask);       
#endif

Edit: here is the code I use now, which seems to be stable on both Linux and Windows

    #ifdef _WIN32   
    HANDLE process;
    DWORD_PTR processAffinityMask = 0;  //must start at zero before bits are OR-ed in
    //Windows uses a compact thread topology.  Set mask to every other thread
    for(int i=0; i<ncores; i++) processAffinityMask |= (DWORD_PTR)1<<(2*i);
    process = GetCurrentProcess();
    SetProcessAffinityMask(process, processAffinityMask);
    #pragma omp parallel 
    {
        HANDLE thread = GetCurrentThread();
        DWORD_PTR threadAffinityMask = (DWORD_PTR)1<<(2*omp_get_thread_num());
        SetThreadAffinityMask(thread, threadAffinityMask);
    }
    #else
    cpu_set_t  mask;
    CPU_ZERO(&mask);
    for(int i=0; i<ncores; i++) CPU_SET(i, &mask);
    sched_setaffinity(0, sizeof(mask), &mask);
    #pragma omp parallel 
    {
       cpu_set_t  mask;
       CPU_ZERO(&mask);
       CPU_SET(omp_get_thread_num(),&mask);
       pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask); 
    }
    #endif
1
On both platforms your code sets the process affinity mask, not the affinity mask of each individual thread, so the scheduler is still free to move the threads among the CPUs allowed by the process affinity mask. - Hristo Iliev
@HristoIliev, I understand what you mean. Do you know how to get the thread handle for each thread created by OpenMP? - Z boson
Inside the parallel region use GetCurrentThread() to obtain the handle of the current thread and assign it an affinity mask with a single bit set based on the result from omp_get_thread_num(). - Hristo Iliev
@HristoIliev, you mean inside a parallel section? I tried that and it returns the same handle for each thread. I'll keep trying... - Z boson
I mean to call it from inside a parallel region so that all OpenMP threads make the call. It should not return the same thread handle in that case unless you are assigning to a shared variable. - Hristo Iliev

1 Answer

1
votes

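You should use the SetThreadAffinityMask function (see MSDN reference). Your code sets the process's affinity mask, which only limits the set of CPUs the threads may run on; the scheduler can still move each thread among them.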
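
As a rough illustration of the difference between the two masks, here is a minimal sketch of how they could be computed. The helper names and the `stride` parameter are made up for this example and are not part of any API. It assumes that logical processors are numbered so that the hyper-threads of one core are adjacent (`stride` = 2 on the machine described in the question), which is not guaranteed on every system.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: affinity mask with a single bit set for OpenMP
 * thread `tid`. `stride` is the assumed number of logical processors per
 * physical core (2 with hyper-threading and compact numbering). */
uint64_t thread_affinity_mask(int tid, int stride)
{
    assert(tid >= 0 && stride > 0);
    int bit = tid * stride;
    /* Shift a 64-bit one, not an int, so bits above 31 are representable.
     * Return 0 if the processor does not fit into a 64-bit mask. */
    return bit < 64 ? (uint64_t)1 << bit : 0;
}

/* Hypothetical helper: union of the per-thread masks for `ncores` threads. */
uint64_t process_affinity_mask(int ncores, int stride)
{
    uint64_t mask = 0;
    for (int i = 0; i < ncores; i++)
        mask |= thread_affinity_mask(i, stride);
    return mask;
}
```

With `ncores` = 4 and `stride` = 2 the process mask is 0x55, the value that is commented out in the question. On Windows, each OpenMP thread would then cast the result of `thread_affinity_mask(omp_get_thread_num(), 2)` to `DWORD_PTR` and pass it to `SetThreadAffinityMask(GetCurrentThread(), ...)` from inside the parallel region. That call returns 0 on failure, so the return value is worth checking.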

You can obtain a thread ID in OpenMP with this code:

int tid = omp_get_thread_num();

However, the code above provides OpenMP's internal thread ID, not the system thread ID. This article explains more on the subject:

http://msdn.microsoft.com/en-us/magazine/cc163717.aspx
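
To see the two kinds of ID side by side, something like the following sketch may help. The function names are invented for this example. It assumes Windows (`GetCurrentThreadId`) or Linux (the `gettid` system call) and would need adapting elsewhere. The `_OPENMP` guards only let it compile when OpenMP is switched off.

```c
#ifndef _WIN32
#define _GNU_SOURCE            /* for syscall() on Linux */
#endif
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#include <sys/syscall.h>
#endif

/* Hypothetical helper: ID the operating system assigns to the calling thread. */
static unsigned long system_thread_id(void)
{
#ifdef _WIN32
    return (unsigned long)GetCurrentThreadId();
#else
    return (unsigned long)syscall(SYS_gettid);   /* Linux-specific */
#endif
}

/* Hypothetical helper: OpenMP's own numbering (0 .. team size - 1). */
static int openmp_thread_id(void)
{
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;                                    /* serial build: one thread */
#endif
}

/* Each thread in the team prints both IDs. */
void print_thread_ids(void)
{
    #pragma omp parallel
    {
        printf("OpenMP id %d -> system thread id %lu\n",
               openmp_thread_id(), system_thread_id());
    }
}
```

The OpenMP IDs are small consecutive integers, which is why they are convenient for choosing a bit in an affinity mask, while the system IDs are arbitrary values chosen by the OS.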

If you need to work with those threads explicitly, use the explicit affinity type as explained in this Intel documentation:

https://software.intel.com/sites/products/documentation/studio/composer/en-us/2011Update/compiler_c/optaps/common/optaps_openmp_thread_affinity.htm