I have written a Python script that calls a C function parallelized with OpenMP (arguments are passed from Python to the C function via a ctypes wrapper). The C function runs correctly and produces the desired output, but I get a segmentation fault at the end of the Python script. I suspect it has something to do with the threads spawned by OpenMP, since the segfault does not occur when OpenMP is disabled.
On the Python side of the code (which calls the external C function) I have:
...
C_Func = ctypes.cdll.LoadLibrary('./Cinterface.so')
C_Func.Receive_Parameters.argtypes = (...list of ctypes variable types...)
C_Func.Receive_Parameters.restype = None   # the C function returns void
C_Func.Perform_Calculation.argtypes = ()
C_Func.Perform_Calculation.restype = None  # the C function returns void
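(For context, here is the same `argtypes`/`restype` declaration pattern in a self-contained, runnable form. Since `Cinterface.so` is specific to my setup, this stand-in declares `strlen` from the C runtime already loaded into the interpreter; note that `restype` is a single attribute, and for a `void` C function it should be set to `None`.)

```python
import ctypes

# Stand-in for loading Cinterface.so: grab the C runtime that is
# already linked into the Python process.
libc = ctypes.CDLL(None)

# Declare the foreign function's signature, exactly as one would for
# Receive_Parameters / Perform_Calculation.
libc.strlen.argtypes = (ctypes.c_char_p,)
libc.strlen.restype = ctypes.c_size_t

n = libc.strlen(b"hello")
```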
and on the C side, the generic form of the functions is:
void Receive_Parameters (...list of C variable types...)
{
---take all data and parameters coming from Python---
return;
}
void Perform_Calculation ( )
{
#pragma omp parallel default(shared) num_threads(8) private(...)
{
    #pragma omp for schedule(static, 1) reduction(+:p)
    for (i = 0; i < N; i++)      /* omp for must be applied to a loop */
        p += core_calculation(...list of variables...);
}
return;
}
float core_calculation (...list of variables...)
{
----all calculations done here-----
}
I have the following questions and associated confusions:
Does Python have any control over the operation of the threads spawned by OpenMP inside the C function? I ask because the C function receives pointers to arrays allocated on the heap by Python. Can the OpenMP threads operate on these arrays in parallel without regard to where they were allocated?
Do I need to do anything in the Python code before calling the C function, say release the GIL, to allow OpenMP threads to be spawned in the C function? If so, how does one do that?
Do I have to release the GIL in the C function (before the OpenMP parallel block)?