I'm trying to use OpenMP to parallelize some code:
omp_set_num_threads( 8 );
#pragma omp parallel
for (int i = 0; i < verSize; ++i)
{
    #pragma omp single nowait
    {
        neighVec[i].index = i;
        mesh.getBoxIntersecTets(mesh.vertexList->at(i), &neighVec[i]);
    }
}
verSize is about 90k and getBoxIntersecTets is quite expensive, so I expected the code to fully utilize a quad-core CPU. However, CPU usage stays at only about 25%. Any ideas?
I also tried the omp parallel for construct, with the same result.
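For reference, here is a minimal self-contained sketch of the omp parallel for form, together with a check of how many threads a parallel region actually gets. It is not the real code: Neigh, expensiveWork, fillNeighbours and activeThreadCount are hypothetical stand-ins (expensiveWork replaces getBoxIntersecTets), and the schedule clause is only one plausible choice. The sketch assumes the iterations are independent of each other.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>   // only needed when OpenMP is enabled at compile time (e.g. -fopenmp)
#endif

// Hypothetical stand-in for the element type of neighVec.
struct Neigh {
    int index = -1;
    long result = 0;
};

// Hypothetical stand-in for getBoxIntersecTets: CPU-bound, writes nothing shared.
long expensiveWork(int i) {
    long acc = 0;
    for (int k = 0; k < 1000; ++k) acc += (static_cast<long>(i) * k) % 7;
    return acc;
}

// Each iteration writes only neighVec[i], so iterations can be split across threads.
std::vector<Neigh> fillNeighbours(int verSize) {
    std::vector<Neigh> neighVec(verSize);
    #pragma omp parallel for schedule(dynamic, 64)
    for (int i = 0; i < verSize; ++i) {
        neighVec[i].index = i;
        neighVec[i].result = expensiveWork(i);
    }
    return neighVec;
}

// Returns the number of threads in a parallel region (1 if OpenMP is not enabled).
int activeThreadCount() {
    int n = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        #pragma omp single
        n = omp_get_num_threads();
    }
#endif
    return n;
}
```

If activeThreadCount() returns 1 in the real build, the pragmas are most likely being ignored, for example because the OpenMP compiler flag is missing.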
getBoxIntersecTets uses STL unordered_set, vector and deque, but I guess OpenMP should be agnostic about them, right?
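On the container question, a small sketch of the pattern that is generally safe: each thread works on its own container instance, and any container shared between threads is only modified inside a critical section. collectDistinct is a hypothetical example and has nothing to do with the actual mesh code. It assumes buckets is greater than zero.

```cpp
#include <unordered_set>

// Hypothetical example: collect the distinct values of i % buckets for i in [0, n).
std::unordered_set<int> collectDistinct(int n, int buckets) {
    std::unordered_set<int> shared;              // modified by all threads, so guarded below
    #pragma omp parallel
    {
        std::unordered_set<int> local;           // private to each thread, no locking needed
        #pragma omp for nowait
        for (int i = 0; i < n; ++i) {
            local.insert(i % buckets);
        }
        // Concurrent writes to one container are a data race, so merge one thread at a time.
        #pragma omp critical
        shared.insert(local.begin(), local.end());
    }
    return shared;
}
```

If getBoxIntersecTets only touches containers it creates itself plus the neighVec[i] it is given, it would fall into the "local" case; if it writes to containers owned by mesh, that would be the "shared" case.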
Thanks.