0
votes

My OpenMP (C) parallel code is slower than the serial bitonic sort code and although I have done all the possible combinations I could think of, I can't find the problem.

Here is part of the code:

void recBitonicSortOpenMP( int lo, int cnt, int dir ) {
   if (  cnt >  1 ) {
         int k = cnt / 2;
         #pragma omp parallel sections 
         {

         #pragma omp section 
         recBitonicSortOpenMP( lo,     k, ASCENDING );

         #pragma omp section    
         recBitonicSortOpenMP( lo + k, k, DESCENDING );

         }
         bitonicMergeOpenMP( lo, cnt, dir );

   }

}

It is an implementation of Bitonic Sort.

2
For a proper explanation, the question requires a minimal reproducible example, system description, and actual performance observation. Generally - user tasks for parallelizing recursion.Zulan

2 Answers

2
votes

For recursive problems like this use OpenMP tasks, not nested parallelism and sections. http://cs.umw.edu/~finlayson/class/fall16/cpsc425/notes/24-sorting.html gives you a good example to follow...

0
votes

You should try to parallelize the imperative version of the algorithm. I think the recursive version is an inherently serial problem.

There are many pragmas useful to parallelize "For" cycles.

http://www.openmp.org/specifications/

Two version of the algorithm:

https://www.cs.duke.edu/courses/fall08/cps196.1/Pthreads/bitonic.c