I want to write parallel code using OpenMP with a reduction to compute the sum of squares of the values in an X*X matrix. Can I use two nested for loops after #pragma omp parallel for reduction? If not, please suggest an alternative.

#pragma omp parallel
{
    #pragma omp parallel for reduction(+:SqSumLocal)
    for(index=0; index<X; index++)
    {
        for(i=0; i<X; i++)
        {
            SqSumLocal = SqSumLocal + pow(InputBuffer[index][i],2);
        }
    }
}

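Solution: Declaring int i inside the #pragma omp parallel block solves the problem, because a variable declared inside the parallel region is private to each thread instead of being shared by all of them.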
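
For reference, here is a minimal sketch of that fix. The function name sum_of_squares, its parameters, and the flat row-major array layout are assumptions made for illustration; they are not from the original code. It declares the inner loop variable in the for statement so every thread gets its own copy, and it uses a single parallel for directive with no enclosing parallel block, since parallel for already starts a thread team on its own.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): sums the squares of an
   n x n matrix stored row-major in a flat array. */
double sum_of_squares(const double *m, int n)
{
    double sq_sum = 0.0;

    /* One combined directive: it creates the thread team and splits the
       outer loop among the threads. Each thread accumulates into its own
       copy of sq_sum, and the copies are added together after the loop. */
    #pragma omp parallel for reduction(+:sq_sum)
    for (int row = 0; row < n; row++) {
        /* col is declared inside the parallel region, so it is private. */
        for (int col = 0; col < n; col++) {
            double v = m[(size_t)row * n + col];
            sq_sum += v * v;  /* plain multiply instead of pow(v, 2) */
        }
    }
    return sq_sum;
}
```

With GCC this would be built with -fopenmp. Without that flag the pragma is ignored and the loop runs serially, which should give the same result.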

1 Answer

Nesting two for loops under the directive works, but it is not ideal: only the outer loop is parallelized, and each inner loop runs serially on whichever thread was given that value of index. If X is significantly larger than the number of threads, this may be fine. If you want to parallelize both loops, add a collapse(2) clause to the directive. This tells the compiler to merge the two loops into a single iteration space and divide all of it among the threads.
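
A rough sketch of what the collapse(2) version could look like is below. The function name sum_of_squares_collapsed, its parameters, and the flat row-major storage are assumptions for illustration only. Note that collapse requires the loops to be perfectly nested, so there is no statement between the two for lines.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post): sums the squares of an
   n x n matrix stored row-major in a flat array, with both loops collapsed
   into one parallel iteration space. */
double sum_of_squares_collapsed(const double *m, int n)
{
    double sq_sum = 0.0;

    /* collapse(2) treats the n * n (row, col) pairs as one loop and divides
       them among the threads; both loop variables are private. */
    #pragma omp parallel for collapse(2) reduction(+:sq_sum)
    for (int row = 0; row < n; row++) {
        for (int col = 0; col < n; col++) {
            double v = m[(size_t)row * n + col];
            sq_sum += v * v;
        }
    }
    return sq_sum;
}
```

Whether this is actually faster than parallelizing only the outer loop depends on X and the thread count, so it is worth measuring both.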

Consider an example with 8 threads and X=4. Without the collapse clause, only four threads do any work: each one handles a single value of index, which is 4 inner iterations. With the collapse clause, the 16 combined iterations can be spread across all 8 threads, so each thread does about 2 iterations, half as much work. (Of course, parallelizing such a trivial amount of work is pointless; this is just an example.)
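
The arithmetic in that example can be checked with a couple of small helpers. Both function names are hypothetical, and they assume the iterations are divided as evenly as possible; the actual distribution depends on the schedule the OpenMP runtime picks.

```c
/* Hypothetical helper: how many threads receive at least one iteration. */
int busy_threads(int iterations, int threads)
{
    return iterations < threads ? iterations : threads;
}

/* Hypothetical helper: the most iterations any single thread receives,
   assuming an even split (ceiling of iterations / threads). */
int max_iterations_per_thread(int iterations, int threads)
{
    return (iterations + threads - 1) / threads;
}
```

For X=4 and 8 threads, the uncollapsed loop has 4 parallel iterations, so busy_threads(4, 8) gives 4 and each busy thread runs one outer iteration of 4 inner steps. The collapsed loop has 16 iterations, so busy_threads(16, 8) gives 8 and max_iterations_per_thread(16, 8) gives 2.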