11
votes

When I use the cancel directive (available since OpenMP 4.0) to break out of a parallel loop within a parallel for construct, GCC 5.1 warns "'#pragma omp cancel for' inside 'nowait' for construct" for the following snippet.

const int N = 10000;

int main()
{
  #pragma omp parallel for
  for (int i = 0; i < N; i++) {
    #pragma omp cancel for  // <-- here
  }
}

http://coliru.stacked-crooked.com/a/de5c52da5a16c154

As a workaround, when I split it into separate parallel and for constructs, GCC accepts the code silently.

int main()
{
  #pragma omp parallel
  #pragma omp for
  for (int i = 0; i < N; i++) {
    #pragma omp cancel for
  }
}

But I don't understand why GCC warns in the former case, even though the construct has no 'nowait' clause. The OpenMP 4.0 API spec also says that parallel for is equivalent to a parallel construct immediately followed by a for construct.

2.10.1 Parallel Loop Construct

Description

The semantics are identical to explicitly specifying a parallel directive immediately followed by a for directive.

Is GCC's behavior correct, or is something wrong?

1
I think GCC should emit a better error message. The Intel compiler throws: "error: cancel for must be closely nested in a for region" for this case (which makes a little bit more sense). Although parallel for and parallel followed by a for are similar, a cancel construct allows only one clause. Methinks the compiler reads the clause that follows cancel and checks what the enclosing construct is; in your first example it is a parallel for and not a for, hence the compiler throws that error. Just my 2 cents. - Sayan
GCC gives a warning, Clang gives no warning, and ICC fails to compile in the first case. All three compilers compile without warning in the second case. Interesting. - Z boson
@Sayan, I'm not sure why ICC's error message is any better than GCC's warning. Intel seems to think the cancel is not inside a for loop, which it clearly is. It seems like this case software.intel.com/en-us/articles/cdiag1159 which the compiler gives the correct error for. - Z boson

1 Answer

6
votes

My guess is that your code

  #pragma omp parallel for
  for (int i = 0; i < N; i++) {
    #pragma omp cancel for
  }

is not equivalent to

  #pragma omp parallel
  {
  #pragma omp for
  for (int i = 0; i < N; i++) {
    #pragma omp cancel for
  }
  } //end of parallel region

In the latter case, there would be two barriers: one at the end of the for and one at the end of the parallel region; something equivalent to:

  #pragma omp parallel
  {
      #pragma omp for nowait
      for (int i = 0; i < N; i++) {
          #pragma omp cancel for
      }
      #pragma omp barrier
  } // and here another implicit barrier

But I guess that, for optimization purposes, the compiler may try to remove the redundant barrier at the end of the for when the two constructs are combined, and generate:

  #pragma omp parallel
  {
      #pragma omp for nowait
      for (int i = 0; i < N; i++) {
          #pragma omp cancel for
      }
  }

This is more 'optimal', but has the drawback of triggering the warning about mixing nowait and cancel.
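For completeness, here is a minimal sketch of the split parallel + for form doing some actual work with cancellation. The `contains` helper, its parameters, and the search task are hypothetical and not from the question; they only illustrate where `cancel for` and `cancellation point for` would go. Note that cancellation only takes effect at run time if the `OMP_CANCELLATION` environment variable is set to true; otherwise the cancel directives are ignored and the loop simply runs to completion, so the result is the same either way.

```cpp
#include <vector>

// Hypothetical helper (not from the question): returns true if `target`
// occurs anywhere in `data`. Uses the split "parallel" + "for" form, which
// the question reports GCC 5.1 accepts without the 'nowait' warning.
bool contains(const std::vector<int>& data, int target)
{
    bool found = false;
    const int n = static_cast<int>(data.size());

    #pragma omp parallel shared(found)
    {
        #pragma omp for
        for (int i = 0; i < n; i++) {
            if (data[i] == target) {
                // Several threads may find a match; make the store atomic.
                #pragma omp atomic write
                found = true;
                // Request cancellation of the innermost enclosing for region.
                #pragma omp cancel for
            }
            // Other threads notice a pending cancellation here and stop.
            #pragma omp cancellation point for
        }
    }
    return found;
}
```

Compile with `-fopenmp` on GCC and run with `OMP_CANCELLATION=true` to actually enable early exit.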