Vectorization of loops in OpenMP

Question

I am writing a program in C (a 2D Poisson solver) and I am using OpenMP to speed up a big for loop. What I observed is that inside an OpenMP parallel block, the for loop is not vectorized, even when I include the #pragma always vector directive. I am compiling with the PathScale compiler.

The code I want to vectorize looks like this:

    #pragma omp parallel shared(in, out, lambda, dim, C) private(k)
    {
        #pragma omp for schedule(guided, dim/nthreads) nowait
        for (k = 0; k < dim; k++) {
            in[k] = C * out[k] * lambda[k];
        }
    }

where out, lambda and in are double-precision arrays.

But even if I include #pragma always vector, the compiler responds with:

 warning: ignoring #pragma always vector

Do you know if there is any workaround for this?

Thanks.

I'm tempted to think that you won't get much from vectorizing/parallelizing that loop. There's so little work for a lot of memory access. — Mysticial
@Mysticial, this was part of an entry for a contest and it did help. :) — Konstantinos

Alexey Kukanov · Accepted Answer · 2012-03-11T21:02:46

I looked through the User Guide for the PathScale compiler and found neither #pragma always nor #pragma vector. So I think the compiler is just telling you that it does not recognize this pragma and is ignoring it.

However, in section 7.4.5 I found the following options, which should help you with vectorization:

Vectorization of user code ... is controlled by the flag -LNO:simd[=(0|1|2)], which enables or disables inner loop vectorization. 0 turns off the vectorizer, 1 (the default) causes the compiler to vectorize only if it can determine that there is no undesirable performance impact due to sub-optimal alignment, and 2 will vectorize without any constraints (this is the most aggressive).

-LNO:simd_verbose=ON prints vectorizer information (from vectorizing user code) to stdout.

As a side note (guessing where you might have gotten #pragma always vector from), Intel's compiler has #pragma vector, with always being one possible parameter to the pragma. But pragmas are generally compiler-specific, except for a few extensions (OpenMP being one) that are supported by multiple vendors.
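
For illustration only, here is a minimal sketch of the loop with no vendor-specific pragma, so that vectorization is left to the compiler flags quoted above. The function name scale_multiply, the restrict qualifiers, the static schedule, and the file name poisson.c in the comment are all assumptions of this sketch, not part of the original code. Whether a given compiler actually vectorizes the loop still has to be checked, for example with the -LNO:simd_verbose=ON output.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the question): in[k] = C * out[k] * lambda[k].
 *
 * Possible build line, assuming the pathcc driver and a source file named
 * poisson.c; add whatever switch your compiler version uses to enable OpenMP:
 *   pathcc -O3 -LNO:simd=2 -LNO:simd_verbose=ON poisson.c
 *
 * `restrict` promises the compiler that the three arrays do not overlap,
 * which removes one common obstacle to automatic vectorization.
 */
void scale_multiply(double *restrict in,
                    const double *restrict out,
                    const double *restrict lambda,
                    double C, int dim)
{
    int k;

    /* Assumption of this sketch: a static schedule, so each thread gets one
       contiguous block of iterations and walks it with unit stride. */
    #pragma omp parallel for schedule(static)
    for (k = 0; k < dim; k++) {
        in[k] = C * out[k] * lambda[k];
    }
}
```

Calling scale_multiply(in, out, lambda, C, dim) with the arrays from the question should produce the same values as the original loop. If the code is built without OpenMP enabled, the pragma is ignored and the loop simply runs serially.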