I am currently using icc (version 13.1.0.146) to compile C programs running in native mode on the Intel Xeon Phi coprocessor.

Consider the following two code fragments:

// fragment 1
array[pos]     += 1;
array[pos + 1] += 1;
array[pos + 2] += 1;
array[pos + 3] += 1;

// fragment 2
for (int i = 0; i < 4; ++i)
    array[pos + i] += 1;

Unfortunately, only the loop is vectorized automatically. However, if I compile for the x86 platform, icc also vectorizes the "unrolled" version.

Is there a way to tell icc to vectorize basic blocks when compiling for the Xeon Phi, too?

Any help is appreciated. Thanks in advance!

1 Answer

The transformation you are looking for here is "loop materialization", which creates a loop from a basic block. The resulting loops are short-running (few iterations) and have a very small loop body, so they are generally not good candidates for vectorization on the Intel(R) Xeon Phi(TM) Coprocessor. This is because we want a significant workload in the loop body, so that the overhead of creating the vector operands doesn't show up significantly in the overall execution time of the loop.
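
If you still want a loop in this spot, one workaround is to do the materialization by hand, i.e. write the basic block as a loop yourself. Below is a minimal sketch of that idea. The function names `increment4_unrolled` and `increment4_loop` are made up for illustration, and the `int` element type is an assumption, since the question does not show how `array` is declared. Whether the vectorized loop is actually faster on the coprocessor is not something this sketch can tell you; given the overhead described above, it may well not be.

```c
#include <assert.h>
#include <stddef.h>

/* Fragment 1 from the question: four independent statements in a
 * single basic block. Element type int is an assumption. */
void increment4_unrolled(int *array, size_t pos)
{
    assert(array != NULL);
    array[pos]     += 1;
    array[pos + 1] += 1;
    array[pos + 2] += 1;
    array[pos + 3] += 1;
}

/* Hand-materialized loop: same effect as the basic block above, but
 * expressed as a loop with a compile-time trip count of 4, which is
 * the form the question reports as being auto-vectorized. */
void increment4_loop(int *array, size_t pos)
{
    assert(array != NULL);
    for (size_t i = 0; i < 4; ++i)
        array[pos + i] += 1;
}
```

Both functions update the same four elements, so they can be swapped freely. It is worth checking the compiler's vectorization report (for example with icc's `-vec-report` option) and timing both variants before settling on one.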