2
votes

One particular hot spot when I profile the code I am working on is the following loop:

for(int loc = start; loc<end; ++loc)
    y[loc]+=a[offset+loc]*x[loc+d];

where the arrays y, a, and x do not overlap. It seems to me that a loop like this should be easy to vectorize; however, when I compile using g++ with the options "-O3 -ftree-vectorize -ftree-vectorizer-verbose=1", I get no indication that this particular loop was vectorized. In contrast, a loop occurring just before the code above:

for(int i=0; i<m; ++i)
    y[i]=0;

does get vectorized according to the output. Any thoughts on why the first loop is not vectorized, or how I might be able to fix this? (I am not all that educated on the concept of vectorization, so I am likely missing something quite obvious)

As per Oli's suggestion, turning up the verbosity yields the following notes (while I am usually good at reading compiler warnings/errors/output, I have no idea what this means):

./include/mv_ops.h:89: note: dependence distance  = 0.
./include/mv_ops.h:89: note: accesses have the same alignment.
./include/mv_ops.h:89: note: dependence distance modulo vf == 0 between *D.50620_89 and *D.50620_89
./include/mv_ops.h:89: note: not vectorized: can't determine dependence between *D.50623_98 and *D.50620_89
2
Why don't you work with pointers to a[offset] and x[d] instead? - K-ballo
Thanks for the suggestion K-ballo. Just did what you suggested, which makes things a bit more readable (though it has no effect on my vectorizing). - MarkD
@K-ballo: I doubt it makes a difference. - Oliver Charlesworth
Are those x, y, and a declared in that function as arrays or pointers? - MSN
@Oli Charlesworth: It's just a matter of personal preference. That's why it was a comment and not an answer. - K-ballo

2 Answers

7
votes

You need to tell the compiler that x, y, and a do not overlap. In C/C++ terms, that means telling the compiler that those pointers do not alias, by declaring them with restrict (a C99 keyword; in C++, g++ accepts the extensions __restrict or __restrict__). gcc optimizes aggressively once it assumes no aliasing, so be careful: if the pointers do in fact overlap, the behavior is undefined.
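As a rough sketch of what that could look like for the loop in the question: the function name axpy_like, its signature, and the element type double are assumptions made for illustration, since the question does not show how y, a, and x are declared.

```cpp
// Hypothetical wrapper around the loop from the question. The name,
// the parameter list, and the element type (double) are assumptions.
// __restrict__ is a GCC extension; it promises the compiler that, inside
// this function, y, a, and x never refer to overlapping memory.
void axpy_like(double* __restrict__ y,
               const double* __restrict__ a,
               const double* __restrict__ x,
               int start, int end, int offset, int d)
{
    for (int loc = start; loc < end; ++loc)
        y[loc] += a[offset + loc] * x[loc + d];
}
```

The promise is made by the caller, not checked by the compiler, so this only helps if every call site really passes non-overlapping arrays.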

4
votes

One possibility is that the compiler can't guarantee that there are no aliases. In other words, how can the compiler be sure that y, a and x don't overlap in some way?

If you turn the verbosity level up, you may get some extra info.
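To see why the compiler has to be cautious about aliasing, here is a small sketch. It is a simplified model, not what gcc actually emits: offset and d are dropped, the element type double is assumed, and blocked_loop is a hypothetical stand-in for a vectorized loop with a vector width of 4, where all loads in a block happen before any store.

```cpp
// Scalar reference: the loop from the question with offset and d removed
// (a simplification; element type double is an assumption).
void scalar_loop(double* y, const double* a, const double* x, int n)
{
    for (int loc = 0; loc < n; ++loc)
        y[loc] += a[loc] * x[loc];
}

// Hypothetical model of a vectorized loop with vector width 4:
// every element of a block is loaded and computed before any is stored.
void blocked_loop(double* y, const double* a, const double* x, int n)
{
    const int vf = 4;
    for (int base = 0; base < n; base += vf) {
        double tmp[vf];
        int len = (n - base < vf) ? n - base : vf;
        for (int k = 0; k < len; ++k)   // models the vector load and multiply-add
            tmp[k] = y[base + k] + a[base + k] * x[base + k];
        for (int k = 0; k < len; ++k)   // models the vector store
            y[base + k] = tmp[k];
    }
}
```

With a buffer buf = {1, 1, 1, 1, 1}, a = {1, 1, 1, 1}, x = buf, y = buf + 1, and n = 4, scalar_loop leaves buf as {1, 2, 3, 4, 5}, while blocked_loop leaves it as {1, 2, 2, 2, 2}. When y and x do not overlap, the two functions agree. Unless the compiler can rule out the overlapping case, it cannot safely reorder the loads and stores.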