
I'm using the Eigen library to do some computation on an iPad 2 (i.e. Cortex-A9). It seems that some operations are vectorized using NEON instructions, while others aren't.

Operations that I've tried that get vectorized: dot products, vector and matrix additions and subtractions.

Operations that don't get vectorized: matrix multiplication.

I'm using these operations inside the same project and same file, so the compiler options are the same. I'm using -O3 -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp.

All matrices that I'm using have Dynamic sizes. Is there anything I'm doing wrong, or is this the expected behaviour?
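To pin down which operations the compiler vectorizes, one approach is to compare against plain-loop equivalents of the same three operations and inspect the generated assembly (e.g. `clang++ -O3 -S`). A minimal sketch — function names and the row-major layout are illustrative, not taken from the original project:

```cpp
#include <cstddef>

// Plain-loop equivalents of the Eigen operations in question,
// useful for inspecting what the compiler's auto-vectorizer does
// under the same flags.

float dot(const float* a, const float* b, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        acc += a[i] * b[i];  // floating-point reduction
    return acc;
}

void vec_add(const float* a, const float* b, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];  // elementwise, trivially vectorizable
}

// Naive row-major product: c (m x p) = a (m x n) * b (n x p).
void matmul(const float* a, const float* b, float* c,
            std::size_t m, std::size_t n, std::size_t p) {
    for (std::size_t i = 0; i < m; ++i)
        for (std::size_t j = 0; j < p; ++j) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n; ++k)
                acc += a[i * n + k] * b[k * p + j];
            c[i * p + j] = acc;
        }
}
```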

Thanks.

How precisely are you coming to the conclusion that some operations are vectorised and others not? Inspection of the code? Both GCC and Clang emit NEON instructions for floating-point operations on NEON-equipped Cortex A-series parts. Also, are you sure you really want -mfloat-abi=softfp for iOS? This is common in Linux-land, where people like building software that is compatible with lots of different ARM arch versions - but it carries a nasty run-time penalty. Apple opts for fat binaries instead. - marko
I'm using Xcode Instruments to check the assembler code. For a dot product I see a bunch of vadd and vmov, but not for the matrix multiplication. Also, the dot product results in a big improvement over the OpenCV function (roughly 50%), however the matrix multiplication does not. - user1906

1 Answer


When you use -mfpu=neon, gcc/clang will vectorize integer operations, but not floating-point ones, because NEON is not fully IEEE 754 compliant (it doesn't support denormal numbers; it flushes them to zero). You have to specify -ffast-math to make gcc/clang vectorize floating-point code with NEON. However, you must be careful, as -ffast-math can affect the numerical results.
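The effect is easiest to see on a floating-point reduction: under strict IEEE semantics the compiler may not reassociate the additions across vector lanes, so the loop stays scalar. A sketch, using the flags from the question plus -ffast-math (GCC also accepts the narrower -funsafe-math-optimizations, which may be enough if full fast-math is too aggressive):

```cpp
#include <cstddef>

// Compile both ways and diff the assembly:
//   clang++ -O3 -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp -S sum.cpp -o strict.s
//   clang++ -O3 -ffast-math -mcpu=cortex-a9 -mfpu=neon -mfloat-abi=softfp -S sum.cpp -o fast.s
// Only the fast-math build is expected to use NEON q-register
// vadd.f32/vmul.f32 for this loop, because vectorizing the reduction
// requires reassociating the additions.

float sum(const float* x, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        acc += x[i];  // sequential dependence on acc blocks strict-IEEE vectorization
    return acc;
}
```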