I want to do some manual NEON optimization using inline NEON assembly inside C++ functions. The target is an ARM Cortex-A9 (i.MX6Q).

When it comes to choosing the correct compiler flags, I got a bit confused by -mfpu. My goal is to use the hardware FPU for floating-point operations and use NEON only from ASM code.

  1. Is it safe to assume that by setting -mfpu=vfpv3, the NEON coprocessor is still accessible by calling ASM neon instructions?

  2. By setting -mfpu=neon-fp16, will the FPU core be unused?

  3. Will the FPU outperform NEON for non-vectorized floating-point operations?

1) ASM code can access NEON instructions if they're in the processor; the compiler doesn't affect that. 2) That option enables hardware half-precision conversion instructions. 3) NEON doesn't have 'non-vectorized' instructions, so your question doesn't make sense. – BitBank
1) I don't think so: with "-mfloat-abi=soft", the compiler will cause code to be executed on an emulated FPU. The code can't decide where it will be executed. 2) I am talking about the first part, that is, "neon". The point is: will the hard FPU core go unused, since with this flag FP will be offloaded to the NEON core? 3) Can you explain what a "non-vectorized instruction" is? I know about vectorized data. That is, instructions can be executed on any type of data. Am I correct? – Malek
1) If you're writing inline ASM, the compiler can't remove NEON instructions. 2) GCC almost never emits NEON instructions; this you will have to test. 3) NEON instructions only operate on 128-bit VECTOR registers. You can't do a single FP operation in NEON, so your question doesn't make sense. – BitBank

1 Answer

1) No. GCC will pass the -mfpu value to the assembler, and the assembler will refuse to assemble your code, whether you are using inline asm or separate assembler files:

$ cat foo.s
    vmov q1, q2
$ gcc foo.s -c -mfpu=vfpv3
foo.s: Assembler messages:
foo.s:1: Error: selected FPU does not support instruction -- `vmov q1,q2'

2) No, -mfpu=neon-fp16 in GCC also enables the use of instructions from the VFPv3 instruction set.
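One way to see what a given combination of -mfpu and -mfloat-abi actually enables is to inspect the compiler's predefined macros. The following is only a rough sketch: `fpu_config` is a hypothetical helper, and it assumes the toolchain defines `__ARM_NEON` (or the older `__ARM_NEON__`), `__ARM_PCS_VFP`, and `__SOFTFP__` the way GCC for ARM usually does. On a non-ARM compiler none of these are expected to be set.

```cpp
#include <string>

// Hypothetical helper: reports which floating-point features the compiler
// was told to target, based on predefined macros (assumed GCC/ACLE names).
std::string fpu_config() {
    std::string s;
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    s += "neon ";            // NEON (Advanced SIMD) enabled via -mfpu
#endif
#if defined(__ARM_PCS_VFP)
    s += "hard-float-abi ";  // floats passed in VFP registers (-mfloat-abi=hard)
#endif
#if defined(__SOFTFP__)
    s += "soft-float ";      // floating point done in software (-mfloat-abi=soft)
#endif
    if (s.empty()) s = "none-detected";
    return s;
}
```

Printing the result from builds with -mfpu=vfpv3 and -mfpu=neon-fp16 should make the difference between the two settings visible.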

3) I am not sure what you mean by this question. The scalar versions of floating-point instructions are in the various revisions of the VFP instruction sets, and the vector versions are in the NEON (Advanced SIMD) instruction set.
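To illustrate the scalar/vector split, here is a minimal sketch of a four-float addition. The function name `add4` is made up for this example. The inline asm branch assumes a 32-bit ARM build with GCC-style inline assembly and NEON enabled (for example -mfpu=neon); in any other build the plain loop is used, which a compiler targeting -mfpu=vfpv3 would typically turn into scalar VFP instructions.

```cpp
// Hypothetical example: out[i] = a[i] + b[i] for i in 0..3.
void add4(const float* a, const float* b, float* out) {
#if defined(__arm__) && (defined(__ARM_NEON) || defined(__ARM_NEON__))
    // Vector path: one NEON add handles all four lanes at once.
    asm volatile(
        "vld1.32 {d0, d1}, [%[a]]   \n\t"  // q0 = a[0..3]
        "vld1.32 {d2, d3}, [%[b]]   \n\t"  // q1 = b[0..3]
        "vadd.f32 q0, q0, q1        \n\t"  // q0 = q0 + q1
        "vst1.32 {d0, d1}, [%[out]] \n\t"  // out[0..3] = q0
        :
        : [a] "r"(a), [b] "r"(b), [out] "r"(out)
        : "q0", "q1", "memory");
#else
    // Scalar path: one addition per element.
    for (int i = 0; i < 4; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

As the assembler error in point 1 shows, the asm branch only assembles when the -mfpu value includes NEON.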