1
votes

I'm trying to compile a program for a Raspberry Pi 2B (ARMv7 / NEON), but I get an error from some inline assembly:

Error: VFP single precision register expected -- `vstmia.64 r9,{d16-d31}'

The code is:

asm volatile (
        "vstmia.64 %[reg]!, {d0 - d15} @ read all regs\n\t"
        "vstmia.64 %[reg], {d16 - d31} @ read all regs\n\t"
        ::[reg] "r" (&vregs):
);

Funny thing is that it doesn't complain about the first vstmia. I tried a single {d0 - d31} first and thought maybe there were too many 64-bit registers, but that's obviously not the problem. vregs is 8-byte-aligned storage.

I'm using arm-linux-gnueabihf-gcc 4.8.3, with this command line:

arm-linux-gnueabihf-gcc -mcpu=cortex-a7 -marm -O2 -g -std=gnu11 -MMD -MP -MF"ARM_decode_table.d" -MT"ARM_decode_table.o" -c -o "ARM_decode_table.o" "../ARM_decode_table.c"

1
What -mfpu= option are you passing? - Notlikethat
arm-linux-gnueabihf-gcc -mcpu=cortex-a7 -marm -O2 -g -std=gnu11 -MMD -MP -MF"ARM_decode_table.d" -MT"ARM_decode_table.o" -c -o "ARM_decode_table.o" "../ARM_decode_table.c" - turboscrew
OK, if you don't specify an FPU you'll get whatever default the compiler was configured with (you can check GCC's configuration with -v). I'm gonna throw out a wild guess that that happens to be vfpv3-d16 ;) - Notlikethat
-v gave me (among a lot of other stuff) --with-fpu=vfp --with-float=hard. Should I change the FPU to neon-vfpv4? - turboscrew

1 Answer

4
votes

By not specifying an appropriate -mfpu option, you get whatever FPU support the compiler's default configuration provides. From your configuration in this case, that is --with-fpu=vfp, which means crusty old VFPv2 with only 16 D registers (d0-d15) overlaying the 32 S registers. Thus the first instruction, targeting d0-d15, is fine, but the assembler refuses to assemble the second instruction, which it knows won't work on the chosen target.

For Cortex-A7 with NEON, -mfpu=neon-vfpv4 will let the toolchain know that it can let rip and use everything you have available.
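
To make the dependency on the FPU setting explicit in the source, the asm could be guarded by a preprocessor check. The following is only a minimal sketch under a few assumptions: save_d_regs is a hypothetical helper name, not something from the question, and the guard relies on GCC predefining __ARM_NEON (or the older __ARM_NEON__) when a NEON FPU is selected. The guard is conservative, since a non-NEON FPU with 32 D registers (e.g. -mfpu=vfpv3) would also take the fallback branch. The sketch also passes the pointer as a "+r" operand and adds a "memory" clobber, because the first vstmia writes the base register back and both instructions store to memory the compiler cannot otherwise see.

```c
#include <stdint.h>

/* Hypothetical helper, not from the question: stores d0-d31 into out[]
 * and returns how many D registers were written (0 if the guard fails). */
int save_d_regs(uint64_t out[32])
{
#if defined(__arm__) && (defined(__ARM_NEON) || defined(__ARM_NEON__))
    uint64_t *p = out;  /* "+r": the first vstmia writes the pointer back */
    __asm__ volatile (
        "vstmia.64 %[reg]!, {d0 - d15}  @ store d0-d15, advance pointer\n\t"
        "vstmia.64 %[reg], {d16 - d31}  @ store d16-d31\n\t"
        : [reg] "+r" (p)
        :
        : "memory");    /* the asm writes to out[] behind the compiler's back */
    return 32;
#else
    /* NEON not enabled for this build (or not a 32-bit ARM target): skip
     * the asm rather than emit an instruction the assembler would reject. */
    (void)out;
    return 0;
#endif
}
```

With the command line from the question, adding -mfpu=neon-vfpv4 after -mcpu=cortex-a7 should select the first branch; without it, a toolchain configured with --with-fpu=vfp would be expected to fall through to the second branch instead of failing in the assembler.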