5
votes

I have written a program with AVX intrinsics that works well on Ubuntu 12.04 LTS with GCC 4.6, using the following compilation line: g++ -g -Wall -mavx ProgramName.cc -o ProgramName

The problem started when I updated the compiler to versions 4.7 and 4.8.1 to get the 16-bit AVX2 intrinsics, which are not supported in GCC 4.6.

Currently, the updated GCC compiles both the AVX and AVX2 programs without errors. However, when I run the program I get the following error: Illegal instruction (core dumped), although it worked with GCC 4.6.

My question is: what is the proper way to compile and run both AVX and AVX2 intrinsics?

2
I compile the AVX2 program using -mavx2 instead of -mavx - MROF
Are you sure your processor supports AVX2? Presently only Haswell chips do - Marat Dukhan
@MaratDukhan Broadwell is out too, you know - harold
AVX2 is not in that list. - harold
There are basically 2 solutions: 1) don't use AVX2, or 2) buy a new computer. As a potential 3rd, have someone else run it, but that makes debugging hard and performance tuning even harder. - harold

2 Answers

8
votes

If you tell gcc to use AVX2, it will do so, regardless of whether your CPU supports it or not. That can be useful for cross-compiling or for examining gcc's code generation, but it's not particularly helpful for running programs. If your program crashes with an illegal instruction exception, the most likely cause is that your CPU does not support the AVX2 extension.
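One way to confirm this from inside the program is to ask the CPU before any AVX2 code runs. The following is a minimal sketch, assuming GCC 4.8 or later, which provides the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins on x86 targets. The function names `cpu_has_avx2` and `require_avx2` are hypothetical and not part of the original program.

```cpp
#include <cstdio>

// Returns true when the CPU running this program reports AVX2 support.
// __builtin_cpu_supports is a GCC builtin for x86 targets.
bool cpu_has_avx2() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();  // safe to call more than once
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;  // not an x86 CPU, so no AVX2
#endif
}

// Hypothetical guard: returns 0 if AVX2 code may run, 1 otherwise.
int require_avx2() {
    if (!cpu_has_avx2()) {
        std::fprintf(stderr, "This CPU does not support AVX2.\n");
        return 1;
    }
    return 0;
}
```

Calling `require_avx2()` at the top of `main` and returning early on a non-zero result would report a readable message instead of crashing. This only helps if the check itself lives in a file compiled without -mavx2.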

On i386 and x86-64 platforms (and in certain other circumstances), you can specify the gcc option -march=native to generate code for the host machine's instruction set. The compiled code might not work on another machine with fewer capabilities, but it should allow you to use all the features of your machine.
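To see what -march=native actually enabled, you can inspect the macros that GCC predefines. As a rough sketch: GCC defines `__AVX__` when AVX code generation is active and `__AVX2__` when AVX2 is, whether the option was given directly or implied by -march=native. The helper name `compiled_avx_level` is made up for illustration.

```cpp
#include <string>

// Reports the widest AVX level the compiler was allowed to use for this
// translation unit, based on GCC's predefined macros.
std::string compiled_avx_level() {
#if defined(__AVX2__)
    return "AVX2";   // -mavx2, or -march=native on an AVX2-capable host
#elif defined(__AVX__)
    return "AVX";    // -mavx, or -march=native on an AVX-only host
#else
    return "none";   // no AVX code generation enabled
#endif
}
```

If this returns "AVX" when built with -march=native, the host has no AVX2, and code that calls AVX2 intrinsics unconditionally would not be expected to compile with that option.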

While -march=native is a good solution for generating executables, it does not actually help much with writing code; you still need to tailor the intrinsics for the target's architecture, and writing code which can take advantage of CPU features without relying on them gets complicated. I don't know of a good C solution, but there are several C++ template frameworks available.
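For what it's worth, one GCC-specific way to use a feature without relying on it is to compile a single function for AVX2 and choose it at run time. This is only a sketch under assumptions: it uses GCC's `target` function attribute and `__builtin_cpu_supports` (GCC 4.8 or later), the function names are hypothetical, and it relies on the compiler vectorising a plain loop instead of using hand-written intrinsics.

```cpp
#include <cstddef>

// Portable fallback: built with the default flags, runs on any CPU.
static void add_scalar(const int* a, const int* b, int* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}

#if defined(__x86_64__) || defined(__i386__)
// Same loop, but GCC may emit AVX2 instructions inside this one function
// only, so the file itself can be built without -mavx2.
__attribute__((target("avx2")))
static void add_avx2(const int* a, const int* b, int* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];
}
#endif

// Hypothetical entry point: picks an implementation at run time.
void add_arrays(const int* a, const int* b, int* out, std::size_t n) {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2")) {
        add_avx2(a, b, out, n);  // only reached on AVX2-capable CPUs
        return;
    }
#endif
    add_scalar(a, b, out, n);
}
```

The same binary then runs on machines with and without AVX2, at the cost of maintaining two code paths.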

1
votes

Upgrading to gcc 4.8 likely pulled in AVX512, so you would need to restrict the generated instruction mix to only AVX2 for your machine.