For detecting instruction set features there are two source files I reference:
- Mysticial's cpu_x86.cpp
- Agner Fog's instrset_detect.cpp
Both of these files will tell you how to detect SSE through AVX2 as well as XOP, FMA3, FMA4, if your OS supports AVX and other features.
I am used to Agner's code (one source file for MSVC, GCC, Clang, ICC) so let's look at that first.
Here are the relevant code fragments from instrset_detect.cpp
for detecting AVX:
iset = 0; // default value
int abcd[4] = {0,0,0,0}; // cpuid results
cpuid(abcd, 0); // call cpuid function 0
//....
iset = 6; // 6: SSE4.2 supported
if ((abcd[2] & (1 << 27)) == 0) return iset; // no OSXSAVE
if ((xgetbv(0) & 6) != 6) return iset; // AVX not enabled in O.S.
if ((abcd[2] & (1 << 28)) == 0) return iset; // no AVX
iset = 7; // 7: AVX supported
with xgetbv
defined as
// Define interface to xgetbv instruction
static inline int64_t xgetbv (int ctr) {
#if (defined (_MSC_FULL_VER) && _MSC_FULL_VER >= 160040000) || (defined (__INTEL_COMPILER) && __INTEL_COMPILER >= 1200) // Microsoft or Intel compiler supporting _xgetbv intrinsic
return _xgetbv(ctr); // intrinsic function for XGETBV
#elif defined(__GNUC__) // use inline assembly, Gnu/AT&T syntax
uint32_t a, d;
__asm("xgetbv" : "=a"(a),"=d"(d) : "c"(ctr) : );
return a | (uint64_t(d) << 32);
#else // #elif defined (_WIN32) // other compiler. try inline assembly with masm/intel/MS syntax
//see the source file
}
I did not include the cpuid
function (see the source code) and I removed the non GCC inline assembly from xgetbv
to make the answer shorter.
Here is the detect_OS_AVX()
from Mysticial's cpu_x86.cpp
for detecting AVX:
bool cpu_x86::detect_OS_AVX(){
// Copied from: http://stackoverflow.com/a/22521619/922184
bool avxSupported = false;
int cpuInfo[4];
cpuid(cpuInfo, 1);
bool osUsesXSAVE_XRSTORE = (cpuInfo[2] & (1 << 27)) != 0;
bool cpuAVXSuport = (cpuInfo[2] & (1 << 28)) != 0;
if (osUsesXSAVE_XRSTORE && cpuAVXSuport)
{
uint64_t xcrFeatureMask = xgetbv(_XCR_XFEATURE_ENABLED_MASK);
avxSupported = (xcrFeatureMask & 0x6) == 0x6;
}
return avxSupported;
}
Mystical apparently came up with this solution from this answer.
Notice that both source files do basically the same thing: check the OSXSAVE bit 27, check the AVX bit 28 from CPUID, check a result from xgetbv
.
sysctl -a | grep -i avx
to see CPU features. – Mark Setchell