
I'm working with a system implemented in C++ with OpenMP, and it uses STL and Eigen data structures all over the place. Algorithmically, the code looks like a great candidate for acceleration with the new Intel MIC (Xeon Phi) cards.

A typical parallel loop in the code looks like this:

#pragma omp parallel for private(i)
    for (i = 0; i < n; ++i) {
        computeIntensiveFunction(some_STL_or_eigen_container[i]);
    }

The above pseudocode runs with reasonable performance, but it'd be great to offload some of it to the Xeon Phi card. Here's my attempt at doing this:

#pragma offload target(mic)     // <---- NEW
#pragma omp parallel for private(i)
    for (i = 0; i < n; ++i) {
        computeIntensiveFunction(some_STL_or_eigen_container[i]);
    }

However, the Intel ICC/ICPC compiler spits out an error like this:

    error: function "computeIntensiveFunction" called in offload region must have been declared with compatible "target" attribute

It seems that complaints like this appear for functions and data structures that involve STL or Eigen.

Any thoughts on how to get around this?

I'm new to using Xeon Phi (recovering CUDA programmer), so I don't entirely understand the boundaries for "what can be offloaded?"

Hmm, it seems like the flag -offload-attribute-target=mic might be part of the solution here. – solvingPuzzles
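For what it's worth, that switch just goes on the Intel compile line; a minimal sketch (the source file name and the -openmp flag are assumptions, not from the thread):

    icpc -openmp -offload-attribute-target=mic main.cpp -o app

It flags every file-scope function and variable in the file with the offload target(mic) attribute, which can save you from annotating each declaration by hand.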

1 Answer


You need something like:

void __attribute__((target(mic))) computeIntensiveFunction(std::vector<sometype> myvar);

declared in your source (and the same attribute on the function's definition). This builds a MIC-side version of the function so that it can be called from an offload region.
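To make that concrete, here is a minimal sketch of how the pieces might fit together; the container element type and the body of computeIntensiveFunction are placeholders, and actually moving an STL or Eigen container's data onto the card is a separate data-transfer question not covered here:

// Mark the headers for MIC compilation too, so the STL types the
// function uses exist on the coprocessor side as well.
#pragma offload_attribute(push, target(mic))
#include <vector>
#include <cmath>
#pragma offload_attribute(pop)

// The target(mic) attribute on the definition makes the compiler emit a
// MIC-side version of the function, which resolves the offload error.
__attribute__((target(mic)))
void computeIntensiveFunction(std::vector<float> &v)   // placeholder signature
{
    for (std::size_t j = 0; j < v.size(); ++j)
        v[j] = std::sqrt(v[j]);                        // stand-in for the real work
}

The same push/pop pragmas can wrap any header whose contents you need inside the offload region (for example the Eigen headers), which is usually less painful than annotating every declaration individually.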