1
votes

I have an openmp code written in C. I executed the code on Intel MIC on Stampede. I want to profile the code to find the hotspots in the code so that it will be helpful for me to optimize the code further. I tried to use the profiler gprof but I read somewhere that gprof cannot be used on MIC directly. I tried to use perf by going through tutorial. I could go till a certain step after which when the perf annotate step comes and I execute the code, it gives me the error ")" unexpected. So I am not knowing how to proceed to profile my code. Can anybody please help ?? This is the site where I referred to the perf tutorial : sandsoftwaresound.net/perf/perf-tutorial-hot-spots/ .

2

2 Answers

1
votes

80% of optimization for the Xeon Phi is the same as for the host (Xeon). Use gprof, printf, compiler options, and the rest of your toolkit and carry your optimization as far as you can executing your code on the host only. After you can do no more, then focus on specific Xeon Phi optimizations.

As you are on Stampede, I assume you are using the Intel compiler. The compiler has a lot of diagnostic capabilities to profile your code and even provide suggestions. I'd provide you with more specific URLs but am on vacation with limited bandwidth.

Though this isn't specific to your question, here are some other suggestions. If you aren't, you'll most likely get a substantial boost using it. Intel compilers are danged good at optimizations, especially on Intel architectures. Also, you should use Intel MKL where possible. All of MKL's routines are optimized for the different IA architectures, and the most relevant to HPC are optimized specifically for MIC.

0
votes

You have a few options.

The heavyweight approach is to use Intel Vtune. Firstly add -g to your compiler flags. I use Vtune from the host command line quite a bit, here is the command I use to profile an application on the MIC. (This is executed on the host machine, Vtune on the host uses ssh to launch the application on the MIC.)

amplxe-cl -collect knc-hotspots -source-search-dir=/mysrc/dir -search-dir=/mybin/dir -- ssh mic0 /home/me/myapp 

Assume the app on the MIC is at /home/me/myapp, and the source dir and source search dir on the host. (With Vtune update 15 at least, I need to specify both of these separately in order to get the Vtune GUI to show me symbol info)

Once your app has finished, run the Vtune GUI on the host with amplxe-gui and open your result set.

There are also some simplified open source profiling tools developed by Intel that support the MIC, Speedometer and Overhead, you can find information about them here

Hopefully this is enough info to get you started.