This question is related to LLVM/clang.
I already know how to compile opencl-kernel-file(.cl) using OpenCL API ( clBuildProgram() and clGetProgramBuildInfo() )
my question is this:
How to compile opencl-kernel-file(.cl) to LLVM IR with OpenCL 1.2 or higher?
In the other words, How to compile opnecl-kernel-file(.cl) to LLVM IR without libclc?
I have tried various methods to get LLVM-IR of OpenCL-Kernel-File.
I first followed the clang user manual.(https://clang.llvm.org/docs/UsersManual.html#opencl-features) but it did not run.
Secondly, I found a way to use libclc.
commands is this:
clang++ -emit-llvm -c -target -nvptx64-nvidial-nvcl -Dcl_clang_storage_class_specifiers -include /usr/local/include/clc/clc.h -fpack-struct=64 -o "$@".bc "$@" <br>
llvm-link "$@".bc /usr/local/lib/clc/nvptx64--nvidiacl.bc -o "$@".linked.bc <br>
llc -mcpu=sm_52 -march=nvptx64 "$@".linked.bc -o "$@".nvptx.s<br>
This method worked fine, but since libclc was built on top of the OpenCL 1.1 specification, it could not be used with OpenCL 1.2 or later code such as code using printf.
And this method uses libclc, which implements OpenCL built-in functions in the shape of new function. You can observe that in the assembly(ptx) of result opencl binary, it goes straight to the function call instead of converting it to an inline assembly. I am concerned that this will affect gpu behavior and performance, such as execution time.
So now I am looking for a way to replace compilation using libclc.
As a last resort, I'm considering using libclc with the NVPTX backend and AMDGPU backend of LLVM.
But if there is already another way, I want to use it.
(I expect that the OpenCL front-end I have not found yet exists in clang)
My program's scenarios are:
- There is opencl kernel source file(.cl)
- Compile the file to LLVM IR
- IR-Level process to the IR
- Compile(using llc) the IR to Binary
- with each gpu targets(nvptx, amdgcn..)
- Using the binary, Run host(.c or .cpp with lib OpenCL) with clCreateProgramWithBinary()
Now, When I compile kernel source file to LLVM IR, I have to include header of libclc(-include option in first one of above command) for compiling built-in functions. And I have to link libclc libraries before compile IR to binary
My environments are below:
- GTX960
- NVIDIA's Binary appears in nvptx format
- I'm using sm_52 nvptx for my gpu. - Ubuntu Linux 16.04 LTS
- LLVM/Clang 5.0.0
- If there is another way, I am willing to change the LLVM version.
Thanks in advice!