What is the minimum compute capability for CUDA compilation supported by the LLVM compiler?

Question

A CUDA source file can be compiled to PTX with the LLVM compiler (clang) using the command

    clang -Xclang -I$LIBCLC/include/generic -I$LIBCLC/include/ptx -Dcl_clang_storage_class_specifiers -O3 cudaFile.cu -S -o ptxOutputFile.ptx --cuda-gpu-arch=sm_XX

Here sm_XX can be replaced with sm_20 or sm_30. For compute capability 1.0, replacing sm_XX with sm_10 produces the error

    fatal error: cannot open file '/tmp/shared-25f2f5.s': No such file or directory
    1 error generated.

So it seems that LLVM has a minimum compute capability of 2.0. Is this assumption correct?

If you are using CUDA: nvcc uses the LLVM-derived backend for targets with compute capability >= 2.0 and the Open64-derived backend for GPUs with compute capability 1.x. Note that support for sm_1x was removed from CUDA (and the NVIDIA drivers) some time ago. (comment by njuffa)

kangshiyin · Accepted Answer · 2016-05-31T08:46:33

That should be correct. As of CUDA 7.0, both the toolkit and the driver have dropped support for sm_1x. If sm_20 works, it has to be the minimum.

The CUDA 7.0 release notes state, under the heading "CUDA Toolkit and CUDA Driver Support for Tesla Architecture":

"The CUDA Toolkit and CUDA Driver no longer supports the sm_10, sm_11, sm_12, and sm_13 architectures. As a consequence, CU_TARGET_COMPUTE_1x enum values have been removed from the CUDA headers."

http://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Toolkit_Release_Notes.pdf
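If you want the source file itself to document this minimum, one option is a preprocessor guard on __CUDA_ARCH__, the macro that encodes the target architecture during device compilation (200 for sm_20, 350 for sm_35). The following is only a sketch: MIN_SM, archMacroValue and isSupportedArch are hypothetical names made up for illustration, and the sm_20 floor is the assumption from the answer above rather than something checked against a particular clang release. With a toolchain that already rejects sm_1x the guard may never fire, but it states the requirement in the code.

```cpp
// Sketch only: a compile-time guard for the minimum GPU architecture.
// MIN_SM, archMacroValue and isSupportedArch are hypothetical names,
// not part of CUDA or clang.
#include <cassert>

// Assumed minimum architecture, taken from the answer above: sm_20.
constexpr int MIN_SM = 20;

// __CUDA_ARCH__ encodes compute capability X.Y as X*100 + Y*10,
// so sm_20 corresponds to 200 and sm_35 to 350.
constexpr int archMacroValue(int sm) { return sm * 10; }

// True when an sm_XX number is at or above the assumed minimum.
constexpr bool isSupportedArch(int sm) { return sm >= MIN_SM; }

// __CUDA_ARCH__ is defined only while device code is being compiled,
// so this guard is skipped in the host pass and in plain C++ builds.
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ < 200)
#error "Device code requires compute capability 2.0 (sm_20) or higher"
#endif

static_assert(archMacroValue(MIN_SM) == 200,
              "sm_20 should map to __CUDA_ARCH__ == 200");
```

Because the guard is pure preprocessor logic, the same lines can sit at the top of a .cu file; the two helper functions are only there to make the sm_XX to __CUDA_ARCH__ mapping explicit.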
