A CUDA source file can be compiled to PTX with the LLVM compiler (clang) using the command:
clang -Xclang -I$LIBCLC/include/generic -I$LIBCLC/include/ptx -Dcl_clang_storage_class_specifiers -O3 cudaFile.cu -S -o ptxOutputFile.ptx --cuda-gpu-arch=sm_XX
where sm_XX can be replaced with, for example, sm_20 or sm_30. For compute capability 1.0, replacing sm_XX with sm_10 produces the error:
fatal error: cannot open file '/tmp/shared-25f2f5.s': No such file or directory
1 error generated.
So it seems that LLVM requires a minimum compute capability of 2.0. Is this assumption correct?
nvcc will use the LLVM-derived backend for targets with compute capability >= 2.0, and the Open64-derived backend for GPUs with compute capability 1.x. Note that support for sm_1x was removed from CUDA (and NVIDIA drivers) some while back. – njuffa
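One way to check which targets a given clang installation accepts is to try each architecture in turn. The following is only a minimal sketch: the helper name probe_archs, the CLANG variable, and the output file naming are hypothetical and not part of the original question. It also assumes that --cuda-device-only together with -S is enough to emit device-side PTX on your setup.

```shell
#!/bin/sh
# Hypothetical helper: try to emit PTX for each architecture and report the result.
# CLANG is an assumption; point it at the clang binary you want to probe.
CLANG="${CLANG:-clang}"

probe_archs() {
    src="$1"; shift
    for arch in "$@"; do
        # --cuda-device-only with -S asks clang for device-side assembly (PTX) only.
        if "$CLANG" -O3 --cuda-device-only --cuda-gpu-arch="$arch" \
               -S "$src" -o "${src%.cu}-$arch.ptx" 2>/dev/null; then
            echo "$arch: ok"
        else
            echo "$arch: failed"
        fi
    done
}
```

It could be called as probe_archs cudaFile.cu sm_20 sm_30. Which architectures succeed depends on the installed clang and CUDA toolkit versions, so the output is not shown here.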