Yes, you can specify multiple targets. The CUDA sample codes show how to do this in a Visual Studio project. The basic idea is to specify multiple -gencode switches (on the nvcc compile command line) via the VS project settings under project...CUDA...device (this can also be set on a source file-by-file basis). In Visual Studio, you just specify the switch parameters, like:
compute_20,sm_20;compute_30,sm_30;compute_35,sm_35;
and the Visual Studio CUDA-enabled build system will convert that to a sequence of -gencode switches like:
-gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 ...
which the nvcc compiler will recognize, generating separate device code for each of the targets specified. This is a fairly complicated subject, so you may want to read about the fatbinary system and the nvcc compilation flow in the nvcc manual, or study other questions about it on the cuda tag here on SO like this one.
Anticipating some of your other questions, which are also covered in the nvcc manual:
The CUDA runtime will select the best fit for the actual device, based on the available targets in your fatbinary. If a SASS binary compiled for that exact device exists, it will use that; otherwise it will take the closest compatible PTX object and JIT-compile it for the intended device.
The __CUDA_ARCH__ macro exists and is defined in device code. You could use it to specialize device code for various targets, which would give you a (somewhat tedious) mechanism to verify that the CUDA runtime did the expected thing in its selection of objects for use.
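As a rough illustration of that verification idea, here is a minimal sketch in which a kernel writes the value of __CUDA_ARCH__ it was compiled with into device memory, and the host prints it next to the device's actual compute capability. The names report_arch and compiled_arch_on_device are made up for this example, and it assumes device 0 is the GPU you care about.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: stores the architecture value that this particular
// copy of the device code was compiled for (e.g. 300 for compute_30).
__global__ void report_arch(int *out)
{
#ifdef __CUDA_ARCH__
    *out = __CUDA_ARCH__;
#else
    *out = 0;  // host compilation pass only; never executed on the device
#endif
}

// Hypothetical helper: launches the kernel and returns the reported value,
// or -1 if any CUDA call fails (for instance, no usable image in the fatbinary).
int compiled_arch_on_device()
{
    int *d_arch = nullptr;
    int h_arch = -1;

    if (cudaMalloc(&d_arch, sizeof(int)) != cudaSuccess) return -1;

    report_arch<<<1, 1>>>(d_arch);
    if (cudaGetLastError() != cudaSuccess) {  // launch failed
        cudaFree(d_arch);
        return -1;
    }

    cudaError_t err = cudaMemcpy(&h_arch, d_arch, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_arch);
    return (err == cudaSuccess) ? h_arch : -1;
}

int main()
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("no usable CUDA device found\n");
        return 1;
    }

    printf("device compute capability: %d.%d\n", prop.major, prop.minor);
    printf("__CUDA_ARCH__ seen by the kernel: %d\n", compiled_arch_on_device());
    return 0;
}
```

If you build this with several -gencode switches and run it on different GPUs, the second line should show which of your targets was actually picked. As far as I know, when a PTX object is JIT-compiled, the value reflects the compute_XX architecture the PTX was generated for, not the device it ends up running on.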