I'm setting up Autotools for a large scientific code written primarily in C++, but also some CUDA. I've found an example for compiling & linking CUDA code to C code within Autotools, but I cannot duplicate that success with C++ code. I've heard that this is much easier with CMake, but we're committed to Autotools, unfortunately.

In our old hand-written Makefile, we simply use a make rule to compile 'cuda_kernels.cu' into 'cuda_kernels.o' using nvcc, and add cuda_kernels.o to the list of objects to be compiled into the final binary. Nice, simple, and it works.

The basic strategy with Autotools, on the other hand, seems to be to use Libtool to compile the .cu files into a 'libcudafiles.la', and then link the rest of the code against that library. However, this fails upon linking, with a whole bunch of "undefined reference to ..." statements coming from the linker. This seems like it might be a name-mangling issue with g++ vs. the nvcc compiler (which would explain why it works with C code), but I'm not sure what to do at this point.

All .cpp and .cu files are in the top/src directory, and all the compilation is done in the top/obj directory. Here are the relevant details of obj/Makefile.am:

cuda_kernels.cu.o:
    $(NVCC) -gencode=arch=compute_20,code=sm_20 -o $@ -c $<

libcudafiles_la_LINK= $(LIBTOOL) --mode=link $(CXX) -o $@ $(CUDA_LDFLAGS) $(CUDA_LIBS)

noinst_LTLIBRARIES = libcudafiles.la
libcudafiles_la_SOURCES = ../src/cuda_kernels.cu

___bin_main_LDADD += libcudafiles.la
___bin_main_LDFLAGS += -static

For reference, the example which I managed to get working on our GPU cluster is available at clusterchimps.org.

Any help is appreciated!

Although it doesn't answer your question directly, do you have the option to use CMake? It is a lot easier to use. - Hans Hohenfeld

1 Answer

libtool in conjunction with automake currently generates foo.lo (libtool-object metadata) files, the non-PIC (static) object foo.o, and the PIC object .libs/foo.o.

For consistent .lo files, I'd use a rule like:

.cu.lo:
        $(LIBTOOL) --tag=CC --mode=compile $(NVCC) [options...] -c $<

I have no idea if, or how, PIC flags such as -fPIC are handled by nvcc. More options here. I don't know which calls you are making from the program, but are you forward-declaring the CUDA code with C linkage? E.g.,

extern "C" void cudamain (....);
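To make the C-linkage idea concrete, here is a minimal sketch. The names scale_on_device and scaled_copy are hypothetical and not taken from your code, and the loop body is only a stand-in so the sketch builds with a plain C++ compiler. In a real build the definition would live in the .cu file, be compiled by nvcc, and launch a kernel.

```cpp
#include <cassert>
#include <vector>

// Hypothetical entry point (assumption, not from the question's code).
// In a real build this declaration would sit in a header included by both
// the .cu file and the C++ callers, so both sides agree on the linkage.
extern "C" void scale_on_device(float* data, int n, float factor);

// Stand-in definition so this sketch builds without nvcc.
// In the real project this body would be in cuda_kernels.cu and would copy
// the data to the GPU and launch a kernel instead of looping on the host.
extern "C" void scale_on_device(float* data, int n, float factor) {
    assert(n >= 0);
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Ordinary C++ caller: it only needs the declaration above.
std::vector<float> scaled_copy(std::vector<float> values, float factor) {
    scale_on_device(values.data(), static_cast<int>(values.size()), factor);
    return values;
}
```

Because the function is declared extern "C" on both sides, the symbol in the object file is plain scale_on_device rather than a mangled C++ name. If that is your problem, the "undefined reference" errors should go away once the declaration and the definition agree.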

It seems others have run up against the libtool problem. At worst, you might need a 'script' solution that mimics the .lo syntax and file locations, as described on the clusterchimps site.