1
votes

The first command, taken straight from the manual, works. The second does not seem to recognize the .cc file as CUDA, even though I pass the -xcuda flag.

clang++ apxy.cu --cuda-gpu-arch=sm_61 -L/usr/local/cuda/lib64 -lcudart_static -ldl -lrt -pthread

clang++ apxy.cc -xcuda --cuda-gpu-arch=sm_61 -L/usr/local/cuda/lib64 -lcudart_static -ldl -lrt -pthread

apxy.cc:3:1: error: unknown type name '__global__'
__global__ void axpy(float a, float* x, float* y) {
^
apxy.cc:3:12: error: expected unqualified-id
__global__ void axpy(float a, float* x, float* y) {
           ^
apxy.cc:17:3: error: use of undeclared identifier 'cudaMalloc'
  cudaMalloc(&device_x, kDataLen * sizeof(float));

2 Answers

2
votes

My problem was that -xcuda needs to go before the filename: -x only applies to the input files that follow it on the command line. Here is an example using Bazel.

WORKSPACE

new_local_repository(
    name = "cuda",
    path = "/usr/local/cuda",
    build_file_content = """
cc_library(
    name = "cuda",
    hdrs = glob(["**/*.h", "**/*.hpp"]),
    includes = ["include/"],
    linkopts = ["-L/usr/local/cuda/lib64", "-lcudart_static", "-ldl", "-lrt", "-pthread"],
    visibility = ["//visibility:public"],
)
    """
)

BUILD

cc_binary(
    name = "example",
    srcs = ["example.cu.cc"],
    copts = ["-xcuda", "--cuda-gpu-arch=sm_61"],
    deps = ["@cuda"],
)
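
Applied to the plain command line from the question, the same fix would look roughly like the sketch below. It assumes the same setup as the question (CUDA installed under /usr/local/cuda, an sm_61 GPU, the source in apxy.cc); the only change is that -xcuda now comes before the file it should apply to.

```shell
# -x sets the language for the input files that FOLLOW it,
# so it has to appear before apxy.cc, not after.
# Paths and GPU arch are taken from the question; adjust for your machine.
clang++ -xcuda apxy.cc --cuda-gpu-arch=sm_61 \
    -L/usr/local/cuda/lib64 -lcudart_static -ldl -lrt -pthread
```

Bazel puts copts ahead of the source file on the compile line, which is why the BUILD rule above works without any extra ordering tricks.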
0
votes

It seems that using --std=cuda should do the trick. Caveat: I haven't tried this myself.