Operating System: CentOS 7 Cuda Toolkit Version: 11.0
Nvidia Driver and GPU Info:
NVIDIA-SMI 450.51.05
Driver Version: 450.51.05
CUDA Version: 11.0
GPU: Quadro M2000M
screenshot of nvidia-smi details
I'm very new to cuda programming so any guidance is extremely appreciated. I have a very simple cuda c++ program that computes the sum of two arrays in unified memory on the GPU. However, it appears that the kernel fails to launch due to a cudaErrorNoKernelImageForDevice error. The code is below:
using namespace std;
#include <iostream>
#include <math.h>
#include <cuda_runtime_api.h>
__global__
void add(int n, float *x, float*y){
for (int i = 0; i < n; i++)
y[i] = x[i] + y[i];
}
int main() {
cout << "!!!Hello World!!!" << endl; // prints !!!Hello World!!!
int N = 1<<20;
float *x, *y;
cudaMallocManaged((void**)&x, N*sizeof(float));
cudaMallocManaged((void**)&y, N*sizeof(float));
for(int i = 0; i < N; i++){
x[i] = 1.0f;
y[i] = 2.0f;
}
add<<<1, 1>>>(N, x, y);
cudaGetLastError();
/**
* This indicates that there is no kernel image available that is suitable
* for the device. This can occur when a user specifies code generation
* options for a particular CUDA source file that do not include the
* corresponding device configuration.
*
* cudaErrorNoKernelImageForDevice = 209,
*/
cudaDeviceSynchronize();
float maxError = 0.0f;
for (int i = 0; i < N; i++){
maxError = fmax(maxError, fabs(y[i]-3.0f));
}
cudaFree(x);
cudaFree(y);
return 0;
}
-arch=sm_50in your compile command. If you have something like-arch=sm_60that would explain why you are getting this failure. - Robert Crovellasm_52, so if you didn't provide any architecture switches on the command line, that would also cause this sort of problem. - Robert Crovella_52in the compile command line are incorrect for your GPU. The IDE has a selection when you are setting up the project (or in the project properties) to change the architecture you are compiling for. You want_50not_52. - Robert Crovella