3
votes

Whenever I call cudaMemPrefetchAsync() it returns the error code cudaErrorInvalidDevice. I am sure that I pass the right device id (I have only one CUDA-capable GPU in my laptop, with id == 0).

I believe that the code sample posted below is error-free, but at line 52 (the call to cudaMemPrefetchAsync()) I keep getting this error.


I tried:

  1. A clean driver installation (latest version).
  2. Searching Google for an answer, but I could not find one. (I only managed to find this.)

(I have no idea what else to try.)


System Spec:

OS: Microsoft Windows 8.1 x64 Home
IDE: Visual Studio 2015
CUDA toolkit: 8.0.61
NVIDIA GPU: GeForce GTX 960M
NVIDIA GPU driver: ver 381.65 (latest)
Compute Capability: 5.0 (Maxwell)
Unified Memory: supported
Intel integrated GPU: Intel HD Graphics 4600


Code Sample:

/////////////////////////////////////////////////////////////////////////////////////////////////////////
// TEST AREA:
// -- INCLUDE: 
/////////////////////////////////////////////////////////////////////////////////////////////////////////

// Cuda Libs: ( Device Side ):
#include <cuda_runtime.h>
#include <device_launch_parameters.h>

// Std C++ Libs:
#include <iostream>
#include <iomanip>
///////////





/////////////////////////////////////////////////////////////////////////////////////////////////////////
// TEST AREA:
// -- NAMESPACE:
/////////////////////////////////////////////////////////////////////////////////////////////////////////
using namespace std;
///////////





/////////////////////////////////////////////////////////////////////////////////////////////////////////
// TEST AREA:
// -- START POINT:
/////////////////////////////////////////////////////////////////////////////////////////////////////////
int main() {

    // Set cuda Device:
    if (cudaSetDevice(0) != cudaSuccess)
        cout << "ERROR: cudaSetDevice(***)" << endl;

    // Array:
    unsigned int size = 1000;
    double * d_ptr = nullptr;

    // Allocate unified memory:
    if (cudaMallocManaged(&d_ptr, size * sizeof(double), cudaMemAttachGlobal) != cudaSuccess)
        cout << "ERROR: cudaMallocManaged(***)" << endl;

    if (cudaDeviceSynchronize() != cudaSuccess)
        cout << "ERROR: cudaDeviceSynchronize(***)" << endl;

    // Prefetch:
    if(cudaMemPrefetchAsync(d_ptr, size * sizeof(double), 0) != cudaSuccess)
        cout << "ERROR: cudaMemPrefetchAsync(***)" << endl;

    // Exit:
    getchar();
}
///////////
1
The documentation says the device must have a non-zero cudaDevAttrConcurrentManagedAccess attribute. Have you checked that? - talonmies
Well, I have now, and it says that I don't have support for this: (cudaDeviceProp)devProp.concurrentManagedAccess == 0. But from what I understand, this is a feature that was introduced with compute capability == 6.0 (Pascal), while cudaMemPrefetchAsync(***) was first introduced for devices with compute capability == 3.0 (Kepler). Therefore I should still be able to use it even though I am only on 5.0 - PatrykB
Please add a short answer to your question - talonmies
This API call is for Pascal GPUs. You don't have a Pascal GPU, and the call would be redundant anyway because pre-Pascal UM will automatically migrate managed data to the GPU at kernel launch - Robert Crovella

1 Answer

3
votes

Thanks to talonmies I have realized that my GPU does not support the prefetch feature. In order to use cudaMemPrefetchAsync(***), the GPU must have a non-zero value in (cudaDeviceProp)deviceProp.concurrentManagedAccess.
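
For reference, here is a minimal sketch (not part of my original code, just my reading of the documentation) that queries the attribute with cudaDeviceGetAttribute() and only attempts the prefetch when the device reports concurrent managed access:

// Minimal sketch: guard cudaMemPrefetchAsync() behind the attribute check.
#include <cuda_runtime.h>
#include <iostream>

int main() {
    int device = 0;
    if (cudaSetDevice(device) != cudaSuccess)
        std::cout << "ERROR: cudaSetDevice(***)" << std::endl;

    // Query whether the device supports concurrent managed access
    // (non-zero on Pascal and newer):
    int concurrentManagedAccess = 0;
    if (cudaDeviceGetAttribute(&concurrentManagedAccess,
                               cudaDevAttrConcurrentManagedAccess,
                               device) != cudaSuccess)
        std::cout << "ERROR: cudaDeviceGetAttribute(***)" << std::endl;

    // Allocate unified memory:
    unsigned int size = 1000;
    double * d_ptr = nullptr;
    if (cudaMallocManaged(&d_ptr, size * sizeof(double), cudaMemAttachGlobal) != cudaSuccess)
        std::cout << "ERROR: cudaMallocManaged(***)" << std::endl;

    if (concurrentManagedAccess != 0) {
        // Prefetch is supported on this device:
        if (cudaMemPrefetchAsync(d_ptr, size * sizeof(double), device) != cudaSuccess)
            std::cout << "ERROR: cudaMemPrefetchAsync(***)" << std::endl;
    } else {
        // Pre-Pascal device: skip the prefetch; managed data is migrated
        // to the GPU automatically at kernel launch anyway.
        std::cout << "Prefetch not supported on this device, skipping." << std::endl;
    }

    cudaFree(d_ptr);
    return 0;
}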

See more here.