4
votes

Is there a way to directly copy previously allocated CUDA device data into an OpenCV GPU Mat? I would like to copy data that was previously allocated and filled by CUDA into an OpenCV GPU Mat, because I want to solve a linear system of equations Ax = B by computing the inverse of the matrix A using OpenCV.

What I want to do is something like this:

float *dPtr;
gpuErrchk( cudaMalloc((void**) &dPtr, sizeof(float) * height * width) );
gpuErrchk( cudaMemset(dPtr, 0, sizeof(float) * height * width) );

// modify dPtr in some way on the GPU
modify_dPtr();

// copy the previously allocated and modified dPtr into an OpenCV GPU mat?

// process the GPU mat later - e.g. do a matrix inversion operation

// extract the raw pointer from the GPU mat

EDIT: The OpenCV documentation provides a GPU upload function.

Can the device pointer simply be passed to that function as a parameter? If not, is there any other way to do such a data transfer? I don't want to copy the data back to host memory, do my computation on a normal OpenCV Mat container, and copy the results back to the device; my application is real-time. I am assuming that, since there is no .at() function for a GPU Mat as there is for the normal OpenCV Mat, there is no way to access the element at a particular location in the matrix. Also, does an explicit matrix-inversion operation exist for the GPU Mat? The documentation does not provide a GPU Mat inv() function.
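For reference, upload() takes a host-side cv::Mat, so it performs a host-to-device copy rather than wrapping an existing device pointer. A minimal sketch, assuming the OpenCV 2.x cv::gpu API that the question refers to:

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/gpu/gpu.hpp>

int main() {
    const int height = 4, width = 4;

    // upload() copies a *host* cv::Mat into device memory, so using it
    // here would first require downloading dPtr to the host - exactly
    // the round-trip the question is trying to avoid.
    cv::Mat hostA = cv::Mat::zeros(height, width, CV_32F);
    cv::gpu::GpuMat deviceA;
    deviceA.upload(hostA);   // host -> device copy
    return 0;
}
```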

1
@Downvoter: Why the downvote? I did my research in terms of asking the question, and my question is a valid question because my CUDA algorithm runs on the GPU. I want to interface the GPU module of OpenCV with my code; that way, I don't have to waste time copying to and fro from device to host memory. There is no clear cut interface to copy previously allocated CUDA device memory to an OpenCV GPU Mat container, which is in contrast to the general ease of use of the OpenCV Mat container with CUDA. I posted the question because other knowledgeable people might know an answer to my problem. - Eagle
Whether OpenCV can do it or not (and I guess it can't), solving a set of linear equations by explicitly calculating the inverse is almost always the wrong thing to do. - talonmies
stackoverflow.com/questions/25512354/… I think that goes the other way around openCV->CUDA but it may be a place to start. - Christian Sarofeen
@talonmies I agree, but he didn't ask for a good linear solver method. Just how to implement an inefficient one. - Christian Sarofeen
No I mean instantiate a GpuMat with your pointer. Look in the header definition and you will find a constructor for that, I believe - talonmies

1 Answer

8
votes

Just as talonmies pointed out in the comments, the GpuMat header declares a constructor that creates a GpuMat header pointing at previously allocated CUDA device data, without copying it. This is what I used:

cv::gpu::GpuMat dst(height, width, CV_32F, dPtr);

There is no need to work out the step size by hand: when it is omitted, the constructor derives it from the width and the element type, assuming the rows are stored contiguously. Hopefully this post will be useful to someone as support for OpenCV's GPU functions improves.
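A minimal end-to-end sketch of this zero-copy wrapping, assuming the OpenCV 2.x cv::gpu API (the kernel launch is elided; error checking omitted for brevity):

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <cuda_runtime.h>

int main() {
    const int height = 4, width = 4;
    float *dPtr = 0;
    cudaMalloc((void**)&dPtr, sizeof(float) * height * width);
    cudaMemset(dPtr, 0, sizeof(float) * height * width);

    // ... launch kernels that fill dPtr here ...

    // Wrap the existing device buffer - no copy is made. dst and dPtr
    // alias the same device memory, and a GpuMat constructed over user
    // data does not take ownership, so it will not free the buffer.
    cv::gpu::GpuMat dst(height, width, CV_32F, dPtr);

    // GpuMat operations now see the kernel's output directly, and
    // dst.ptr<float>() yields the raw device pointer back.

    cudaFree(dPtr);   // free only after dst is no longer used
    return 0;
}
```

Because the GpuMat is only a header over the buffer, the allocation must outlive every GpuMat that wraps it.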

EDIT

Another (probably) useful approach is to use CUDA unified (managed) memory: the same allocation can then back both an OpenCV CPU Mat and a GPU Mat, and you can continue operations from there.
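A sketch of the unified-memory idea, assuming a device that supports managed memory and the same OpenCV 2.x cv::gpu API (the pointer name mPtr is illustrative):

```cpp
#include <opencv2/core/core.hpp>
#include <opencv2/gpu/gpu.hpp>
#include <cuda_runtime.h>

int main() {
    const int height = 4, width = 4;
    float *mPtr = 0;
    // Managed memory is accessible from both host and device code.
    cudaMallocManaged(&mPtr, sizeof(float) * height * width);

    // The same managed pointer can back a host-side header...
    cv::Mat hostView(height, width, CV_32F, mPtr);
    // ...and a device-side header; neither header copies the data.
    cv::gpu::GpuMat deviceView(height, width, CV_32F, mPtr);

    // After a kernel writes through deviceView, synchronize before
    // reading the data from the host through hostView.
    cudaDeviceSynchronize();

    cudaFree(mPtr);
    return 0;
}
```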