I found that calling cv::gpu::pyrDown and cv::gpu::pyrUp leaves my DownUp full of zeros for some odd reason. When I do the same thing on the CPU, the results are perfectly fine.
NOTE: I'm using OpenCV4Tegra on the Jetson TK1, if that matters at all.
for (int i = 0; i < Pyramid_Size; i++) {
    cv::gpu::pyrDown(DownUp, DownUp);
}
for (int i = 0; i < Pyramid_Size; i++) {
    cv::gpu::pyrUp(DownUp, DownUp);
}
Does anyone know why this might be?
Edit:
DownUp.upload(Input);
GpuMat buffer;
DownUp.copyTo(buffer);
for (int i = 0; i < Pyramid_Size; i++, DownUp.copyTo(buffer)) {
    cv::gpu::pyrDown(buffer, DownUp);
}
for (int i = 0; i < Pyramid_Size; i++, DownUp.copyTo(buffer)) {
    cv::gpu::pyrUp(buffer, DownUp);
    GpuMat a = GpuMat(DownUp.size(), CV_32F);
    a.setTo(20.0f);
    cv::gpu::add(DownUp, a, DownUp);
}
This now works in my code, but it is SIGNIFICANTLY slower than the CPU version. The GPU version takes around 1.6-2 seconds in total to run, while the CPU version takes 0.1 seconds.
I also noticed that sending the data from host to device takes a lot longer than simply processing it on the CPU. Is there any way in OpenCV to speed this up? I must be doing something wrong: even large 5 MP images are faster to downsample and upsample on the CPU.
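To narrow down where the time goes, here is a minimal sketch of how each stage (upload, pyramid loops, download) could be timed on its own. The helper name meanStageMs is hypothetical and not part of OpenCV. It assumes the GPU calls block until their work is finished, and it runs one untimed warm-up call, because I suspect the first GPU call also pays one-time setup costs.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not part of OpenCV): runs `stage` once untimed as a
// warm-up, then returns the mean wall-clock time in milliseconds over `runs`
// timed calls.
double meanStageMs(const std::function<void()>& stage, int runs)
{
    assert(runs > 0);

    // Warm-up call: absorbs one-time costs (assumed here to include things
    // like GPU context setup and first-time buffer allocation).
    stage();

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < runs; ++i) {
        stage();
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / runs;
}
```

With this, something like meanStageMs([&]() { DownUp.upload(Input); }, 10) would give the upload cost alone, and a second call wrapping the two pyramid loops would give the processing cost, so the 1.6-2 second total can be split into its parts.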