I'm writing CUDA C code to process images. For example, I wrote a swap function that swaps blocks of the matrix, but it does not always work. I think the problem is with the number of blocks and the number of threads I use when I launch my kernel.
For example, if I take an image of size 2048*2048 with
threadsPerBlock.x=threadsPerBlock.y=64 and numBlocks.x=numBlocks.y=2048/threadsPerBlock.x
then swap<<<threadsPerBlock,numBlocks>>>(...) works fine.
But if I take an image of size 2560*2160 with threadsPerBlock.x=threadsPerBlock.y=64, numBlocks.x=2560/64 and numBlocks.y=2160/64+1, I get error 9, which is "invalid configuration argument".
I'm using CUDA 7.5 and a GPU with compute capability 5.0.
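
To make the numbers above concrete, here is a small host-side sketch in plain C (no GPU needed) that works out the dimensions for both cases. Dim2, div_up, threads_in_block, block_fits and describe_launch are hypothetical helpers introduced only for this illustration; they are not part of my real code or of the CUDA API. The sketch relies on two facts: the launch syntax is kernel<<<grid, block>>>, so the second argument is the block size, and a device of compute capability 5.0 allows at most 1024 threads per block.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for CUDA's dim3 (x and y only), for host-side arithmetic. */
typedef struct {
    unsigned x;
    unsigned y;
} Dim2;

/* Maximum number of threads per block on compute capability 5.0. */
#define MAX_THREADS_PER_BLOCK 1024u

/* Ceiling division: how many tiles of size `tile` are needed to cover `n`. */
unsigned div_up(unsigned n, unsigned tile)
{
    assert(tile > 0);
    return (n + tile - 1) / tile;
}

/* Total number of threads in one block of the given shape. */
unsigned threads_in_block(Dim2 block)
{
    return block.x * block.y;
}

/* Returns 1 if the block shape respects the per-block thread limit, else 0. */
int block_fits(Dim2 block)
{
    return threads_in_block(block) <= MAX_THREADS_PER_BLOCK;
}

/* Prints what kernel<<<grid, block>>> would request for the given shapes. */
void describe_launch(const char *label, Dim2 grid, Dim2 block)
{
    printf("%s: grid %ux%u, block %ux%u = %u threads per block (%s)\n",
           label, grid.x, grid.y, block.x, block.y,
           threads_in_block(block),
           block_fits(block) ? "within limit" : "over limit");
}
```

With the launch written as swap<<<threadsPerBlock,numBlocks>>>, the block size is numBlocks: 32*32 = 1024 threads in the first case, which fits, and 40*34 = 1360 threads in the second, which does not. A 64*64 block would be 4096 threads and would not fit either, so this may be where the invalid configuration comes from. div_up is only there because 2160/64+1 happens to give the right count (34), while the same formula would give one block too many for a size that is an exact multiple of 64.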