I'm looking at the CUDA SDK convolution with separable kernels, and I have a simple question but can't find an answer:
Do the two vectors whose convolution gives the kernel need to have the same size? Can I first perform a row convolution with a 1x3 vector and then a column convolution with a 5x1 vector, or do both need to be the same size? Google isn't helping (or I'm not searching for the right terms).
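
To make the question concrete, here is a minimal CPU sketch in plain Python (not the SDK code) of the two passes I have in mind. The function names, the zero padding at the borders, and the sample values are my own assumptions for illustration. It runs a 1x3 row pass followed by a 5x1 column pass and compares the result against a direct 2D convolution with the 5x3 outer-product kernel.

```python
def convolve_rows(image, row_kernel):
    """Convolve each row with a 1xN kernel (N odd), zero padding at the borders."""
    radius = len(row_kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(row_kernel):
                xx = x - (k - radius)  # flipped index: true convolution
                if 0 <= xx < width:
                    acc += weight * image[y][xx]
            out[y][x] = acc
    return out


def convolve_cols(image, col_kernel):
    """Convolve each column with an Mx1 kernel (M odd), zero padding at the borders."""
    radius = len(col_kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, weight in enumerate(col_kernel):
                yy = y - (k - radius)
                if 0 <= yy < height:
                    acc += weight * image[yy][x]
            out[y][x] = acc
    return out


def outer(col_kernel, row_kernel):
    """Build the MxN 2D kernel as the outer product of the two vectors."""
    return [[c * r for r in row_kernel] for c in col_kernel]


def convolve_2d(image, kernel):
    """Direct 2D convolution with an MxN kernel (M, N odd), zero padding."""
    ry, rx = len(kernel) // 2, len(kernel[0]) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for i, kernel_row in enumerate(kernel):
                yy = y - (i - ry)
                if not 0 <= yy < height:
                    continue
                for j, weight in enumerate(kernel_row):
                    xx = x - (j - rx)
                    if 0 <= xx < width:
                        acc += weight * image[yy][xx]
            out[y][x] = acc
    return out


# Hypothetical sample data: a 6x7 image, a 1x3 row vector, a 5x1 column vector.
image = [[float((3 * y + 5 * x) % 7) for x in range(7)] for y in range(6)]
row_kernel = [1.0, 2.0, 1.0]
col_kernel = [1.0, 4.0, 6.0, 4.0, 1.0]

separable = convolve_cols(convolve_rows(image, row_kernel), col_kernel)
direct = convolve_2d(image, outer(col_kernel, row_kernel))

# Largest absolute difference between the two results.
max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(separable, direct)
    for a, b in zip(row_a, row_b)
)
print("max difference:", max_diff)
```

Mathematically the two results should agree, since the outer product of a 5x1 and a 1x3 vector is just a 5x3 kernel. What I can't tell is whether the SDK sample assumes the same radius for both passes.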