FYI, I have the 64-bit version of Python 2.7, and I followed the PyCUDA installation instructions to install PyCUDA.
I have no problem running the following script:
import pycuda.driver as cuda
import pycuda.autoinit
from pycuda.compiler import SourceModule
import numpy
a = numpy.random.randn(4,4)
a = a.astype(numpy.float32)
a_gpu = cuda.mem_alloc(a.nbytes)
cuda.memcpy_htod(a_gpu,a)
But after that, when executing this statement,
mod = SourceModule("""
__global__ void doublify(float *a)
{
    int idx = threadIdx.x + threadIdx.y * 4;
    a[idx] *= 2;
}
""")
I got this error message:
CompileError: nvcc compilation of c:\users\xxxx\appdata\local\temp\tmpaoxt97\kernel.cu failed [command: nvcc --cubin -arch sm_21 -m64 -Ic:\python27\lib\site-packages\pycuda\cuda kernel.cu] [stderr: nvcc : fatal error : nvcc cannot find a supported version of Microsoft Visual Studio. Only the versions 2008, 2010, and 2012 are supported
But I have VS 2008 and VS 2010 installed on the machine, and I set the PATH and the nvcc profile as instructed. Can anybody tell me what's going on?
UPDATE1: As cgohike pointed out, running the following statements before the problematic statement solves the problem:
import os
os.system("vcvarsamd64.bat")
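A batch file started with os.system runs in a child shell, so variables it sets are not normally copied back into the Python process. If the two lines above are not enough on your setup, one possible variation is to run the batch file, dump the resulting environment with `set`, and copy it into os.environ before calling SourceModule. This is only a sketch: the helper names parse_set_output and load_vcvars are made up for illustration, and it assumes vcvarsamd64.bat is on the PATH (otherwise pass its full path).

```python
import os
import subprocess


def parse_set_output(text):
    """Turn the NAME=value lines printed by the cmd `set` command into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip lines without '=' and entries with an empty name.
        if sep and name:
            env[name] = value
    return env


def load_vcvars(batch_file="vcvarsamd64.bat"):
    """Run the batch file in cmd and copy the variables it sets into os.environ.

    Hypothetical helper: batch_file is assumed to be on PATH or a full path.
    Windows only, since it relies on cmd's `call` and `set`.
    """
    command = 'call "{0}" >nul && set'.format(batch_file)
    output = subprocess.check_output(command, shell=True, universal_newlines=True)
    env = parse_set_output(output)
    os.environ.update(env)
    return env
```

With this in place, calling load_vcvars() once before the SourceModule(...) statement should make the Visual Studio variables visible to the nvcc process that PyCUDA launches, assuming the batch file itself runs without errors.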