I am running a Monte Carlo simulation using Thrust on an Nvidia card with compute capability 2.1. If I try to transform_reduce the whole device_vector at once, I get the following error. It's not a matter of running out of memory on the device, because the vectors aren't that big (~1-10 MB). I know my code is right because it works if I compile with OpenMP and run on the host only. What could be causing this problem?
Unhandled exception at 0x776e15de in mccva.exe: Microsoft C++ exception: thrust::system::system_error at memory location 0x0014cb28.
But if I do the transform_reduce in chunks, it works fine until I scale up the number of timesteps in the simulation, at which point it gives the same error.
//run the Monte Carlo simulation
zpath * norm_ptr = thrust::raw_pointer_cast(&z[0]);
cout << "initialized raw pointer" << endl;
thrust::device_vector<ctrparty> devctrp = ctrp;
assert(devctrp.size()==ctrp.size());
cout << "Initialized device vector" << endl;
cout << "copied host vec to device vec" << endl;
float cva = 0;
for(unsigned int i=0; i<5; i++)
{
if(i<4)
cva += (1-R) * thrust::transform_reduce(devctrp.begin()+i*2000, devctrp.begin() + (i+1)*2000 - 1, calc(norm_ptr, dt, r, sims, N), 0.0f, sum());
else
cva += (1-R) * thrust::transform_reduce(devctrp.begin()+i*2000, devctrp.begin() + (i+1)*2000, calc(norm_ptr, dt, r, sims, N), 0.0f, sum());
}
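Note that Thrust algorithms take half-open iterator ranges, so the `- 1` in the first branch leaves the last element of each of the first four chunks out of the sum. A minimal sketch of computing chunk bounds that cover every element exactly once follows. `chunk_bounds` is a hypothetical helper, not part of Thrust, and the total of 10000 elements is only inferred from the 5 x 2000 loop above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: splits [0, n) into consecutive half-open ranges
// [first, last), each holding at most `chunk` elements. The final range is
// shortened if n is not a multiple of chunk.
std::vector<std::pair<std::size_t, std::size_t>>
chunk_bounds(std::size_t n, std::size_t chunk)
{
    assert(chunk > 0);  // a zero chunk size would never advance
    std::vector<std::pair<std::size_t, std::size_t>> bounds;
    for (std::size_t first = 0; first < n; first += chunk) {
        bounds.emplace_back(first, std::min(first + chunk, n));
    }
    return bounds;
}
```

Each pair would then map onto `devctrp.begin() + first` and `devctrp.begin() + last` in the transform_reduce call, with no special case for the last chunk.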
I get the error when I try this:
float cva = 0.0f;
try
{
cva = thrust::transform_reduce(devctrp.begin(), devctrp.end(), calc(norm_ptr, dt, r, sims, N), 0.0f, sum()); //get the simulated CVA
}
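catch(const thrust::system_error &e) //catch by const reference to avoid copying/slicing
{
printf("%s\n", e.what()); //never pass the message as the format string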
}
I'm using VS2010, and when it breaks on the error, it points to the following in the dbgheap.c file.
__finally {
/* unlock the heap
*/
_munlock(_HEAP_LOCK);
}
Can you show the definitions of calc() and sum()? One of those may be the issue. You could try doing just a thrust::transform with calc and just a thrust::reduce with sum() to see if you can narrow down the source of the error. For instance, norm_ptr points to the device array z. I don't know how calc uses it exactly, but if it is indexing through z in some fashion, then perhaps when you increase the length of the transform, you're running into trouble there. It's just speculation, but it would help to see a more complete description of what you are doing in the transform. – Robert Crovella
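One way to check the indexing theory is on the host, where the code already runs: split the work into a separate transform stage and reduce stage, and make every read of z bounds-checked. An out-of-range read can go unnoticed in an ordinary host build, so the sketch below uses `std::vector::at`, which throws instead. It is plain C++ rather than Thrust. `checked_calc` is a hypothetical stand-in, since the real calc is not shown, and the "one contiguous strip of draws per element" indexing is purely an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for calc: element `item` is ASSUMED to consume a
// contiguous strip of `draws_per_item` values from z.
struct checked_calc {
    const std::vector<float>* z;   // host copy of the random draws
    std::size_t draws_per_item;    // assumed draws consumed per element

    float operator()(std::size_t item) const {
        float acc = 0.0f;
        for (std::size_t k = 0; k < draws_per_item; ++k) {
            // at() throws std::out_of_range rather than reading past the end
            acc += z->at(item * draws_per_item + k);
        }
        return acc;
    }
};

// Stage 1 only: run the transform and keep the results for inspection.
std::vector<float> transform_only(const std::vector<float>& z,
                                  std::size_t items,
                                  std::size_t draws_per_item)
{
    assert(draws_per_item > 0);
    checked_calc f{&z, draws_per_item};
    std::vector<float> out(items);
    for (std::size_t i = 0; i < items; ++i) {
        out[i] = f(i);
    }
    return out;
}

// Stage 2 only: plain sum of the already-transformed values.
float reduce_only(const std::vector<float>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0f);
}

// Reports whether the transform reads outside z for this problem size.
bool transform_overruns(const std::vector<float>& z,
                        std::size_t items,
                        std::size_t draws_per_item)
{
    try {
        transform_only(z, items, draws_per_item);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

If `transform_overruns` returns true once the number of timesteps is scaled up, the functor's indexing is the likely culprit. If it stays false, the problem is more likely elsewhere.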