I am new to CUDA and parallelism and am trying to get some understanding. You should be familiar with the standard SumArrayOnGPU example:
__global__ void
vectorAdd(const float *A, const float *B, float *C, int numElements)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < numElements)
    {
        C[i] = A[i] + B[i];
    }
}
What I am looking to do is continually increment a number, run each incremented number through an arbitrary mesh, and only keep the incremented number that returns a specific answer.
So the parallelism would be running the same routine on each incremented number.
- Is something like this possible?
- What would be the proper logic?
- Does it still need to be placed into an array setup?
Thought example:
__global__ void
vectorfind(const long long unsigned *nbr, int numElements)
{
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < numElements)
    {
        /* each thread tests one candidate: base number + thread index */
        long long unsigned tnbr = *nbr + i;
        tnbr = hash(tnbr);  /* hash() would be some __device__ function */
        if (tnbr == arbitraryresult)
        {
            printf("found");
        }
    }
}