As the title says: in CUDA programs, where do kernel parameters reside after kernel launch, in the GPU's local memory or in its global memory?
For example, in the LLVM IR of a CUDA program:
define void @kernel(i32 %param1) {   ; __global__ void kernel(int param1)
entry:
  %0 = alloca i32
  store i32 %param1, ptr %0
  ret void
}
In this case, where does %0 point: local memory or global memory?
Also, I have seen that kernel parameters are sometimes held in registers and used directly from there, instead of being stored in memory. How is this decision made?