I'm creating a PyTorch C++ extension and after much research I can't figure out how to index a tensor and update its values. I found out how to iterate over a tensor's entries using the data_ptr() method, but that's not applicable to my use case.
Given is a matrix M, a list of lists (blocks) of index pairs P and a function f: dtype(M)^2 -> dtype(M)^2 that takes two values and spits out two new values.
I'm trying to implement the following pseudo code:
for each block B in P:
for each row R in M:
for each index-pair (i,j) in B:
M[R,i], M[R,j] = f(M[R,i], M[R,j])
After all, this code is going to run on the GPU using CUDA, but since I don't have any experience with that, I wanted to first write a pure C++ program and then convert it.
Can anyone suggest how to do this or how to convert the algorithm to do something equivalent?