8
votes

Thanks everyone in advance for your help! I'm trying to do something in PyTorch like NumPy's setdiff1d. For example, given the two tensors below:

t1 = torch.tensor([1, 9, 12, 5, 24]).to('cuda:0')
t2 = torch.tensor([1, 24]).to('cuda:0')

The expected output should be (sorted or unsorted):

torch.tensor([9, 12, 5])

Ideally, the operations run entirely on the GPU, with no back-and-forth between GPU and CPU. Much appreciated!

3
You can use numpy operations directly on torch tensors without a copy: torch.from_numpy(np.setdiff1d(t1.numpy(),t2.numpy())) - romeric
Thank you very much @romeric, and my apologies that my question was not clearly phrased. I was hoping to use CUDA tensors for this and keep the operations on the GPU only, whereas converting to an ndarray requires the tensors to be sent back to the CPU first. - Shiki.E

3 Answers

3
votes

If you don't want to leave CUDA, a workaround could be:

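import torch

t1 = torch.tensor([1, 9, 12, 5, 24], device='cuda')
t2 = torch.tensor([1, 24], device='cuda')
# Boolean mask over t1 (uint8 masks are deprecated for indexing)
mask = torch.ones_like(t1, dtype=torch.bool)
for elem in t2:
    # Drop every position of t1 that equals this element of t2
    mask = mask & (t1 != elem)
difference = t1[mask]  # elements of t1 that are not in t2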
9
votes

I came across the same problem but the proposed solutions were far too slow when using larger arrays. The following simple solution works on CPU and GPU and is significantly faster than the other proposed solutions:

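# Assumes neither t1 nor t2 contains duplicates
combined = torch.cat((t1, t2))
uniques, counts = combined.unique(return_counts=True)
# Values seen once are in exactly one tensor (symmetric difference);
# this equals t1 minus t2 only when every element of t2 is also in t1
difference = uniques[counts == 1]
# Values seen more than once are in both tensors
intersection = uniques[counts > 1]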
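The counting trick above can give surprising results when t1 contains duplicates or t2 contains values that are not in t1. One possible variation, sketched here under the assumption that a sorted, de-duplicated result is acceptable, is to de-duplicate both tensors first and append t2 twice. The helper name `setdiff1d_counts` is made up for illustration and is not part of PyTorch.

```python
import torch


def setdiff1d_counts(t1: torch.Tensor, t2: torch.Tensor) -> torch.Tensor:
    """Sorted unique values of t1 that do not appear in t2 (hypothetical helper)."""
    t1_unique = t1.unique()
    t2_unique = t2.unique()
    # t2's values are appended twice, so anything present in t2 has count >= 2;
    # only values exclusive to t1 end up with a count of exactly 1.
    combined = torch.cat((t1_unique, t2_unique, t2_unique))
    uniques, counts = combined.unique(return_counts=True)
    return uniques[counts == 1]


# Falls back to the CPU when no GPU is available, so the sketch runs anywhere
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t1 = torch.tensor([1, 9, 12, 5, 24, 9], device=device)  # contains a duplicate
t2 = torch.tensor([1, 24, 7], device=device)            # 7 is not in t1
result = setdiff1d_counts(t1, t2)
print(result)
```

With these inputs the result holds the values 5, 9 and 12. Everything stays on the device the inputs live on, at the cost of two extra `unique` calls.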
2
votes

If you don't want a for loop, this compares all values in one go.

You can also get the non-intersection (the elements of t1 that are not in t2) just as easily.

t1 = torch.tensor([1, 9, 12, 5, 24])
t2 = torch.tensor([1, 24])

# Create a tensor to compare all values at once
compareview = t2.repeat(t1.shape[0],1).T

# Intersection
print(t1[(compareview == t1).T.sum(1)==1])
# Non Intersection
print(t1[(compareview != t1).T.prod(1)==1])

Output:

tensor([ 1, 24])
tensor([ 9, 12,  5])
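The same all-at-once comparison can probably be written without `repeat` by letting broadcasting build the comparison matrix. The following is a minimal sketch, not a benchmarked implementation; the helper name `setdiff1d_mask` is hypothetical. Unlike the sum/prod version, it uses `any`, so duplicates in t2 should not affect the result.

```python
import torch


def setdiff1d_mask(t1: torch.Tensor, t2: torch.Tensor) -> torch.Tensor:
    """Elements of t1 not in t2, keeping t1's order and duplicates (hypothetical helper)."""
    # (len(t1), 1) == (1, len(t2)) broadcasts to a (len(t1), len(t2)) boolean matrix
    matches = t1.unsqueeze(1) == t2.unsqueeze(0)
    # A row containing any True means that element of t1 occurs somewhere in t2
    in_t2 = matches.any(dim=1)
    return t1[~in_t2]


# Falls back to the CPU when no GPU is available, so the sketch runs anywhere
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
t1 = torch.tensor([1, 9, 12, 5, 24], device=device)
t2 = torch.tensor([1, 24, 24], device=device)  # duplicate in t2 is harmless here
result = setdiff1d_mask(t1, t2)
print(result)
```

With these inputs the result holds 9, 12 and 5, in t1's original order. The comparison matrix needs memory proportional to len(t1) * len(t2), so this suits small to medium tensors. If your PyTorch version provides `torch.isin`, `t1[~torch.isin(t1, t2)]` should give the same result without building the full matrix yourself.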