How to map element in pytorch tensor to id?

Question

Given a tensor:

A = torch.tensor([2., 3., 4., 5., 6., 7.])

Then, give each element in A an id:

id = torch.arange(A.shape[0], dtype = torch.int)   # tensor([0,1,2,3,4,5])

In other words, id of 2. in A is 0 and id of 3. in A is 1:

2. -> 0
3. -> 1
4. -> 2
5. -> 3
6. -> 4
7. -> 5

Then, I have a new tensor:

B = torch.tensor([3., 6., 6., 5., 4., 4., 4.])

Is there any way in PyTorch to map each element in B to its id? In other words, I want to obtain tensor([1, 4, 4, 3, 2, 2, 2]), in which each element is the id of the corresponding element in B.

Ivan Ivan · Accepted Answer · 2021-01-04T18:04:55

I don't think there is such a function in PyTorch to map a tensor.

It seems quite unreasonable to solve this by comparing each value from B against every value from A.
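For reference, here is a minimal sketch of what that comparison-based approach would look like. It builds a len(B) x len(A) boolean matrix, so its cost grows with the product of the two sizes. The names matches and ids are mine, and the sketch assumes every value in B occurs exactly once in A.

```python
import torch

A = torch.tensor([2., 3., 4., 5., 6., 7.])
B = torch.tensor([3., 6., 6., 5., 4., 4., 4.])

# Broadcast B as (7, 1) against A as (1, 6) to get a (7, 6) boolean match matrix.
matches = B[:, None] == A[None, :]

# Assumption: each row contains exactly one True, so argmax over the row
# returns the matching column index, which is the id.
ids = matches.long().argmax(dim=1)
print(ids)  # tensor([1, 4, 4, 3, 2, 2, 2])
```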

Here are two possible solutions to solve this problem.

Using a dictionary as a map

You can use a dictionary. It is not really a pure-PyTorch solution, but it will most probably be the fastest and safest way...

Just create a dict to map each element to an id, then use it to map B:

>>> mapping = {x.item(): i for i, x in enumerate(A)}  # avoid shadowing the built-in map

>>> torch.tensor([mapping[x.item()] for x in B])
tensor([1, 4, 4, 3, 2, 2, 2])
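As a self-contained version of the same idea, the lookup can be wrapped in a small function. The name map_to_ids is hypothetical, not something PyTorch provides. Note that dictionary keys match on exact equality, so this sketch assumes the values in B are exactly equal to values in A.

```python
import torch

def map_to_ids(A, B):
    """Return, for each element of B, its position in A (hypothetical helper)."""
    # Build the value -> id lookup once; .item() turns a 0-d tensor into a Python number.
    lookup = {x.item(): i for i, x in enumerate(A)}
    # A value of B that does not occur in A raises KeyError here.
    return torch.tensor([lookup[x.item()] for x in B])

A = torch.tensor([2., 3., 4., 5., 6., 7.])
B = torch.tensor([3., 6., 6., 5., 4., 4., 4.])
print(map_to_ids(A, B))  # tensor([1, 4, 4, 3, 2, 2, 2])
```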

Change of basis approach

An alternative that uses only torch.Tensors. It requires the values you want to map (the content of A) to be non-negative integers, because they are used to index a tensor.

Encode the content of A into one-hot encodings:

>>> A_enc = torch.zeros((int(A.max())+1,)*2)
>>> A_enc[A.long(), torch.arange(A.shape[0])] = 1  # indices must be integer (Long) tensors

>>> A_enc
tensor([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0.]])

We'll use A_enc as our basis to map integers:

>>> v = torch.argmax(A_enc, dim=1)  # one entry per row (value): v[x] is the id of value x
>>> v
tensor([0, 0, 0, 1, 2, 3, 4, 5])

Now, given an integer, for instance x = 3, we can encode it as a one-hot vector: x_enc = [0, 0, 0, 1, 0, 0, 0, 0]. Then use v to map it: the dot product <v, x_enc> gives 1, which is the desired result (the first element of the mapped B). Instead of encoding one value at a time, we will compute the matrix multiplication between v and the encoded B. First encode B, then compute the matrix multiplication v @ B_enc:

>>> B_enc = torch.zeros(A_enc.shape[0], B.shape[0])
>>> B_enc[B.long(), torch.arange(B.shape[0])] = 1  # indices must be integer (Long) tensors

>>> B_enc
tensor([[0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 1., 1.],
        [0., 0., 0., 1., 0., 0., 0.],
        [0., 1., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.]])

>>> v@B_enc.long()
tensor([1, 4, 4, 3, 2, 2, 2])

Note - tensors used as indices must be of integer (Long) type, so either define A and B as Long tensors or cast them with .long() before indexing.
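Putting the steps together, here is a minimal end-to-end sketch of the change of basis approach. It assumes A and B hold non-negative integers stored as Long tensors, and that every value in B occurs in A. The names n and x_enc are mine. It checks both the single-value dot product for x = 3 and the full matrix product.

```python
import torch

# Assumption: values are non-negative integers, stored as Long so they can index a tensor.
A = torch.tensor([2, 3, 4, 5, 6, 7])
B = torch.tensor([3, 6, 6, 5, 4, 4, 4])

n = int(A.max()) + 1  # one row per possible value 0..max(A)

# Row = value, column = id: A_enc[A[i], i] = 1.
A_enc = torch.zeros(n, n, dtype=torch.long)
A_enc[A, torch.arange(A.shape[0])] = 1

# v[x] is the id of value x (0 for values that do not occur in A).
v = torch.argmax(A_enc, dim=1)

# Single value: one-hot encode x = 3 and take the dot product with v.
x_enc = torch.zeros(n, dtype=torch.long)
x_enc[3] = 1
print(torch.dot(v, x_enc))  # tensor(1)

# Whole tensor: one one-hot column per element of B, then a single product.
B_enc = torch.zeros(n, B.shape[0], dtype=torch.long)
B_enc[B, torch.arange(B.shape[0])] = 1
print(v @ B_enc)  # tensor([1, 4, 4, 3, 2, 2, 2])
```

Since v maps a value that is absent from A to 0, this sketch cannot tell a missing value apart from the value whose id is 0.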
