I need to convert one-hot encoding to categories represented by unique integers. So one-hot encoding created with the following code:
from sklearn.preprocessing import OneHotEncoder
enc = OneHotEncoder()
labels = [[1],[2],[3]]
enc.fit(labels)
for x in [1,2,3]:
print(enc.transform([[x]]).toarray())
Out:
[[ 1. 0. 0.]]
[[ 0. 1. 0.]]
[[ 0. 0. 1.]]
Could be converted back to a set of unique integers, for example:
[1,2,3] or [11,37, 45] or any other where each integer uniquely represents a single class.
Is it possible to do with scikit-learn or any other python lib?
* Update *
Tried to:
labels = [[1],[2],[3], [4], [5],[6],[7]]
enc.fit(labels)
lst = []
for x in [1,2,3,4,5,6,7]:
lst.append(enc.transform([[x]]).toarray())
lst
Out:
[array([[ 1., 0., 0., 0., 0., 0., 0.]]),
array([[ 0., 1., 0., 0., 0., 0., 0.]]),
array([[ 0., 0., 1., 0., 0., 0., 0.]]),
array([[ 0., 0., 0., 1., 0., 0., 0.]]),
array([[ 0., 0., 0., 0., 1., 0., 0.]]),
array([[ 0., 0., 0., 0., 0., 1., 0.]]),
array([[ 0., 0., 0., 0., 0., 0., 1.]])]
a = np.array(lst)
np.where(a==1)[1]
Out:
array([0, 0, 0, 0, 0, 0, 0], dtype=int64)
Not what I need