How to reverse One-Hot Encoding of labels for evaluation of ML/DL model?

Question

This issue has been raised a few times here on Stack Overflow, but none of the existing answers solves the error I'm currently facing.

The y column of my dataset, which I use as labels, had to be transformed with One-Hot Encoding so that my deep learning model could be trained with the categorical_crossentropy loss.

The problem now is that evaluating the model requires the original labels again, so that they can be compared with the predicted y.

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

keypoints = pd.read_csv('keypoints.csv')

X = keypoints.iloc[:,1:76]
y = keypoints.iloc[:,-1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0, stratify=y)

Here y is a list of 3 different labels, say A, B and C, with a shape of (63564, 1).

So using the One-Hot encoding I was able to split it up:

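from sklearn.preprocessing import LabelEncoder, OneHotEncoder

le = LabelEncoder()
y = le.fit_transform(y)  # string labels -> integer codes
# categorical_features was removed in scikit-learn 0.22; y has a single column, so it is not needed
ohe = OneHotEncoder()
y = ohe.fit_transform(y[:,None]).toarray()  # integer codes -> one-hot rows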

The new y here has a shape of (63564, 3) and looks like:

[[0. 0. 1.]
 [0. 0. 1.]
 [0. 0. 1.]
 ...
 [1. 0. 0.]
 [1. 0. 0.]
 [1. 0. 0.]]

After running my Deep Learning network I want to evaluate it by using:

......
#Evaluation and such
y_pred = model.predict(X_test, verbose=0)
y_classes = model.predict_classes(X_test, verbose=0)

#Reduce to 1D
y_pred = y_pred[:, 0]
y_classes = y_classes[:, 0]

#Confusion Matrix
print(confusion_matrix(y_test, y_classes))

#Accuracy: (tp + tn) / (p + n)
accuracy = accuracy_score(y_test, y_classes)
print('Accuracy: %f' % accuracy)
#Precision tp / (tp + fp)
precision = precision_score(y_test, y_classes)
print('Precision: %f' % precision)
#Recall: tp / (tp + fn)
recall = recall_score(y_test, y_classes)
print('Recall: %f' % recall)
#F1: 2 tp / (2 tp + fp + fn)
f1 = f1_score(y_test, y_classes)
print('F1 score: %f' % f1)

But of course this won't accept the 0 and 1 as labels:

ValueError: Classification metrics can't handle a mix of unknown and continuous-multioutput targets

So my question is

How do I reverse the One-Hot Encoded labels so that I can run the evaluation of my DL model?

Drey Drey · Accepted Answer · 2019-08-10T21:19:36

You will probably need inverse_transform, as documented in the Examples section of sklearn.preprocessing.OneHotEncoder:

>>> from sklearn.preprocessing import OneHotEncoder
>>> enc = OneHotEncoder(handle_unknown='ignore')
>>> X = [['Male', 1], ['Female', 3], ['Female', 2]]
>>> enc.fit(X)
OneHotEncoder(handle_unknown='ignore')
>>> enc.transform([['Female', 1], ['Male', 4]]).toarray()
array([[1., 0., 1., 0., 0.],
       [0., 1., 0., 0., 0.]])
>>> enc.inverse_transform([[0, 1, 1, 0, 0], [0, 0, 0, 1, 0]])
array([['Male', 1],
       [None, 2]], dtype=object)
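Applied to the question's setup, the idea might look like the sketch below. It is only an illustration under stated assumptions: the toy labels array and the hard-coded y_prob matrix are hypothetical stand-ins for the real label column and for the output of model.predict(X_test). It shows two ways back from one-hot rows: undoing both encoders with inverse_transform, and taking np.argmax over the class axis, which also works on predicted probabilities.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

# Hypothetical toy labels standing in for the real label column.
labels = np.array(["C", "C", "A", "B", "A", "C"])

le = LabelEncoder()
y_int = le.fit_transform(labels)                         # A, B, C -> 0, 1, 2
ohe = OneHotEncoder()
y_onehot = ohe.fit_transform(y_int[:, None]).toarray()   # shape (6, 3)

# Option 1: undo the encoders in reverse order (one-hot -> codes -> strings).
codes = ohe.inverse_transform(y_onehot).ravel().astype(int)
y_back = le.inverse_transform(codes)

# Option 2: argmax over the class axis recovers the integer codes directly.
y_true_int = np.argmax(y_onehot, axis=1)

# Hypothetical stand-in for model.predict(X_test): one probability row per sample.
y_prob = np.array([
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
    [0.8, 0.1, 0.1],
    [0.3, 0.6, 0.1],
    [0.2, 0.5, 0.3],   # misclassified on purpose: the true class is A
    [0.1, 0.1, 0.8],
])
y_pred_int = np.argmax(y_prob, axis=1)

# Both arguments are now 1D integer class codes, which the metrics accept.
print(confusion_matrix(y_true_int, y_pred_int))
print("Accuracy: %f" % accuracy_score(y_true_int, y_pred_int))
# With more than two classes, an averaging mode has to be chosen explicitly.
print("Macro F1: %f" % f1_score(y_true_int, y_pred_int, average="macro"))
# Map predicted codes back to the original string labels.
print(le.inverse_transform(y_pred_int))
```

Note that with three classes, precision_score, recall_score and f1_score need an explicit average argument (for example "macro" or "weighted"), because their default of "binary" only applies to two-class targets.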
