I have built a multi-class classifier with scikit-learn, but I want an independent prediction for each class instead of probabilities that sum to 1.
I know this is similar to multi-label classification, but I need an independent value between 0 and 1 for each class in the predicted output.
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
clf = OneVsRestClassifier(SGDClassifier(alpha=0.001, loss="log", random_state=42,
                                        max_iter=100, shuffle=True, verbose=1))
Output:
[0.04188954 0.01330129 0.01330501 0.02050405 0.03726504 0.01412006
0.01753864 0.01250115 0.02342872 0.0124999 0.05234852 0.0161394
0.01250032 0.01330749 0.01403075 0.0149792 0.0125048 0.01250406
0.01412335 0.01413113 0.01412246 0.06543099 0.01249486 0.01250054
0.01308784 0.01330463 0.01250242 0.02252353 0.02037271 0.0133038
0.01250215 0.0125009 0.01537566 0.02023355 0.01600915 0.01762224
0.01496796 0.01496522 0.01412407 0.01250198 0.01239722 0.01249967
0.01763284 0.01573462 0.01250276 0.01451515 0.01330437 0.01329294
0.01249999 0.01485671 0.01249419 0.01858113 0.01250192 0.01585085
0.01330439 0.01250573 0.01250585 0.01715666 0.01249392]
These values sum to 1, but I want each one to be an independent score between 0 and 1. How is this possible?
According to the scikit-learn documentation, "In the single label multiclass case, the rows of the returned matrix sum to 1."
Ref: https://scikit-learn.org/stable/modules/generated/sklearn.multiclass.OneVsRestClassifier.html
How can I override this?
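For reference, one way to see the un-normalised scores is to query each fitted binary estimator directly through `estimators_`. This is only a minimal sketch under assumptions: the data is synthetic (`make_classification`), and `LogisticRegression` is swapped in for `SGDClassifier` to keep the example short. Any base estimator that exposes `predict_proba` should behave the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in data (assumption): 300 samples, 4 classes.
X, y = make_classification(n_samples=300, n_features=10, n_informative=6,
                           n_classes=4, random_state=42)

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X, y)

# In the single-label case, predict_proba rescales each row to sum to 1.
normalised = clf.predict_proba(X[:5])

# Each binary "class k vs. rest" estimator gives its own 0-1 probability;
# column 1 of its predict_proba is the positive ("is class k") side.
independent = np.column_stack(
    [est.predict_proba(X[:5])[:, 1] for est in clf.estimators_]
)

print(normalised.sum(axis=1))   # every row sums to 1
print(independent.sum(axis=1))  # rows are not forced to sum to 1
```

Each column of `independent` can then be compared against its own threshold without being affected by the other classes.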
I created a 2D target matrix:
The shape of the matrix is (342, 2)
[[ 4 0]
[ 4 0]
[ 4 0]
[ 21 0]
[ 21 0]]
I got this error:
ValueError: Multioutput target data is not supported with label binarization
Using a label binarizer I got a matrix of shape (349, 59): there are 59 labels and 349 samples.
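To illustrate the shape that label binarization produces, here is a small sketch with `LabelBinarizer`. The label values are hypothetical (4 and 21 appear in the matrix above, 7 is made up so that there are three classes); the real data has 59 labels.

```python
import numpy as np
from sklearn.preprocessing import LabelBinarizer

# Hypothetical 1D label vector: one class label per sample.
y = np.array([4, 4, 4, 21, 21, 7])

lb = LabelBinarizer()
Y = lb.fit_transform(y)  # one 0/1 indicator column per distinct class

print(lb.classes_)  # distinct labels, sorted
print(Y.shape)      # (n_samples, n_classes)
print(Y[0], Y[3])   # indicator rows for label 4 and label 21
```

Each row has a single 1 in the column of its class, which is how 349 samples with 59 distinct labels become a (349, 59) matrix.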
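Using MultiOutputClassifier:
from sklearn.linear_model import SGDClassifier
from sklearn.multioutput import MultiOutputClassifier
clf = SGDClassifier(loss="log", random_state=42, verbose=0)
clf = MultiOutputClassifier(clf)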
Result:
clf.predict_proba(x_test)
[array([[0.99310559, 0.00689441]]), array([[0.9942846, 0.0057154]]), array([[0.0051056, 0.9948944]])]
The result contains 3 classes.
How do I interpret each one as a single value? For example: array([[0.99310559, 0.00689441]]) => 0.5 or 0.6
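As a sketch of how the list can be read: `MultiOutputClassifier.predict_proba` returns one array per output, each of shape (n_samples, 2), so one value per output can be obtained by taking a single column. The snippet below reuses the numbers printed above and assumes that column 1 is the probability of the positive label, which holds when each estimator's `classes_` is `[0, 1]`.

```python
import numpy as np

# Values copied from the predict_proba output above:
# one (n_samples, 2) array per output, here with a single test sample.
proba_per_output = [
    np.array([[0.99310559, 0.00689441]]),
    np.array([[0.9942846, 0.0057154]]),
    np.array([[0.0051056, 0.9948944]]),
]

# Keep only column 1 (assumed to be P(label = 1)) from each output
# and stack the results into an (n_samples, n_outputs) matrix.
positive = np.column_stack([p[:, 1] for p in proba_per_output])

print(positive.shape)  # (1, 3)
print(positive)
```

Under that assumption, the sample gets roughly 0.007, 0.006 and 0.995 for the three outputs, each on its own 0-1 scale.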