I had a problem to classify inputs which have more than one label. So problem is multi-label classification. I used scikit-learn Decision Tree classifiers to do this and it gives pretty good results at initial stages. But, I am wondering how is it working under the hood and How the split is done in Decision Tree for multi-label classification? The important question is about how one model which is initialized once can be trained with two different class of labels at the same time? How the Decision Tree model will solve out the optimization task for both different sets of labels?
0
votes
1 Answers
0
votes
Under the hood, each node in your decision tree has the same labels as the root node, however, the probability of each label is different. When you run model.predict(), the model gives the prediction as the label with the highest probability. You can use model.predict_proba() to see the probability for each label separately. You can use this code to get the probabilities correctly:
all_probs=pd.DataFrame(model.predict_proba(X_test),columns=model.classes_)