Say I have n
training samples and a binary classification task. I want to train a decision tree of the smallest possible depth, with the fewest possible total nodes, such that the training accuracy on these n
samples is 100%. In the worst case, this would mean one leaf node per sample. Is there some configuration of parameters in Scikit-Learn's implementation [1] of DecisionTreeClassifier
that would let me achieve this?
1 Answer
Answer
You can find your answer in the documentation.
If you don't set a limit on max_depth
(i.e. leave it at its default of None), the tree will keep expanding until every leaf is pure.
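As a minimal sketch (using a synthetic dataset for illustration): with max_depth=None the tree grows until every leaf is pure, so the training accuracy is 100% whenever no two samples have identical features but different labels.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data with n = 100 samples.
X, y = make_classification(n_samples=100, random_state=0)

# max_depth=None (the default): nodes are expanded until all leaves
# are pure or contain fewer than min_samples_split samples.
clf = DecisionTreeClassifier(max_depth=None, random_state=0)
clf.fit(X, y)

print(clf.score(X, y))     # training accuracy: 1.0
print(clf.get_depth())     # actual depth reached, typically far less than n
print(clf.get_n_leaves())  # number of leaves, at most n
```

Note that this does not minimize depth or node count; the greedy CART algorithm gives no optimality guarantee, only a tree that fits the training set perfectly.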
You can also check a similar question here.
max_depth
: The maximum depth of the tree. If None, then nodes are expanded until all leaves are pure or until all leaves contain less than min_samples_split samples. – ombk

max_depth
sets an upper limit on the depth. But if you set (say) max_depth
= 1000, it is not always the case that clf.get_depth() == max_depth
. – madman_with_a_box
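To illustrate the last comment, a small sketch (using the Iris dataset as an arbitrary example): max_depth is only an upper bound, and the fitted tree usually stops growing well before reaching it.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# A very large max_depth acts as "no limit in practice".
clf = DecisionTreeClassifier(max_depth=1000, random_state=0).fit(X, y)

# The actual depth is determined by when the leaves become pure,
# not by the max_depth parameter itself.
print(clf.get_depth())  # far below 1000
```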