I have created a Random Forest classifier and I'm trying to produce a histogram of the depths of the trees in the model. However, I can't work out how to extract the depth of each tree in the forest.

My RF model is called 'RF_optimised', and I've used the code below to iterate over my trees and visualise them, which has worked. I have gone through the estimators_ and export_graphviz documentation, but there doesn't seem to be a way to extract the actual depth of a tree.

from io import StringIO  # sklearn.externals.six is gone from recent scikit-learn releases

import pydotplus
from IPython.display import Image, display
from sklearn.tree import export_graphviz

for tree_in_forest in RF_optimised.estimators_:
    # Create a fresh string buffer (a fake text file) for each tree,
    # so DOT output from earlier trees does not accumulate in it
    f = StringIO()

    export_graphviz(tree_in_forest, out_file=f,
                    # feature_names=col,
                    filled=True,
                    rounded=True,
                    proportion=True)

    # Render the DOT description of this tree as a PNG and show it inline
    graph = pydotplus.graph_from_dot_data(f.getvalue())
    display(Image(graph.create_png()))

I need a function that iterates over the trees in my Random Forest and stores the depth of the trees in a list or data-frame, in order to produce a histogram later. Can anyone help?

1 Answer

Some exploration in the interpreter shows that the low-level Tree object behind each fitted estimator (its tree_ attribute) has a max_depth attribute, which appears to be what I'm looking for. As far as I could tell, it's undocumented.

[estimator.tree_.max_depth for estimator in RF_optimised.estimators_]

did the trick for me :)
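
To get from that one-liner to something you can plot, here is a minimal sketch that wraps it in a function and tallies the depths. The helper names tree_depths and depth_counts are my own, not part of scikit-learn, and the code assumes the forest has already been fitted so that estimators_ exists.

```python
from collections import Counter


def tree_depths(forest):
    """Return the depth of every fitted tree in a scikit-learn style forest."""
    # Each entry of estimators_ is a fitted decision tree; its low-level
    # tree_ object exposes the depth reached during training as max_depth.
    return [estimator.tree_.max_depth for estimator in forest.estimators_]


def depth_counts(depths):
    """Tally how many trees have each depth, as (depth, count) pairs sorted by depth."""
    return sorted(Counter(depths).items())
```

With the model from the question, depths = tree_depths(RF_optimised) should give the list, and passing it to matplotlib.pyplot.hist(depths) or pandas.Series(depths).hist() should draw the histogram. If your scikit-learn is reasonably recent, calling estimator.get_depth() on each tree is a documented alternative to reading tree_.max_depth directly.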