
I'm inheriting from sklearn.ensemble.RandomForestClassifier, and I'm trying to print my new estimator:

class my_rf(RandomForestClassifier):
    def __str__(self):
        return "foo_" + RandomForestClassifier.__str__(self) 

This gives foo_my_rf().

I also tried:

class my_rf(RandomForestClassifier):
    def __str__(self):
        return "foo_" + super(RandomForestClassifier, self).__str__() 

with the same result. What I expect is something pretty, like sklearn's default behaviour:

>>> a = RandomForestClassifier()
>>> print a
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
        max_depth=None, max_features='auto', max_leaf_nodes=None,
        min_samples_leaf=1, min_samples_split=2,
        min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
        oob_score=False, random_state=None, verbose=0,
        warm_start=False)
>>>

This is also the result when I use print a.__str__().

What am I missing? Thanks.

Related to: How do I change the string representation of a Python class?

Evidently the parent class __str__ implementation is the name of the class. You are calling it correctly. – jonrsharpe
@jonrsharpe - oops, edited the question to make clear what I am looking for. – ihadanny
Have you tried looking at the __repr__ instead? – jonrsharpe
@jonrsharpe - yes, same result. I must be missing something with objects vs. classes, in python I always do :) – ihadanny
Shouldn't it be super(my_rf, self).__str__? By specifying super(RandomForestClassifier, self), you are effectively skipping RandomForestClassifier's implementation of __str__. – user4815162342

1 Answer


In RandomForestClassifier, both __repr__ and __str__ look up the name of the class of the instance they are called on (self), so calling the parent's implementation still prints my_rf. To get the parent's name, you have to reference the superclass directly.
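To illustrate the lookup described above without depending on scikit-learn, here is a minimal sketch. The classes Base and Child are hypothetical stand-ins, and the __repr__ shown only mimics the "name comes from the instance" behaviour; it is not scikit-learn's actual implementation.

```python
class Base(object):
    def __repr__(self):
        # The name is taken from the instance's class, not from the class
        # in which this method happens to be defined.
        return "%s()" % type(self).__name__


class Child(Base):
    def __str__(self):
        # Calling the parent's method explicitly still passes a Child
        # instance as self, so the parent reports "Child".
        return "foo_" + Base.__repr__(self)


result = str(Child())
print(result)  # foo_Child()
```

This is the same shape as the foo_my_rf() output in the question: the parent method is being called correctly, it just reports the name of whatever class self belongs to.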

Update: This is how you can get your desired output, though I don't see why you would want it. There is a reason why RandomForestClassifier's __str__ and __repr__ return the actual name of the class: that way you can eval the string to restore the object. Anyway,

In [1]: from sklearn.ensemble import RandomForestClassifier
In [2]: class my_rf(RandomForestClassifier):
    def __str__(self):
        superclass_name = RandomForestClassifier.__name__
        return "foo_" + superclass_name + "(" + RandomForestClassifier.__str__(self).split("(", 1)[1]

In [3]: forest = my_rf()
In [4]: print forest
foo_RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini', max_depth=None,
   max_features='auto', max_leaf_nodes=None, min_samples_leaf=1,
   min_samples_split=2, min_weight_fraction_leaf=0.0, n_estimators=10,
   n_jobs=1, oob_score=False, random_state=None, verbose=0,
   warm_start=False)

Update 2: You get no parameters when you override __init__ because the superclass's __str__ and __repr__ inspect the signature of __init__ to find the parameter names. You can see this clearly by running this code:

In [5]: class my_rf(RandomForestClassifier):
    def __init__(self, *args, **kwargs):
        RandomForestClassifier.__init__(self, *args, **kwargs)
    def __str__(self):
        superclass_name = RandomForestClassifier.__name__
        return "foo_" + superclass_name + "(" + RandomForestClassifier.__str__(self).split("(", 1)[1]
In [6]: forest = my_rf()
In [7]: print forest
...
RuntimeError: scikit-learn estimators should always specify their parameters in the signature of their __init__ (no varargs). <class '__main__.my_rf'> with constructor (<self>, *args, **kwargs) doesn't  follow this convention.
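The error message suggests the fix: list the parameters explicitly in the subclass's __init__ instead of using *args and **kwargs. The sketch below, written for Python 3, shows the idea with a toy base class. ToyEstimator, ToyForest, GoodChild, BadChild and _param_names are all hypothetical names made up for this illustration; the introspection only approximates what scikit-learn does and is not its actual code.

```python
import inspect


class ToyEstimator:
    """Hypothetical stand-in for a scikit-learn-style base class."""

    def _param_names(self):
        # Discover parameter names from the signature of __init__.
        sig = inspect.signature(type(self).__init__)
        names = []
        for name, param in sig.parameters.items():
            if name == "self":
                continue
            if param.kind in (inspect.Parameter.VAR_POSITIONAL,
                              inspect.Parameter.VAR_KEYWORD):
                raise RuntimeError(
                    "parameters must be explicit in __init__ (no varargs)")
            names.append(name)
        return sorted(names)

    def __repr__(self):
        args = ", ".join("%s=%r" % (n, getattr(self, n))
                         for n in self._param_names())
        return "%s(%s)" % (type(self).__name__, args)


class ToyForest(ToyEstimator):
    def __init__(self, n_estimators=10, max_depth=None):
        self.n_estimators = n_estimators
        self.max_depth = max_depth


class GoodChild(ToyForest):
    # Explicit parameters, so the signature can be introspected.
    def __init__(self, n_estimators=10, max_depth=None):
        ToyForest.__init__(self, n_estimators=n_estimators,
                           max_depth=max_depth)

    def __str__(self):
        # Same trick as in the answer: swap in the superclass name.
        return ("foo_" + ToyForest.__name__ + "("
                + ToyForest.__repr__(self).split("(", 1)[1])


class BadChild(ToyForest):
    # Varargs hide the parameter names from introspection.
    def __init__(self, *args, **kwargs):
        ToyForest.__init__(self, *args, **kwargs)


good_text = str(GoodChild(n_estimators=5))
print(good_text)  # foo_ToyForest(max_depth=None, n_estimators=5)

try:
    repr(BadChild())
    bad_error = None
except RuntimeError as exc:
    bad_error = str(exc)
print(bad_error)
```

With a real RandomForestClassifier subclass the same approach should apply, assuming you repeat every constructor parameter you want to expose in the subclass's signature and pass each one through by keyword.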