I get the following error when I set n_jobs to a value greater than 1 for RandomForestRegressor. With n_jobs=1, everything works.
AttributeError: 'Thread' object has no attribute '_children'
I'm running this code in a Flask service. Interestingly, the error does not occur when the same code is run outside the Flask service. I've only reproduced it on a freshly installed Ubuntu box; on my Mac it works fine.
This thread discusses the same error, but it didn't seem to get anywhere beyond a workaround: 'Thread' object has no attribute '_children' - django + scikit-learn
Any thoughts on this?
Here is my test code:
@test.route('/testfun')
def testfun():
    from sklearn.ensemble import RandomForestRegressor
    import numpy as np
    train_data = np.array([[1,2,3], [2,1,3]])
    target_data = np.array([1,1])
    model = RandomForestRegressor(n_jobs=2)
    model.fit(train_data, target_data)
    return "yey"
Stacktrace:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1836, in __call__
    return self.wsgi_app(environ, start_response)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1820, in wsgi_app
    response = self.make_response(self.handle_exception(e))
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1403, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python2.7/dist-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/vagrant/flask.global-relevance-engine/global_relevance_engine/routes/test.py", line 47, in testfun
    model.fit(train_data, target_data)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/ensemble/forest.py", line 273, in fit
    for i, t in enumerate(trees))
  File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 574, in __call__
    self._pool = ThreadPool(n_jobs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 685, in __init__
    Pool.__init__(self, processes, initializer, initargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 136, in __init__
    self._repopulate_pool()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 199, in _repopulate_pool
    w.start()
  File "/usr/lib/python2.7/multiprocessing/dummy/__init__.py", line 73, in start
    self._parent._children[self] = None
On my system, the assignment in multiprocessing/dummy/__init__.py is guarded by a check for _children: if hasattr(self._parent, '_children'): self._parent._children[self] = None. When you say it works outside of Flask, is that with the exact same environment (interpreter, libraries, OS, machine, etc.)? I ask because on my system line 73 is the condition, whereas in your traceback it is the assignment. I think your Flask environment is using an older version of Python in which this bug is not yet fixed. - KobeJohn
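To make the comment above concrete, here is a minimal sketch of two things one could try. It is not taken from the question or from scikit-learn: the helper names start_has_guard, ensure_children_attr and square_in_pool are hypothetical. The first helper checks whether the installed multiprocessing.dummy already contains the hasattr guard. The second gives the current thread the _children mapping that the unguarded code expects, on the assumption that the failing thread is a request thread that never received one; that assumption is not confirmed by the traceback alone.

```python
import inspect
import threading
import weakref
import multiprocessing.dummy
from multiprocessing.pool import ThreadPool


def start_has_guard():
    """Report whether DummyProcess.start checks for `_children` first.

    Returns True or False, or None when the source cannot be read.
    """
    try:
        source = inspect.getsource(multiprocessing.dummy.DummyProcess.start)
    except (OSError, IOError, TypeError):
        return None
    return "hasattr" in source and "_children" in source


def ensure_children_attr():
    """Hypothetical workaround: add `_children` to the current thread if missing."""
    thread = threading.current_thread()
    if not hasattr(thread, "_children"):
        # Same kind of mapping that multiprocessing.dummy sets up at import time.
        thread._children = weakref.WeakKeyDictionary()
    return thread


def square_in_pool(values, workers=2):
    """Stand-in for a view function: patch the thread, then use a ThreadPool."""
    ensure_children_attr()
    pool = ThreadPool(workers)
    try:
        return pool.map(lambda v: v * v, values)
    finally:
        pool.close()
        pool.join()
```

In the Flask view, ensure_children_attr() would be called before model.fit(...). If start_has_guard() returns True inside the service, the interpreter already has the fix and the cause probably lies elsewhere, for example a different Python installation being picked up by the service.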