I am using Keras and TensorFlow to do a kind of online learning: I receive new data periodically and retrain my models with it. I can have several models stored in ".h5" files, so whenever I need to train or predict, I load the model and then perform the necessary operations.
Currently I have separated training and prediction into two different threads, so that predictions can be made while the other thread trains. I use locks to try to make sure that prediction and training never happen on the same model at the same time (I think this works), but I am aware that Keras is not really designed for this. I always get various errors regarding the TensorFlow graph or session, for instance:
Traceback (most recent call last):
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 2292, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1815, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1718, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\_compat.py", line 35, in reraise
    raise value
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1813, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\flask\app.py", line 1799, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 859, in predict_times
    0] + '.h5')
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 164, in get_prediction
    model, scaler = self.load_model_file(self.graph_pred, self.session, path)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 114, in load_model_file
    model = load_model(path)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 287, in _deserialize_model
    K.batch_set_value(weight_value_tuples)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\tensorflow_backend.py", line 2470, in batch_set_value
    get_session().run(assign_ops, feed_dict=feed_dict)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\tensorflow_backend.py", line 206, in get_session
    session.run(tf.variables_initializer(uninitialized_vars))
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 2831, in variables_initializer
    return control_flow_ops.group(*[v.initializer for v in var_list], name=name)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3432, in group
    return _GroupControlDeps(dev, deps, name=name)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3384, in _GroupControlDeps
    return no_op(name=name)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\contextlib.py", line 88, in __exit__
    next(self.gen)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4249, in device
    self._device_function_stack.pop_obj()
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\traceable_stack.py", line 110, in pop_obj
    return self._stack.pop().obj
IndexError: pop from empty list
Or the error:
Exception in thread Thread-1:
Traceback (most recent call last):
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\threading.py", line 1182, in run
    self.function(*self.args, **self.kwargs)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 632, in train
    self.update_prediction_historics_all()
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 649, in update_prediction_historics_all
    self.update_prediction_historics_dataset(new_dataset, loadModel=True)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 672, in update_prediction_historics_dataset
    0] + ".h5", loadModel=loadModel)[
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 198, in get_predictions_sequential
    model, scaler = self.load_model_file(self.graph_pred, self.session, path)
  File "C:\Users\a703572\PycharmProjects\ai-pred-eng\src\run_keras_server.py", line 114, in load_model_file
    model = load_model(path)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 419, in load_model
    model = _deserialize_model(f, custom_objects, compile)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 225, in _deserialize_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\saving.py", line 458, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\utils\generic_utils.py", line 145, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\sequential.py", line 301, in from_config
    model.add(layer)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\sequential.py", line 181, in add
    output_tensor = layer(self.outputs[0])
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\base_layer.py", line 431, in __call__
    self.build(unpack_singleton(input_shapes))
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\layers\core.py", line 872, in build
    constraint=self.bias_constraint)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\engine\base_layer.py", line 252, in add_weight
    constraint=constraint)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\site-packages\keras\backend\tensorflow_backend.py", line 402, in variable
    v = tf.Variable(value, dtype=tf.as_dtype(dtype), name=name)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 183, in __call__
    return cls._variable_v1_call(*args, **kwargs)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 146, in _variable_v1_call
    aggregation=aggregation)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 125, in <lambda>
    previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variable_scope.py", line 2444, in default_variable_creator
    expected_shape=expected_shape, import_scope=import_scope)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 187, in __call__
    return super(VariableMetaclass, cls).__call__(*args, **kwargs)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 1329, in __init__
    constraint=constraint)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\ops\variables.py", line 1492, in _init_from_args
    ops.add_to_collections(collections, self)
  File "C:\Users\a703572\AppData\Local\Programs\Python\Python36\lib\contextlib.py", line 88, in __exit__
    next(self.gen)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 5347, in init_scope
    yield
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4369, in __exit__
    self._graph._pop_control_dependencies_controller(self)
  File "C:\Users\a703572\AppData\Roaming\Python\Python36\site-packages\tensorflow\python\framework\ops.py", line 4390, in _pop_control_dependencies_controller
    assert self._control_dependencies_stack[-1] is controller
AssertionError
My solution was to use one graph for prediction and one graph for training, and every time I want to perform a TensorFlow operation I use:
with server_predict.graph_pred.as_default():
    with tf.Session(graph=server_predict.graph_pred) as sess:
And I also added the line:
backend.set_session(sess)
Despite this, I keep getting errors from the TensorFlow session or graph; it seems that the operations are not properly separated. Another error is the one I described in this issue, which is still open, regarding the TensorFlow session. The solution suggested there, k.clear_session() (k = Keras backend), did not work for me.
Has anyone had a similar problem, or implemented a similar task, and could help me?
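For reference, the per-model locking idea mentioned at the start can be sketched roughly as follows. This is a simplified illustration using only the standard library, not the actual server code: `ModelLocks`, `lock_for`, `use_model`, and the file name `model_a.h5` are hypothetical names, and the place where the model would be loaded and used is only marked by a comment.

```python
import threading


class ModelLocks:
    """Hands out one lock per model file (hypothetical helper)."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the dictionary itself
        self._locks = {}

    def lock_for(self, model_path):
        # Create the lock on first use, then always return the same one.
        with self._guard:
            if model_path not in self._locks:
                self._locks[model_path] = threading.Lock()
            return self._locks[model_path]


locks = ModelLocks()
events = []


def use_model(role, model_path):
    # Only one thread at a time may work with a given model file.
    with locks.lock_for(model_path):
        events.append((role, "start"))
        # ... load the model, then train or predict here ...
        events.append((role, "end"))


threads = [
    threading.Thread(target=use_model, args=(role, "model_a.h5"))
    for role in ("train", "predict")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note that a lock like this only serializes access to one model file. It does not stop two threads from building TensorFlow operations at the same time while loading two different models.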
Thanks!!
I found a workaround to make this work. Instead of launching two threads over the same (custom) class, I now have two objects of the same class: one dedicated to training and the other to prediction. This is not a truly multithreaded app (even though the two objects are launched from the same main). Until I (we) find a proper multithreaded solution, this might help.
However, I do not understand why I got the errors before but no longer get them just by having two objects, even though these objects run in the same process. Is it that Keras/TensorFlow can only operate on one graph at a time, but defines different graphs for different objects in the same process?
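The two-object workaround described above can be sketched as follows. This is a minimal, standard-library-only illustration under stated assumptions: `ModelWorker`, `handle`, `loop`, and `model_a.h5` are hypothetical names, and a plain dictionary stands in for the TensorFlow graph and session that each object would own in the real application. One possible explanation, offered tentatively: the `tf.Graph` documentation states that the class is not thread-safe for graph construction, so two threads loading models into a graph held by the same object could corrupt its internal stacks, whereas two objects that each hold their own graph would not collide.

```python
import threading


class ModelWorker:
    """One instance per role; each instance owns its private state."""

    def __init__(self, role):
        self.role = role
        # Stand-in (assumption) for the graph and session this object owns.
        self.context = {"owner": role, "calls": 0}

    def handle(self, model_path, model_lock):
        # The lock is shared per model file; the context is never shared.
        with model_lock:
            self.context["calls"] += 1
            return f"{self.role}:{model_path}"


trainer = ModelWorker("train")
predictor = ModelWorker("predict")
model_lock = threading.Lock()  # guards the hypothetical file model_a.h5
results = []


def loop(worker, repetitions):
    for _ in range(repetitions):
        results.append(worker.handle("model_a.h5", model_lock))


threads = [
    threading.Thread(target=loop, args=(trainer, 3)),
    threading.Thread(target=loop, args=(predictor, 3)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each thread only ever touches the state of its own object, so the only shared resource left is the model file, which the lock protects.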