I have a trained Keras model and a custom DataSet class that implements keras.utils.Sequence. For a given DataSet, I want to compute the model's predictions and save them as a NumPy array for further calculations. To do that, I need to verify the order of the data, so that I know which prediction belongs to which input sample. I would like to rely on built-in functionality as much as possible, to take advantage of Keras's GPU and parallel computing capabilities.

My code:

from tensorflow import keras

# load the trained model
model = keras.models.load_model(model_path)

# data is my DataSet, implementing tensorflow.keras.utils.Sequence
print(data.current_epoch_index_order[:10])  # sample order before predict()
predictions = model.predict(data, verbose=2)
print(data.current_epoch_index_order[:10])  # sample order after predict()
# save predictions as npz

I shuffle the data between epochs myself in my DataSet, and current_epoch_index_order is how I keep track of the order internally. In on_epoch_end I shuffle the batch indices again. The print statements before and after predict() show different orders, which is fine if the data gets shuffled only after predict() has computed the predictions. I could now assume that the order printed before predict() is the ID order I'm looking for, but how can I verify this? I tried inserting print() statements into my DataSet implementation of the Sequence, but for some reason they never get printed.

I can't be the only one struggling with this, can I?

1 Answer

I figured the easiest way was to implement a version of my DataSet that never shuffles and serves the samples in a fixed order that I know in advance. The i-th prediction then belongs to the i-th sample in that order.
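
For illustration, here is a minimal sketch of what such a non-shuffling DataSet could look like. It is not my actual class: FixedOrderDataSet, batch_ids and predict_with_ids are hypothetical names, the dataset returns inputs only (as a one-element tuple), and the try/except import is only there so the sketch also runs without TensorFlow installed. The helper predicts batch by batch and records the sample IDs next to each prediction, so the pairing does not depend on any assumption about when the reshuffle happens.

```python
import numpy as np

try:
    from tensorflow.keras.utils import Sequence
except ImportError:  # fallback so the sketch also runs without TensorFlow
    Sequence = object


class FixedOrderDataSet(Sequence):
    """Hypothetical dataset that serves samples in a known order."""

    def __init__(self, x, batch_size=32, shuffle=False, seed=0):
        super().__init__()
        self.x = np.asarray(x)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = np.random.default_rng(seed)
        # Sample IDs in the order they are served during the current epoch.
        self.current_epoch_index_order = np.arange(len(self.x))
        if self.shuffle:
            self.rng.shuffle(self.current_epoch_index_order)

    def __len__(self):
        # Number of batches; the last batch may be smaller.
        return int(np.ceil(len(self.x) / self.batch_size))

    def batch_ids(self, batch_idx):
        # Sample IDs that make up the requested batch.
        start = batch_idx * self.batch_size
        return self.current_epoch_index_order[start:start + self.batch_size]

    def __getitem__(self, batch_idx):
        # Inputs only, wrapped in a tuple (no targets needed for prediction).
        return (self.x[self.batch_ids(batch_idx)],)

    def on_epoch_end(self):
        # Reshuffle only if shuffling was requested; otherwise keep the order.
        if self.shuffle:
            self.rng.shuffle(self.current_epoch_index_order)


def predict_with_ids(predict_fn, data):
    """Predict batch by batch and return (sample_ids, predictions)."""
    ids, preds = [], []
    for b in range(len(data)):
        ids.append(data.batch_ids(b).copy())  # record IDs before predicting
        preds.append(np.asarray(predict_fn(data[b][0])))
    return np.concatenate(ids), np.concatenate(preds)
```

With a real model, predict_fn could be model.predict_on_batch. With shuffle=False, the rows returned by model.predict(data) should then line up with np.arange(len(x)), and comparing them against the batch-wise result with np.allclose is one way to check that assumption on your own setup.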