0 votes

As we know, the decoder takes the encoder's hidden states as its initial state:

# Encoder: keep the final hidden and cell states to seed the decoder
encoder_output, state_h, state_c = LSTM(cellsize, return_state=True)(embedded_encoder_input)
encoder_states = [state_h, state_c]

# Decoder: initialised with the encoder's final states
decoder_lstm = LSTM(cellsize, return_state=True, return_sequences=True)
decoder_outputs, state_dh, state_dc = decoder_lstm(embedded_decoder_inputs, initial_state=encoder_states)

Assume I want to replace the decoder's initial state with encoder_output and features I get from other sources:

encoder_states = [encoder_output, my_state]

But I face the following error:

ValueError: The initial state or constants of an RNN layer cannot be specified with a mix of Keras tensors and non-Keras tensors (a "Keras tensor" is a tensor that was returned by a Keras layer, or by Input)

Although when I print state_h, state_c, encoder_output, and my_state they all appear to have the same type and shape, for example:

state_h:  Tensor("lstm_57/while/Exit_2:0", shape=(?, 128), dtype=float32)
my_state:  Tensor("Reshape_17:0", shape=(?, 128), dtype=float32)

Is my understanding correct that it will not accept inputs that were not produced by a previous Keras layer, i.e. that are not Keras tensors?

Update

After converting the tensor to a Keras tensor, the new error is:

ValueError: Input tensors to a Model must come from keras.layers.Input. Received: Tensor("Reshape_18:0", shape=(?, 128), dtype=float32) (missing previous layer metadata).
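
This second error apparently means the converted tensor still cannot be traced back to a keras.layers.Input. A minimal sketch of the usual pattern, assuming my_state is computed outside the Keras graph and fed in as data (my_state_values is a hypothetical name for that array, and encoder_input / decoder_input stand for whatever Input layers feed the embeddings):

from keras.layers import Input

# Sketch only: "my_state" becomes a proper model input with layer metadata.
my_state = Input(shape=(128,), name='my_state')
encoder_states = [encoder_output, my_state]

# ... decoder built as before ...
# The extra input then has to be listed when the Model is created and fed an
# actual array (the hypothetical my_state_values) at training time:
# model = Model([encoder_input, decoder_input, my_state], decoder_outputs)
# model.fit([encoder_data, decoder_data, my_state_values], targets)

Keras also accepts Input(tensor=some_tf_tensor) to wrap an existing TensorFlow tensor as a model input.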

Comments:

The string version of Keras layers prints the underlying tensor, so you won't see a difference. Where are you getting my_state from? – nuric
I can see that, but where do you get it from? Is it created using tf or is it passed from somewhere else, do you generate it, is it an input? Depending on its source you would need the appropriate wrapper for Keras. – nuric

1 Answer

2 votes

I guess you mixed a TensorFlow tensor with a Keras tensor. Although state_h and my_state both print as tensors, they are actually different kinds. You can use K.is_keras_tensor() to distinguish them. An example:

import tensorflow as tf
import keras.backend as K
from keras.layers import LSTM, Input, Lambda

# A tensor returned by keras.layers.Input is a Keras tensor
my_state = Input(shape=(128,))
print('keras input layer type:')
print(my_state)
print(K.is_keras_tensor(my_state))

# A plain TensorFlow placeholder is not
my_state = tf.placeholder(shape=(None, 128), dtype=tf.float32)

print('\ntensorflow tensor type:')
print(my_state)
print(K.is_keras_tensor(my_state))

# you may need this: passing the TF tensor through a Keras layer
# (an identity Lambda) turns it into a Keras tensor
my_state = Lambda(lambda x: x)(my_state)
print('\nconvert tensorflow to keras tensor:')
print(my_state)
print(K.is_keras_tensor(my_state))

The output:
keras input layer type:
Tensor("input_3:0", shape=(?, 128), dtype=float32)
True

tensorflow tensor type:
Tensor("Placeholder:0", shape=(?, 128), dtype=float32)
False

convert tensorflow to keras tensor:
Tensor("lambda_1/Identity:0", shape=(?, 128), dtype=float32)
True
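
Applied back to the decoder in the question: if my_state is derived from a tensor that is already in the Keras graph by raw TensorFlow ops (the Reshape_17 in the printout suggests a tf.reshape), a minimal sketch is to do that op inside a Lambda layer so the result keeps its Keras metadata. Here features is a hypothetical name for the tensor being reshaped:

import tensorflow as tf
from keras.layers import Lambda

# "features" stands in for whatever Keras tensor the reshape was applied to.
# Doing the reshape inside a Lambda keeps the result a Keras tensor, so it can
# be mixed with encoder_output in initial_state.
my_state = Lambda(lambda x: tf.reshape(x, (-1, 128)))(features)

encoder_states = [encoder_output, my_state]
decoder_outputs, state_dh, state_dc = decoder_lstm(
    embedded_decoder_inputs, initial_state=encoder_states)

If my_state instead comes from outside the graph entirely, declaring it as a keras.layers.Input and feeding the values at fit time avoids the follow-up "Input tensors to a Model must come from keras.layers.Input" error.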