3 votes

Background

My question is based on an example from Hands-On Machine Learning by Géron, Chapter 12: Custom Models.

The purpose of this example is to create a custom neural network model. The model has 5 Dense hidden layers. The custom part is a reconstruction layer added before the output layer, whose job is to reconstruct the inputs. We then take the difference between the reconstruction and the inputs, compute the MSE, and add this value to the loss. It's meant to be a regularization step.
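Schematically, the total loss that fit() should end up minimizing is the main MSE plus the scaled reconstruction MSE (a sketch only; the 0.05 factor matches the code below):

import tensorflow as tf

def total_loss(y, y_pred, inputs, reconstruction, rate=0.05):
    # main regression loss plus the scaled reconstruction penalty
    main_mse = tf.reduce_mean(tf.square(y - y_pred))
    recon_mse = tf.reduce_mean(tf.square(reconstruction - inputs))
    return main_mse + rate * recon_mse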

Minimal (Should-Be) Working Example

The following code is almost directly from the textbook, but it doesn't work.

import numpy as np

num_training = 10
num_dim = 2

X = np.random.random((num_training, num_dim))
y = np.random.random(num_training)

import tensorflow as tf
import tensorflow.keras as keras

class ReconstructingRegressor(keras.models.Model):
    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [keras.layers.Dense(30, activation="selu",
                                          kernel_initializer="lecun_normal")
                       for _ in range(5)]
        self.out = keras.layers.Dense(output_dim)

    def build(self, batch_input_shape):
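        # the input dimension is only known once build() receives the shape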
        n_inputs = batch_input_shape[-1]
        self.reconstruct = keras.layers.Dense(n_inputs)
        super().build(batch_input_shape)

    def call(self, inputs, training=None):
        Z = inputs
        for layer in self.hidden:
            Z = layer(Z)
        reconstruction = self.reconstruct(Z)
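        # regularization: add the scaled reconstruction MSE to the model loss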
        recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
        self.add_loss(0.05 * recon_loss)
        return self.out(Z)
    
model = ReconstructingRegressor(1)
model.compile(loss="mse", optimizer="nadam")
history = model.fit(X, y, epochs=2)

Error Message

However, I get the following error while calling model.fit():

---------------------------------------------------------------------------
InaccessibleTensorError                   Traceback (most recent call last)
<ipython-input-10-b7211d3022fa> in <module>
     34 model = ReconstructingRegressor(1)
     35 model.compile(loss="mse", optimizer="nadam")
---> 36 history = model.fit(X, y, epochs=2)

and, at the end of the error message:

InaccessibleTensorError: The tensor 'Tensor("mul:0", shape=(), dtype=float32)' cannot be accessed here: it is defined in another function or code block. Use return values, explicit Python locals or TensorFlow collections to access it. Defined in: FuncGraph(name=build_graph, id=140602287140624); accessed from: FuncGraph(name=train_function, id=140602287108640).

Troubleshooting

If I comment out the code that computes the loss, i.e.,

        #recon_loss = tf.reduce_mean(tf.square(reconstruction - inputs))
        #self.add_loss(0.05 * recon_loss)

in call but keep everything else the same, I get the following warning:

WARNING:tensorflow:Gradients do not exist for variables ['dense/kernel:0', 'dense/bias:0'] when minimizing the loss.

Not sure if that's relevant.

Comments

This is the full code. self.add_loss is a method of the parent class keras.models.Model. We declare the member variable self.reconstruct in the build function; it is defined as a Keras layer. – EssentialAnonymity

I've tried moving the self.reconstruct definition to __init__(), but it doesn't seem to change anything. Note that moving self.reconstruct defeats the purpose of the example, because n_inputs is only known once you get inside build. – EssentialAnonymity

1 Answer

2 votes

I am not 100% sure, but I believe the problem is that the loss you are adding via self.add_loss refers to layers that are not used when the main loss is computed, and that are possibly optimized out of the main graph. Hence, when you then try to access them, the tensors are inaccessible.
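For context, this is the general failure mode behind an InaccessibleTensorError: a symbolic tensor created while tracing one FuncGraph is later captured from a different FuncGraph. A minimal, hypothetical sketch (assuming TF 2.x; the names here are made up, not from the question):

import tensorflow as tf

stash = []

@tf.function
def producer():
    # creates a graph tensor inside producer's FuncGraph and stashes it
    stash.append(tf.constant(1.0) * 2.0)

@tf.function
def consumer():
    # tries to capture a tensor that belongs to producer's graph
    return stash[0] + 1.0

producer()
consumer()  # should raise InaccessibleTensorError

In the question's code, the 0.05 * recon_loss tensor ("mul:0") is created while Keras traces call inside build_graph, and fit's train_function then tries to read the stored loss, which matches the two graph names in the error message.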

I think the easiest way is to rewrite the network slightly differently:

Using the training argument of model.call, we use the reconstruct layer only during training and make the network return both the prediction and the reconstruction. When we want to make a prediction, however, we return only the prediction.

Overriding train_step is there only so that we can still use fit and don't have to write the training loop from scratch. We don't need to override test_step in this case, because the use case is fairly simple.

import tensorflow as tf
import tensorflow.keras as keras
import numpy as np

num_training = 10
num_dim = 2

X = np.random.random((10, 2)).astype(np.float32)
y = np.random.random((10,)).astype(np.float32)


class ReconstructingRegressor(keras.models.Model):
    def __init__(self, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.hidden = [
            keras.layers.Dense(
                30,
                activation="selu",
                kernel_initializer="lecun_normal",
                name=f"hidden_{idx}",
            )
            for idx in range(5)
        ]
        self.out = keras.layers.Dense(output_dim, name="output")

    def build(self, batch_input_shape):
        n_inputs = batch_input_shape[-1]
        self.reconstruct = keras.layers.Dense(n_inputs, name="reconstruct")
        super().build(batch_input_shape)

    @staticmethod
    def reconstruction_loss(reconstruction, inputs, rate=0.05):
        return tf.reduce_mean(tf.square(reconstruction - inputs)) * rate

    def train_step(self, data):
        x, y = data
        with tf.GradientTape() as tape:
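            # forward pass; training=True makes call return both outputs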
            y_pred, recon = self(x, training=True)
            loss = self.compiled_loss(y, y_pred)
            loss += self.reconstruction_loss(recon, x)
        gradients = tape.gradient(loss, self.trainable_variables)

        # Update weights
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))

        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)
        # Return a dict mapping metric names to current value
        return {m.name: m.result() for m in self.metrics}

    def call(self, inputs, training=None):
        Z = inputs
        for layer in self.hidden:
            Z = layer(Z)
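        # during training, also return the reconstruction for train_step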
        if training:
            return self.out(Z), self.reconstruct(Z)

        return self.out(Z)


model = ReconstructingRegressor(1)

model.compile(optimizer="nadam", loss="mse")
history = model.fit(X, y, epochs=10)
history = model.evaluate(X, y)
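For completeness, inference still works as usual, since call returns only the predictions when training is falsy:

# outside training, call() returns just the predictions,
# so predict() behaves like any regular Keras model
preds = model.predict(X)
print(preds.shape)  # (10, 1)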