2 votes

All I want to do is download one of TensorFlow's built-in models (via Keras) and switch off the softmax at the output layer (i.e. replace it with the linear activation function), so that my output features are the activations of the output layer before softmax is applied.

So I grab VGG16 as a model and call it base_model:

from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg16 import preprocess_input
from tensorflow.keras.preprocessing import image  # used below for load_img / img_to_array
import numpy as np
import tensorflow as tf
base_model = VGG16()

I have a look at the final layer like this:

base_model.get_layer('predictions').get_config()

and get:

{'name': 'predictions',
 'trainable': True,
 'dtype': 'float32',
 'units': 1000,
 'activation': 'softmax',
 'use_bias': True,
 'kernel_initializer': {'class_name': 'GlorotUniform',
  'config': {'seed': None, 'dtype': 'float32'}},
 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}},
 'kernel_regularizer': None,
 'bias_regularizer': None,
 'activity_regularizer': None,
 'kernel_constraint': None,
 'bias_constraint': None}

Then, I do this to switch activation functions:

base_model.get_layer('predictions').activation = tf.compat.v1.keras.activations.linear

and it looks like it has worked, as:

base_model.get_layer('predictions').get_config()

gives:

{'name': 'predictions',
 'trainable': True,
 'dtype': 'float32',
 'units': 1000,
 'activation': 'linear',
 'use_bias': True,
 'kernel_initializer': {'class_name': 'GlorotUniform',
  'config': {'seed': None, 'dtype': 'float32'}},
 'bias_initializer': {'class_name': 'Zeros', 'config': {'dtype': 'float32'}},
 'kernel_regularizer': None,
 'bias_regularizer': None,
 'activity_regularizer': None,
 'kernel_constraint': None,
 'bias_constraint': None}

But when I put in a picture, using:

filename = 'test_data/ILSVRC2012_val_00001218.JPEG'
img = image.load_img(filename, target_size=(224, 224)) # loads image
x = image.img_to_array(img) # converts to a numpy array
x = np.expand_dims(x, axis=0) # adds a batch dimension
x = preprocess_input(x) # prepare the image for the VGG model

and run predict on it to get my features:

features = base_model.predict(x)

the features still sum to 1, i.e. they look like they have been normalised by softmax, as

sum(features[0])

is 1.0000000321741935, exactly the same number I got when the softmax activation was still on that layer.

I also tried copying out the config dictionary with the activation set to 'linear' and using set_config on the output layer.

Turning off softmax seems to be bizarrely hard to do in TensorFlow: in Caffe you can switch activation functions for a pre-trained model by changing one line in the deploy file, so I really don't understand why this is so difficult here. I'm switching my code from Caffe to TensorFlow because I thought it would be easier to grab pre-trained models in TF, but this issue is making me reconsider.

I suppose I could try to rip off the prediction layer and replace it with a brand new one with all the same settings (and put the old weights in), but I'm sure there must be a way to just edit the prediction layer.

I'm using TensorFlow 1.14.0 at the moment and planning to upgrade to 2.0, but I don't think using TensorFlow 1 is the problem here.

Can anyone explain how to turn off the softmax, please? It should be a simple thing to do; I've spent hours on it and have even joined Stack Overflow just to get this single issue fixed.

Thanks in advance for any help.


2 Answers

6 votes

As already mentioned above, you can always reverse the softmax operation; that should be straightforward. But if you still want to change the activation, you will have to copy the weights to a new layer.

import tensorflow as tf

model = tf.keras.applications.ResNet50()
assert model.layers[-1].activation == tf.keras.activations.softmax

# Grab the final layer's config and weights
config = model.layers[-1].get_config()
weights = [x.numpy() for x in model.layers[-1].weights]

# Rebuild the layer with a linear activation (and a new name to avoid a clash)
config['activation'] = tf.keras.activations.linear
config['name'] = 'logits'

new_layer = tf.keras.layers.Dense(**config)(model.layers[-2].output)
new_model = tf.keras.Model(inputs=[model.input], outputs=[new_layer])
new_model.layers[-1].set_weights(weights)

assert new_model.layers[-1].activation == tf.keras.activations.linear
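
A quick sanity check, as a sketch (assuming TF 2.x eager execution; the random input is just a stand-in for a real preprocessed image): the new model's outputs should no longer sum to ~1, but re-applying softmax to them should.

import numpy as np

x = np.random.rand(1, 224, 224, 3).astype('float32') * 255.0   # stand-in input, illustration only
x = tf.keras.applications.resnet50.preprocess_input(x)
logits = new_model.predict(x)
print(logits.sum())                          # some arbitrary value, not ~1.0
print(tf.nn.softmax(logits).numpy().sum())   # ~1.0 again
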
-1 votes

Keras is unfortunately not designed around "complex" things like making specific modifications to existing nets. I believe that it is possible to get the output before the activation, but that involves traversing the op graph, and isn't exactly straightforward. I attempted to do that at one point, but found it to be too difficult, and solved my issue in a different way.

If you were making your own model, you could make the activation a separate layer and then pop it off at will. However, since you're using a premade model, you can't do this.
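
A minimal sketch of that idea, with made-up layer names and shapes just for illustration: keep the Dense layer linear and apply softmax as a separate Activation layer, so you can build a second model that stops at the logits.

import tensorflow as tf

inputs = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.GlobalAveragePooling2D()(inputs)
logits = tf.keras.layers.Dense(1000, name='logits')(x)                 # no activation here
probs = tf.keras.layers.Activation('softmax', name='probs')(logits)    # softmax as its own layer
model = tf.keras.Model(inputs, probs)

# Pre-softmax features are then just the output of the 'logits' layer
logits_model = tf.keras.Model(inputs, model.get_layer('logits').output)
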

Depending on your exact situation, you have two options that I can see:

  1. If you want a quick, hacky solution that isn't perfect but might work well enough, you can calculate what the pre-softmax output would have been. Softmax is a well-defined equation, so you can invert it and apply the inverse to the softmax output (see the sketch after this list). This won't give you the exact output, but it should be close enough for many situations.
  2. If you want a stable, maintainable solution, just make a new layer without an activation and copy the weights. I agree that this feels weird to do, but it's really not that hard, and I can't think of any rational reason not to do it.
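
A minimal sketch of option 1, assuming the base_model and preprocessed x from the question: since softmax is shift-invariant, taking the log of its output recovers the logits only up to an additive constant, which is often good enough.

import numpy as np

probs = base_model.predict(x)            # softmax outputs from the unmodified model
approx_logits = np.log(probs + 1e-12)    # equals the true logits minus logsumexp(logits); epsilon avoids log(0)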