0
votes

I’m trying to re-define keras’s binary_crossentropy loss function so that I can customize it but it’s not giving me the same results as the existing one.

I'm using TF 1.13.1 with Keras 2.2.4.

I went through Keras’s github code. My understanding is that the loss in model.compile(optimizer='adam', loss='binary_crossentropy', metrics =['accuracy']), is defined in losses.py, using binary_crossentropy defined in tensorflow_backend.py.

I ran a dummy data and model to test it. Here are my findings:

  • The custom loss function outputs the same results as keras’s one
  • Using the custom loss in a keras model gives different accuracy results
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)

import tensorflow as tf
from keras import losses
import keras.backend as K
import keras.backend.tensorflow_backend as tfb
from keras.layers import Dense
from keras import Sequential

#Dummy check of loss output
def binary_crossentropy_custom(y_true, y_pred):
    return K.mean(binary_crossentropy_custom_tf(y_true, y_pred), axis=-1)

def binary_crossentropy_custom_tf(target, output, from_logits=False):
    """Binary crossentropy between an output tensor and a target tensor.

    # Arguments
        target: A tensor with the same shape as `output`.
        output: A tensor.
        from_logits: Whether `output` is expected to be a logits tensor.
            By default, we consider that `output`
            encodes a probability distribution.

    # Returns
        A tensor.
    """
    # Note: tf.nn.sigmoid_cross_entropy_with_logits
    # expects logits, Keras expects probabilities.
    if not from_logits:
        # transform back to logits
        _epsilon = tfb._to_tensor(tfb.epsilon(), output.dtype.base_dtype)
        output = tf.clip_by_value(output, _epsilon, 1 - _epsilon)
        output = tf.log(output / (1 - output))

    return tf.nn.sigmoid_cross_entropy_with_logits(labels=target,
                                                   logits=output)

logits = tf.constant([[-3., -2.11, -1.22],
                     [-0.33, 0.55, 1.44],
                     [2.33, 3.22, 4.11]])

labels = tf.constant([[1., 1., 1.], 
                      [1., 1., 0.], 
                      [0., 0., 0.]])

custom_sigmoid_cross_entropy_with_logits = binary_crossentropy_custom(labels, logits)
keras_binary_crossentropy = losses.binary_crossentropy(y_true=labels, y_pred=logits)

with tf.Session() as sess:
    print('CUSTOM sigmoid_cross_entropy_with_logits: ', sess.run(custom_sigmoid_cross_entropy_with_logits), '\n')
    print('KERAS keras_binary_crossentropy: ', sess.run(keras_binary_crossentropy), '\n')

#CUSTOM sigmoid_cross_entropy_with_logits:  [16.118095 10.886106 15.942386] 

#KERAS keras_binary_crossentropy:  [16.118095 10.886106 15.942386] 

#Dummy check of model accuracy

X_train = tf.random.uniform((3, 5), minval=0, maxval=1, dtype=tf.dtypes.float32)
labels = tf.constant([[1., 0., 0.], 
                      [0., 0., 1.], 
                      [1., 0., 0.]])

model = Sequential()
#First Hidden Layer
model.add(Dense(5, activation='relu', kernel_initializer='random_normal', input_dim=5))
#Output Layer
model.add(Dense(3, activation='sigmoid', kernel_initializer='random_normal'))

#I ran model.fit for each model.compile below 10 times using the same X_train and provide the range of accuracy measurement
# model.compile(optimizer='adam', loss='binary_crossentropy', metrics =['accuracy']) #0.748 < acc < 0.779
# model.compile(optimizer='adam', loss=losses.binary_crossentropy, metrics =['accuracy']) #0.761 < acc < 0.778
model.compile(optimizer='adam', loss=binary_crossentropy_custom, metrics =['accuracy']) #0.617 < acc < 0.663

history = model.fit(X_train, labels, steps_per_epoch=100, epochs=1)

I'd expect the custom loss function to give similar model accuracy output but it does not. Any idea? Thanks!

1

1 Answers

2
votes

Keras automatically selects which accuracy implementation to use according to the loss, and this won't work if you use a custom loss. But in this case you can just explictly use the right accuracy, which is binary_accuracy:

model.compile(optimizer='adam', loss=binary_crossentropy_custom, metrics =['binary_accuracy'])