
I am trying to construct a custom loss function in Keras, for use with censored data in survival analysis.

This loss function is essentially binary cross-entropy (i.e. multi-label classification); however, the number of terms in the summation needs to vary based on which labels are available in Y_true. See the examples below:

Example 1: All Labels Available For Y_True

Y_true = [0, 0, 0, 1, 1]

Y_pred = [0.1, 0.2, 0.2, 0.8, 0.7]

Loss = -1/5 (log(0.9) + log(0.8) + log(0.8) + log(0.8) + log(0.7)) = 0.226

Example 2: Only Two Labels Available For Y_True

Y_true = [0, 0, -, -, -]

Y_pred = [0.1, 0.2, 0.1, 0.9, 0.9]

Loss = -1/2 (log(0.9) + log(0.8)) = 0.164

Example 3: Only One Label Available For Y_True

Y_true = [0, -, -, -, -]

Y_pred = [0.1, 0.2, 0.1, 0.9, 0.9]

Loss = -1 (log(0.9)) = 0.105

Loss = -(1/K) * Σ_{k=1..K} [ Y_true_k * log(Y_pred_k) + (1 - Y_true_k) * log(1 - Y_pred_k) ]   (K = number of available labels)

In the case of example one, our loss would be calculated via the formula above with K = 5. In example two, our loss would be calculated with K = 2 (i.e. only evaluating against the first two terms that are available in the ground truth). The loss function needs to adjust based on Y_true availability.
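
To make the hand calculations concrete, here is a quick NumPy sanity check I put together (a rough sketch, using natural logs and NaN to mark an unavailable label):

import numpy as np

def masked_bce(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    keep = ~np.isnan(y_true)  # only evaluate labels that are available
    yt, yp = y_true[keep], y_pred[keep]
    return -np.mean(yt * np.log(yp) + (1 - yt) * np.log(1 - yp))

print(masked_bce([0, 0, 0, 1, 1], [0.1, 0.2, 0.2, 0.8, 0.7]))                      # ~0.226
print(masked_bce([0, 0, np.nan, np.nan, np.nan], [0.1, 0.2, 0.1, 0.9, 0.9]))       # ~0.164
print(masked_bce([0, np.nan, np.nan, np.nan, np.nan], [0.1, 0.2, 0.1, 0.9, 0.9]))  # ~0.105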

I have had an attempt at the custom Keras loss function (below); however, I am struggling with how to filter based on NaN indices within TensorFlow. Does anyone have any suggestions for coding the custom loss function described above?

import tensorflow as tf
from tensorflow.keras import backend as K

def nan_binary_cross_entropy(y_actual, y_predicted):
    # mark positions where either tensor is NaN
    stack = tf.stack((tf.math.is_nan(y_actual), tf.math.is_nan(y_predicted)), axis=1)
    is_nans = K.any(stack, axis=1)
    per_instance = tf.where(is_nans, tf.zeros_like(y_actual),
                            tf.square(tf.subtract(y_predicted, y_actual)))
    # FILTER HERE: how do I drop the NaN positions to get y_true_filt / y_pred_filt?
    return K.binary_crossentropy(y_true_filt, y_pred_filt)
Is this the same loss function? – gobrewers14
Unfortunately not; I am essentially trying to write the loss function such that the number of labels considered varies with the number of entries in the ground truth! Thanks though! – Mike Tauber
I see. You want to do binary cross-entropy where the number of available labels can vary from training instance to training instance (correct?). Can you post an example where you do the calculation by hand to show the desired output? – gobrewers14
Correct! Ok, let me try to add an example :) – Mike Tauber
Ok, added another example as well :) – Mike Tauber

1 Answer


You can use a combination of tf.math.is_nan and tf.math.multiply_no_nan to mask out the NaN entries of y_true and get the desired result.

import numpy as np
import tensorflow as tf


y_true = tf.constant([
    [0.0, 0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, np.nan, np.nan, np.nan],
    [0.0, np.nan, np.nan, np.nan, np.nan],
])


y_pred = tf.constant([
    [0.1, 0.2, 0.2, 0.8, 0.7],
    [0.1, 0.2, 0.1, 0.9, 0.9],
    [0.1, 0.2, 0.1, 0.9, 0.9],
])


def survival_loss_fn(y_true, y_pred):
    # create a mask for NaN elements
    mask = tf.cast(~tf.math.is_nan(y_true), tf.float32)
    # sum along the row axis of the mask to find the `N`
    # for each training instance
    Ns = tf.math.reduce_sum(mask, 1)
    # use `multiply_no_nan` to zero out the terms where `y_true` is NaN
    fst = tf.math.multiply_no_nan(y_true, mask) * tf.math.log(y_pred)
    snd = tf.math.multiply_no_nan(1.0 - y_true, mask) * tf.math.log(1.0 - y_pred)
    return -tf.math.reduce_sum(fst + snd, 1) / Ns


survival_loss_fn(y_true, y_pred)
# <tf.Tensor: shape=(3,), dtype=float32, numpy=array([0.22629324, 0.16425204, 0.10536055], dtype=float32)>
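
If it helps, a minimal (untested) sketch of wiring this in as a training loss, assuming `model` is a Keras model that outputs one probability per label:

# hypothetical usage; `model` is whatever Keras model you are training
model.compile(optimizer="adam", loss=survival_loss_fn)

One edge case to be aware of: if every label in a row is NaN, `Ns` is 0 and the division produces NaN/inf. Swapping the last line of the loss for `tf.math.divide_no_nan(-tf.math.reduce_sum(fst + snd, 1), Ns)` would return 0 for such rows instead.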