I am trying to construct a custom loss function in Keras - this is for use with censored data in Survival analysis.
This loss function is essentially binary cross-entropy (i.e. multi-label classification), but the summation term within the loss needs to vary based on which labels are available in Y_true. See the examples below:
Example 1: All Labels Available For Y_True
Y_true = [0, 0, 0, 1, 1]
Y_pred = [0.1, 0.2, 0.2, 0.8, 0.7]
Loss = -1/5(log(0.9) + log(0.8) + log(0.8) + log(0.8) + log(0.7)) = 0.22
Example 2: Only Two Labels Available For Y_True
Y_true = [0, 0, -, -, -]
Y_pred = [0.1, 0.2, 0.1, 0.9, 0.9]
Loss = -1/2 (log(0.9) + log(0.8)) = 0.164
Example 3: Only One Label Available For Y_True
Y_true = [0, -, -, -, -]
Y_pred = [0.1, 0.2, 0.1, 0.9, 0.9]
Loss = -1 (log(0.9)) = 0.105
In Example 1, the loss is averaged over all K = 5 terms. In Example 2, the loss is averaged with K = 2, i.e. evaluated only against the first two terms, which are the only ones available in the ground truth. The loss function needs to adjust K based on the availability of labels in Y_true.
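For concreteness, the masking behaviour I'm after can be written in plain NumPy (the function name `masked_bce` is just mine, and missing labels are encoded as NaN):

```python
import numpy as np

def masked_bce(y_true, y_pred):
    """Binary cross-entropy averaged only over labelled (non-NaN) positions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = ~np.isnan(y_true)              # keep only positions with a label
    t, p = y_true[mask], y_pred[mask]
    # mean is taken over K = mask.sum() terms, so K shrinks with the labels
    return -np.mean(t * np.log(p) + (1 - t) * np.log(1 - p))

# Example 2: only the first two labels are available
print(masked_bce([0, 0, np.nan, np.nan, np.nan],
                 [0.1, 0.2, 0.1, 0.9, 0.9]))   # ~0.164
```

This reproduces the three hand-computed values above (0.226, 0.164, 0.105 to three decimals).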
I have attempted a custom Keras loss function below, but I am struggling with how to filter out the NaN indices within TensorFlow. Does anyone have any suggestions for coding the custom loss function described above?
def nan_binary_cross_entropy(y_actual, y_predicted):
    stack = tf.stack((tf.math.is_nan(y_actual), tf.math.is_nan(y_predicted)), axis=1)
    is_nans = K.any(stack, axis=1)
    per_instance = tf.where(is_nans, tf.zeros_like(y_actual),
                            tf.square(tf.subtract(y_predicted, y_actual)))
    # FILTER HERE
    return K.mean(K.binary_crossentropy(y_filt, y_filt), axis=-1)
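For reference, one way I imagine the filtering could work is with `tf.boolean_mask` (this is only a sketch, assuming TF 2.x and NaN-encoded missing labels; note that `tf.boolean_mask` flattens the batch, so the mean is over all labelled entries rather than per sample):

```python
import tensorflow as tf
from tensorflow import keras
K = keras.backend

def nan_binary_cross_entropy_masked(y_actual, y_predicted):
    # Keep only the positions where the ground truth label exists
    mask = tf.logical_not(tf.math.is_nan(y_actual))
    y_t = tf.boolean_mask(y_actual, mask)
    y_p = tf.boolean_mask(y_predicted, mask)
    # Average over the K remaining terms only
    return K.mean(K.binary_crossentropy(y_t, y_p))
```

I am unsure whether this plays well with graph mode and batched shapes, which is part of what I'm asking.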