
I'd like to train a neural network in Python with Keras, using a custom metric-learning loss function. The loss minimizes the distances between the outputs of similar inputs and maximizes the distances between the outputs of dissimilar ones. The part that handles similar inputs is:

from keras import backend as K

# function to create a pairwise similarity matrix, i.e.
# L[i,j] == 1 for similar samples i, j and 0 otherwise
def build_indicator_matrix(y_, thr=0.1):
    # y_: contains the labels of the samples,
    # samples are similar in case of same label

    # prevent checking equality of floats --> check if absolute
    # differences are below threshold
    lbls_diff = K.expand_dims(y_, axis=0) - K.expand_dims(y_, axis=1)
    lbls_thr = K.less(K.abs(lbls_diff), thr)
    # cast bool tensor back to float32
    L = K.cast(lbls_thr, 'float32')

    # POSSIBLE WORKAROUND
    #L = K.sum(L, axis=2)

    return L

# function to compute the (squared) Euclidean distances between all pairs
# of samples, store in DIST[i,j] the distance between output y_pred[i,:] and y_pred[j,:]
def compute_pairwise_distances(y_pred):
    DIFF = K.expand_dims(y_pred, axis=0) - K.expand_dims(y_pred, axis=1)
    DIST = K.sum(K.square(DIFF), axis=-1)    
    return DIST

# function to compute the average distance between all similar samples
def my_loss(y_true, y_pred):
    # y_true: contains true labels of the samples
    # y_pred: contains network outputs

    L = build_indicator_matrix(y_true)    
    DIST = compute_pairwise_distances(y_pred)
    return K.mean(DIST * L, axis=1)

For training, I pass a numpy array y of shape (n,) as the target variable to my_loss. However, I found (by inspecting the computational graph in TensorBoard) that the TensorFlow backend turns y into a 2D variable (displayed shape ? x ?), and hence L in build_indicator_matrix is not 2- but 3-dimensional (shape ? x ? x ? in TensorBoard). This causes net.evaluate() and net.fit() to compute wrong results.

Why does TensorFlow create a 2D rather than a 1D array, and how does this affect net.evaluate() and net.fit()?

As quick workarounds, I found that either replacing build_indicator_matrix() with static numpy code for computing L, or collapsing the "fake" dimension with the line L = K.sum(L, axis=2), solves the problem. In the latter case, however, the output of K.eval(build_indicator_matrix(y)) is only of shape (n,) and not (n,n), so I do not understand why this workaround still yields correct results. Why does TensorFlow introduce an additional dimension?
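For reference, here is a minimal snippet (run outside of fit(), with arbitrary label values) that reproduces the extra dimension once the labels tensor is 2D, as displayed in TensorBoard:

import numpy as np
from keras import backend as K

y = np.arange(5, dtype='float32')

# feeding the labels as a 1D (n,) tensor yields the intended (n, n) matrix
L_1d = K.eval(build_indicator_matrix(K.constant(y)))
print(L_1d.shape)   # (5, 5)

# feeding them as a 2D (n, 1) tensor -- the shape shown in TensorBoard --
# makes the broadcasting produce a 3D result
L_2d = K.eval(build_indicator_matrix(K.constant(y.reshape(-1, 1))))
print(L_2d.shape)   # (5, 5, 1)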

My library versions are:

  • keras: 2.2.4
  • tensorflow: 1.8.0
  • numpy: 1.15.0

1 Answer


This is because evaluate and fit work in batches. The first dimension you see in TensorBoard is the batch dimension, which is unknown in advance and therefore denoted ?. In addition, Keras standardizes a 1D target array of shape (n,) to 2D, so the y_true your loss receives has shape (batch_size, 1); the broadcasting in build_indicator_matrix then yields a (batch_size, batch_size, 1) tensor. That is also why collapsing the trailing size-1 dimension with K.sum(L, axis=2) gives correct results. When writing custom losses or metrics, remember that the tensors you get (y_true and y_pred) are the ones corresponding to the current batch.
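If your labels are per-sample scalars, a minimal sketch of a fix (reusing the helper functions from your question) is to flatten y_true at the top of the loss, so the pairwise broadcasting again operates on a 1D tensor:

from keras import backend as K

def my_loss(y_true, y_pred):
    # Keras delivers y_true as (batch_size, 1); flatten it back to
    # (batch_size,) so build_indicator_matrix returns a 2D matrix
    y_flat = K.flatten(y_true)
    L = build_indicator_matrix(y_flat)
    DIST = compute_pairwise_distances(y_pred)
    return K.mean(DIST * L, axis=1)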

For more specific help, show us how you call both of those functions.