2
votes

To classify images, we are using a neural network with a few convolutional layers followed by a few fully-connected layers.

The metadata contains some numerical information that could help classify the images. Is there an easy way to feed the numerical metadata into the first fully-connected layer, together with the output of the convolutions? Is it possible to implement this using TensorFlow, or even better, Keras?

2
You can use the following idea: after having passed through the CNN, your image is transformed into a flat list of numbers that's ready to be fed into the ANN. At this point, you can append to this flat list any metadata you want (as long as the metadata is a list of numbers too) and feed this longer list into the ANN. - ForceBru
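As a rough illustration of the idea in this comment, here is a NumPy-only sketch of the "flatten, then append" step. The shapes (a batch of 4 images, 7x7 feature maps with 8 channels, 3 metadata values per image) are made up for the example and do not come from the question.

```python
import numpy as np

# Hypothetical shapes: a batch of 4 images whose last conv layer produced
# 7x7 feature maps with 8 channels, plus 3 metadata values per image.
conv_output = np.random.rand(4, 7, 7, 8)
metadata = np.random.rand(4, 3)

# Flatten each image's feature maps into a single row.
flat = conv_output.reshape(conv_output.shape[0], -1)   # shape (4, 392)

# Append the metadata to each row; this longer vector is what the
# fully-connected part of the network would receive.
combined = np.concatenate([flat, metadata], axis=1)    # shape (4, 395)
```

In a real network the same two operations would be done by layers inside the model, so that gradients flow through both parts.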

2 Answers

2
votes

You can process the numerical data in a separate branch, merge its output with the output of the CNN branch, and then pass the merged tensor through a few final dense layers. Here is a general sketch of the solution:

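from tensorflow.keras.layers import Input, Dense, Flatten, concatenate
from tensorflow.keras.models import Model

# process image data using conv layers
inp_img = Input(shape=...)
# ... conv / pooling layers ...
# flatten (or globally pool) the conv output so it can be concatenated
out_conv = Flatten()(...)

# process numerical data
inp_num = Input(shape=...)
x = Dense(...)(inp_num)
out_num = Dense(...)(x)

# merge the result with a merge layer such as concatenation
merged = concatenate([out_conv, out_num])
# the rest of the network ...

out = Dense(num_classes, activation='softmax')(...)

# create the model with two inputs and one output
model = Model([inp_img, inp_num], out)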

To build a model like this you need to use the Keras Functional API, so I highly recommend reading the official guide on it.
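For reference, here is one way the two-branch idea could look once the ellipses are filled in. It is only a sketch: the image size (64x64x3), the number of metadata features (5), the layer widths, and `num_classes = 10` are all assumed values chosen for illustration, not something given in the question.

```python
from tensorflow import keras
from tensorflow.keras import layers

num_classes = 10  # assumed number of classes

# Image branch: a couple of conv/pool layers, flattened at the end.
inp_img = keras.Input(shape=(64, 64, 3), name="image")  # assumed image size
x = layers.Conv2D(16, 3, activation="relu")(inp_img)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
out_conv = layers.Flatten()(x)

# Metadata branch: two small dense layers.
inp_num = keras.Input(shape=(5,), name="metadata")  # assumed 5 numerical features
y = layers.Dense(16, activation="relu")(inp_num)
out_num = layers.Dense(8, activation="relu")(y)

# Merge both branches and finish with dense layers.
merged = layers.concatenate([out_conv, out_num])
z = layers.Dense(64, activation="relu")(merged)
out = layers.Dense(num_classes, activation="softmax")(z)

model = keras.Model([inp_img, inp_num], out)
```

Whether the metadata needs its own dense layers before the merge depends on the data; with only a handful of features, concatenating `inp_num` directly may work just as well.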

1
votes

Is there an easy way to input the numerical metadata into the first fully-connected layer, together with the output of the convolutions?

Yes, it is possible. You need two inputs: one for the images and one for the numerical metadata.

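from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense, Concatenate
from tensorflow.keras.models import Model

inp1 = Input(shape=(28, 28, 1))  # image
inp2 = Input(shape=(30,))  # numerical metadata (assume 30 numerical features)

# image branch; kernel_size=3 is an assumed value, since Conv2D requires one
conv2d = Conv2D(100, kernel_size=3, strides=1, padding='same')(inp1)
feature_image = Flatten()(conv2d)
# metadata branch; Dense is used here because Embedding expects integer
# indices (categorical ids), not real-valued features; 32 units is an assumed size
feature_metadata = Dense(32, activation='relu')(inp2)

# ... rest of the network
prev_layer = Concatenate(axis=-1)([feature_image, feature_metadata])
prediction = Dense(100)(prev_layer)

model = Model(inputs=[inp1, inp2], outputs=prediction)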

See a complete example in Keras here.
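A model with two inputs is also fed two arrays at training and prediction time, in the same order as the `inputs` list. The sketch below uses the shapes from this answer (28x28x1 images, 30 metadata values) and concatenates the metadata straight onto the flattened conv output, as the question asks. The data is random stand-in data, and the layer sizes and the 10-class softmax output are assumptions made for the example.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Synthetic stand-in data: 8 grayscale images with 30 metadata values each.
rng = np.random.default_rng(0)
images = rng.random((8, 28, 28, 1)).astype("float32")
metadata = rng.random((8, 30)).astype("float32")
labels = rng.integers(0, 10, size=8)  # assumed 10 classes

inp1 = keras.Input(shape=(28, 28, 1))  # image
inp2 = keras.Input(shape=(30,))        # numerical metadata

conv = layers.Conv2D(8, 3, padding="same", activation="relu")(inp1)
feature_image = layers.Flatten()(conv)

# The metadata joins the conv output right before the first dense layer.
merged = layers.Concatenate(axis=-1)([feature_image, inp2])
prediction = layers.Dense(10, activation="softmax")(merged)

model = keras.Model(inputs=[inp1, inp2], outputs=prediction)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Pass the inputs as a list, matching the order given to Model(inputs=...).
model.fit([images, metadata], labels, epochs=1, batch_size=4, verbose=0)
probs = model.predict([images, metadata], verbose=0)
```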