
Deep learning has been applied successfully to several large data sets for classification over a handful of classes (cats, dogs, cars, planes, etc.), with performance beating simpler descriptors like Bags of Features over SIFT, color histograms, etc.

Nevertheless, training such a network requires a lot of data per class and a lot of training time. Very often, however, one doesn't have enough data, or just wants to get an idea of how well a convolutional neural network might do before spending time on designing and training such a network and gathering the training data.

In this particular case, it might be ideal to take a network that was configured and trained on one of the benchmark data sets used in state-of-the-art publications, and simply apply it as a feature extractor to whatever data set you have.

This results in a set of features for each image, which one could feed to a classical classification method like SVMs, logistic regression, neural networks, etc.
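To make the second half of that pipeline concrete, here is a minimal sketch of training a classical classifier on per-image feature vectors. Everything in it is an assumption for illustration: the features are synthetic random vectors standing in for real CNN activations, the 16-dimensional size is arbitrary, and `train_logistic_regression` and `predict` are hypothetical helpers written in plain NumPy, not part of any library.

```python
import numpy as np

def train_logistic_regression(features, labels, lr=0.1, epochs=200):
    """Fit a binary logistic regression by batch gradient descent."""
    n_samples, n_dims = features.shape
    weights = np.zeros(n_dims)
    bias = 0.0
    for _ in range(epochs):
        logits = features @ weights + bias
        probs = 1.0 / (1.0 + np.exp(-logits))
        error = probs - labels  # gradient of the log loss w.r.t. the logits
        weights -= lr * (features.T @ error) / n_samples
        bias -= lr * error.mean()
    return weights, bias

def predict(features, weights, bias):
    """Return 0/1 class predictions for each row of `features`."""
    return (features @ weights + bias > 0).astype(int)

# Synthetic stand-in for CNN features (ASSUMPTION: real features would come
# from a pre-trained network; these are just two Gaussian blobs).
rng = np.random.default_rng(0)
n_per_class, n_dims = 50, 16
class0 = rng.normal(loc=-1.0, scale=1.0, size=(n_per_class, n_dims))
class1 = rng.normal(loc=+1.0, scale=1.0, size=(n_per_class, n_dims))
features = np.vstack([class0, class1])
labels = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])

weights, bias = train_logistic_regression(features, labels)
accuracy = (predict(features, weights, bias) == labels).mean()
print("training accuracy:", accuracy)
```

With real data, `features` would be replaced by one row of extracted activations per image, and the accuracy should of course be measured on held-out images rather than on the training set.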

In particular, when one does not have enough data to train the CNN, I would expect this to outperform a pipeline where the CNN was trained on few samples.

I was looking at the tensorflow tutorials, but they always seem to have a clear training / testing phase. I couldn't find a pickle file (or similar) with a pre-configured CNN feature extractor.

My questions are: do such pre-trained networks exist, and where can I find them? Alternatively: does this approach make sense? Where could I find a CNN + weights?

EDIT: W.r.t. @john's comment, I tried using 'DecodeJpeg:0' and 'DecodeJpeg/contents:0' and checked the outputs, which are different (:S)

import cv2, requests, numpy
import tensorflow.python.platform
import tensorflow as tf

response = requests.get('https://i.stack.imgur.com/LIW6C.jpg?s=328&g=1')
data = numpy.asarray(bytearray(response.content), dtype=numpy.uint8)
image = cv2.imdecode(data,-1)

compression_worked, jpeg_data = cv2.imencode('.jpeg', image)
if not compression_worked:
    raise Exception("Failure when compressing image to jpeg format in opencv library")
jpeg_data = jpeg_data.tobytes()

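# Load the pre-trained Inception-v3 graph definition from disk
with open('./deep_learning_models/inception-v3/classify_image_graph_def.pb', 'rb') as graph_file:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(graph_file.read())
    tf.import_graph_def(graph_def, name='')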

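with tf.Session() as sess:
    # 'pool_3:0' is the layer whose activations are used as the feature vector
    pool3_tensor = sess.graph.get_tensor_by_name('pool_3:0')

    # Feed the decoded image as a numpy array
    arr0 = numpy.squeeze(sess.run(
        pool3_tensor,
        {'DecodeJpeg:0': image}))

    # Feed the jpeg-encoded byte string
    arr1 = numpy.squeeze(sess.run(
        pool3_tensor,
        {'DecodeJpeg/contents:0': jpeg_data}))

    print(numpy.abs(arr0 - arr1).max())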

So the max absolute difference is 1.27649, and in general all the elements differ (which is a lot, since the average value of arr0 and arr1 themselves lies between 0 and 0.5).

I would also expect that 'DecodeJpeg:0' needs a jpeg string, not a numpy array; why else would the name contain 'Jpeg'? @john: Could you state how sure you are about your comment?

So I guess I'm not sure what is what, as I would expect a trained neural network to be deterministic (chaotic at most, but still deterministic).
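One thing that might contribute to the mismatch, offered as an assumption rather than a verified diagnosis for this particular graph: `cv2.imdecode` returns pixels in BGR channel order, whereas TensorFlow's JPEG decoder produces RGB, so an OpenCV array fed directly as a decoded image would have red and blue swapped. On top of that, re-encoding with `cv2.imencode` is lossy, so the two inputs would not be pixel-identical anyway. The sketch below only shows the channel reversal on a tiny made-up array; `bgr_to_rgb` is a hypothetical helper name.

```python
import numpy as np

def bgr_to_rgb(image):
    """Reverse the channel axis of an H x W x 3 array (BGR <-> RGB)."""
    return image[:, :, ::-1]

# Tiny made-up 1x2 "image" in OpenCV's BGR order:
# the first pixel is pure blue, the second is pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)

print(rgb[0, 0])                 # the blue pixel, now written in RGB order
print(np.array_equal(bgr, rgb))  # whether the two layouts hold identical values
```

If this is indeed part of the cause, feeding `bgr_to_rgb(image)` instead of `image` should bring the two outputs closer, though probably not to an exact match because of the lossy re-encoding.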

Rubber duck: when googling "CNN trained on ImageNet", I found this: vlfeat.org/matconvnet/pretrained - Herbert
I could compile and run this network on my laptop, and use the webcam to identify/classify images: github.com/sermanet/OverFeat - Alex Punnen

1 Answer


The TensorFlow team recently released a deep CNN trained on the ImageNet dataset. You can download the script that fetches the data (including the model graph and the trained weights) from here. The associated Image Recognition tutorial has more details about the model.

While the current model isn't specifically packaged to be used in a subsequent training step, you could explore modifying the script to reuse parts of the model and the trained weights in your own network.