
Let's say you have a folder containing your dataset (a lot of images). You want to feed these images to a deep neural network for training (I use TensorFlow for now).

The first solution that comes to mind (a very unclassy, beginner solution) is to store all the images in an array. This is OK for a small dataset, but when the dataset is big and the pictures are large it is not viable, because you will not have enough memory.

The solution is to read data in batches.

I'm trying to implement that. The dataset I'm interested in is Caltech's Caltech-UCSD Birds 200. The dataset is provided with a text file in which each line contains the path to one image, which makes things easier. My solution (which I'm trying to implement) is to define a class. The template is:

class Dataset : 
          attributes : 
          methods : 

As soon as I instantiate an object of this class, the paths to all images are stored in the variable images_paths and the ground-truth labels (one-hot encoded) are stored in labels. The method get_next_batch() uses current_batch_index to return an array holding the actual images, loaded from their paths. The size of the array is batch_size, and the entries read from images_paths and labels are those in the range (current_batch_index, current_batch_index + batch_size). (I read the images using scipy.misc.imread and resize them to a fixed shape (200x200) using scipy.misc.imresize.)

This way the object only keeps one batch in memory, and I use it in the training loop to feed that batch to the network.
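A minimal sketch of the class described above, to make the idea concrete. The attribute and method names (images_paths, labels, current_batch_index, batch_size, get_next_batch) come from the description; the loader argument, the load_image helper and the wrap-around at the end of the dataset are assumptions added for illustration. The helper uses Pillow instead of the scipy.misc functions, since those are no longer available in recent SciPy releases.

```python
import numpy as np


def load_image(path, size=(200, 200)):
    """Hypothetical default loader (assumes Pillow is installed).

    Returns the image as a float32 array of shape (200, 200, 3).
    """
    from PIL import Image  # imported lazily so any other loader can be used

    with Image.open(path) as img:
        return np.asarray(img.convert("RGB").resize(size), dtype=np.float32)


class Dataset:
    def __init__(self, images_paths, labels, batch_size, loader=load_image):
        self.images_paths = list(images_paths)  # only the paths stay in memory
        self.labels = np.asarray(labels)        # one-hot encoded labels
        self.batch_size = batch_size
        self.loader = loader
        self.current_batch_index = 0

    def get_next_batch(self):
        """Load and return the next (images, labels) batch."""
        start = self.current_batch_index
        end = min(start + self.batch_size, len(self.images_paths))
        # Images are read from disk only now, one batch at a time.
        images = np.stack([self.loader(p) for p in self.images_paths[start:end]])
        labels = self.labels[start:end]
        # Assumption: start over from the beginning once the data is exhausted.
        self.current_batch_index = end if end < len(self.images_paths) else 0
        return images, labels
```

In a training loop you would call get_next_batch() once per step and feed the two returned arrays to the network. Note that the last batch of an epoch may be smaller than batch_size.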

Questions: What do you think of this? How do you normally feed your images to the network? Are there tools for that? Are there tools to split your dataset?

F.Y.I.: I'm using Python and TensorFlow. It would be interesting to know the answers to these questions for C++ too.

Thank you, and sorry for the long post.


1 Answer


TensorFlow allows reading the data from disk in batches as needed, and has methods for buffering the data ahead of time to reduce latency (i.e. while batch 3 is running through the network, batch 4 is sitting in memory and batch 5 is being loaded into the memory where batch 2 used to be). Check out the tf.data library. The cifar10 example does something like what you are asking, but cifar10 is in a weird format, so some adjustment is in order.
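For what it's worth, here is a rough sketch of what a tf.data input pipeline for a list of image paths could look like. It assumes TensorFlow 2.4 or later, JPEG files, and that the paths and labels have already been read from the text file into two Python lists. The function name make_dataset and its parameters are made up for this example, not part of TensorFlow.

```python
import tensorflow as tf


def make_dataset(image_paths, labels, batch_size=32, image_size=(200, 200)):
    """Hypothetical helper: builds a batched, prefetching image pipeline."""

    def load_example(path, label):
        # Runs lazily, so a file is only read when its batch is needed.
        image = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
        image = tf.image.resize(image, image_size) / 255.0  # float32 in [0, 1]
        return image, label

    ds = tf.data.Dataset.from_tensor_slices((image_paths, labels))
    ds = ds.shuffle(buffer_size=len(image_paths))  # shuffles paths, not pixels
    ds = ds.map(load_example, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(batch_size)
    # Prefetching prepares upcoming batches while the current one is in use.
    return ds.prefetch(tf.data.AUTOTUNE)
```

The returned dataset can be iterated with a plain for loop (for images, batch_labels in ds: ...) or passed to a Keras model's fit method. Only the paths are held in memory up front; the pixels are loaded batch by batch.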

Anybody have a better example?