
My model has 100,000 training samples (images). How do I modify my code below to train it in batches? With model.fit_generator I have to specify the batching inside the generator function:

from numpy import array

def data_generator(descriptions, features, n_step, max_sequence):
    # loop forever so the generator never runs out during training
    while 1:
        # step over the photo identifiers in the dataset, n_step at a time
        for i in range(0, len(descriptions), n_step):
            Ximages, XSeq, y = list(), list(), list()
            for j in range(i, min(len(descriptions), i + n_step)):
                image = features[j]
                # retrieve the text input
                desc = descriptions[j]
                # generate input-output pairs
                in_img, in_seq, out_word = preprocess_data([desc], [image], max_sequence)
                for k in range(len(in_img)):
                    Ximages.append(in_img[k])
                    XSeq.append(in_seq[k])
                    y.append(out_word[k])
            # yield this batch of samples to the model as (inputs, targets)
            yield [array(Ximages), array(XSeq)], array(y)

My model.fit_generator code:

model.fit_generator(data_generator(texts, train_features, 1, 150), 
                    steps_per_epoch=1500, epochs=50, callbacks=callbacks_list, verbose=1)

Any assistance would be great. I'm training on a cloud 16 GB Tesla V100.

Edit: My image caption model creates a training sample for each token in the DSL (250 tokens). With a dataset of 50 images (equivalent to 12,500 training samples) and a batch size of 1, I get an OOM. With about 32 images (equivalent to 8,000 samples) and a batch size of 1, it trains just fine. My question is: can I optimize my code further, or is my only option to use multiple GPUs?

Fix:

steps_per_epoch must be equal to ceil(num_samples / batch_size), so with a dataset of 1500 samples and a batch size of 1, steps_per_epoch should be 1500. I also reduced my LSTM sliding window from 48 to 24.

steps_per_epoch: Integer. Total number of steps (batches of samples) to yield from generator before declaring one epoch finished and starting the next epoch. It should typically be equal to ceil(num_samples / batch_size). Optional for Sequence: if unspecified, will use the len(generator) as a number of steps.
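
A minimal sketch of that calculation, using the example figures from this question (num_samples and batch_size are placeholders to replace with your own values):

    import math

    num_samples = 1500   # total training samples in the dataset (example value)
    batch_size = 1       # samples yielded per generator step (example value)

    steps_per_epoch = math.ceil(num_samples / batch_size)
    print(steps_per_epoch)  # 1500 for these example values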

It's already in batches. Each yield is a batch. – Daniel Möller
How do I control the batch size? – Paul Gwamanda
If the answer below is not enough, you should explain your question properly. It answers exactly what you're asking. – Daniel Möller
So are you saying that the code cannot be optimized, and the only solution is to train on multiple V100 GPUs? – Paul Gwamanda
What is your question? You asked how to change the batch size. – Daniel Möller

2 Answers


The generator already returns batches.

Every yield is a batch. It's entirely up to you to design the generator to build the batches the way you want.

In your code, the batch size is n_step.
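
For illustration, a hedged sketch of how the original call could be adjusted: n_step controls how many descriptions go into each yielded batch (each description expands into many token-level samples), so steps_per_epoch should count batches of descriptions, not individual samples. The value 4 for n_step is only a placeholder to tune against available memory; texts, train_features, and callbacks_list are the names from the question.

    import math

    n_step = 4  # descriptions per yielded batch; tune to the largest value that fits in memory
    steps_per_epoch = math.ceil(len(texts) / n_step)  # batches needed to cover every description once

    model.fit_generator(
        data_generator(texts, train_features, n_step, 150),
        steps_per_epoch=steps_per_epoch,
        epochs=50,
        callbacks=callbacks_list,
        verbose=1,
    )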


Here's a cleaner way of using generators: write a generator that yields individual samples, create a tf.data.Dataset from it, and use the batch method on that object. Then tune the batch size to find the largest value that doesn't cause an OOM.

import tensorflow as tf

def data_generator(descriptions, features, max_sequence):
    # yields one (inputs, target) pair at a time; batching is handled by tf.data
    def _gen():
        for img, seq, word in zip(*preprocess_data(descriptions, features, max_sequence)):
            yield {'image': img, 'seq': seq}, word
    return _gen


ds = tf.data.Dataset.from_generator(
    data_generator(descriptions, features, max_sequence),
    output_types=({'image': tf.float32, 'seq': tf.float32}, tf.int32),
    output_shapes=({
            'image': tf.TensorShape([blah, blah]),  # fill in the actual image feature shape
            'seq': tf.TensorShape([blah, blah]),    # fill in the actual sequence shape
        },
        tf.TensorShape([blah])                      # fill in the actual target shape
    )
)

ds = ds.batch(n_step)
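
A possible way to consume the batched dataset, sketched as an assumption rather than part of the answer: the prefetch call is an optional addition, and it relies on the model's two inputs being named 'image' and 'seq' to match the dictionary keys above.

    # optional: overlap data preparation with training on the GPU
    ds = ds.prefetch(tf.data.experimental.AUTOTUNE)

    # model.fit can consume a tf.data.Dataset directly and will run each epoch
    # until the dataset is exhausted, so no separate generator call is needed
    model.fit(ds, epochs=50, callbacks=callbacks_list, verbose=1)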