I'm trying to train a simple MLP on my own dataset in Python with Keras. The dataset consists of normalized images of size 1024 x 1024; I need this resolution, so I can't downscale the images. I use a Tesla V100 with 16 GB for the training.
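For the MLP, each image first has to be flattened into one vector. Roughly like this (a minimal sketch; the random array is only a stand-in for however the real data is loaded):

import numpy as np

# stand-in for the real dataset: N normalized float32 images of 1024 x 1024
images = np.random.rand(8, 1024, 1024).astype(np.float32)
x_train = images.reshape(len(images), 1024 * 1024)  # one row per image -> shape (8, 1048576)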
My aim is first of all to get something working before I tune the model (switch to a CNN, etc.), but currently it doesn't, failing with:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1048576,4096] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
This error occurs at the first layer, so before the training really begins. The reported shape matches the weight matrix of the first Dense layer (1024*1024 inputs x 4096 units).
I already trained an MLP in Julia with Flux without memory problems.
Everything I have tried:
- reduced the batch size
- used multiple GPUs (keras.utils.multi_gpu_model); the issue occurs before the additional GPUs are even in use
- reduced the number of neurons in the first layer (to shrink the weight matrix) from 1024*1024 to 4096
- set allow_growth and also tried per_process_gpu_memory_fraction (see the sketch after this list)
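The memory options were set roughly like this (a minimal sketch, assuming the TensorFlow 1.x backend; the exact session setup in my script may differ slightly):

import tensorflow as tf
from keras import backend as K

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory on demand instead of all at once
# alternatively, cap the fraction of GPU memory TensorFlow may allocate:
# config.gpu_options.per_process_gpu_memory_fraction = 0.9
K.set_session(tf.Session(config=config))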
MLP in Julia (Flux):
m = Chain(
    Dense(1024*1024, 1024, relu),
    Dense(1024, 256, relu),
    Dense(256, 2),
    softmax) |> gpu
MLP in Python (Keras):
from keras.models import Sequential
from keras.layers import Dense

num_classes = 2  # two classes, as in the Julia model

model = Sequential()
model.add(Dense(4*1024, input_shape=(1024*1024,)))  # first layer already reduced to 4096 units (see above)
model.add(Dense(1024, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
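For completeness, the model is compiled and trained roughly like this (a minimal sketch; loss, optimizer, batch size, and epoch count are placeholders, x_train holds the flattened images from above, and y_train is assumed to be one-hot encoded, e.g. via keras.utils.to_categorical):

model.compile(loss='categorical_crossentropy',  # assumes one-hot labels
              optimizer='adam',
              metrics=['accuracy'])

# the batch size here is already the reduced one (see the list of things I tried)
model.fit(x_train, y_train, batch_size=1, epochs=10)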