I am new to tensorflow and Machine Learning. Recently I am working on a model. My model is like below,
Character level Embedding Vector -> Embedding lookup -> LSTM1
Word level Embedding Vector->Embedding lookup -> LSTM2
[LSTM1+LSTM2] -> single layer MLP-> softmax layer
[LSTM1+LSTM2] -> Single layer MLP-> WGAN discriminator
Code of he rnn model
while I'm working on this model I got the following error. I thought My batch is too big. Thus I tried to reduce the batch size from 20 to 10 but it doesn't work.
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[24760,100] [[Node: chars/bidirectional_rnn/bw/bw/while/bw/lstm_cell/split = Split[T=DT_FLOAT, num_split=4, _device="/job:localhost/replica:0/task:0/device:GPU:0"](gradients_2/Add_3/y, chars/bidirectional_rnn/bw/bw/while/bw/lstm_cell/BiasAdd)]] [[Node: bi-lstm/bidirectional_rnn/bw/bw/stack/_167 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_636_bi-lstm/bidirectional_rnn/bw/bw/stack", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"]]
tensor with shape[24760,100] means 2476000*32/8*1024*1024 = 9.44519043 MB memory. I am running the code on a titan X(11 GB) GPU. What could go wrong? Why this type of error occurred?
* Extra info *: the size of the LSTM1 is 100. for bidirectional LSTM it becomes 200. The size of the LSTM2 is 300. For Bidirectional LSTM it becomes 600.
*Note *: The error occurred after 32 epoch. My question is why after 32 epoch there is an error. Why not at the initial epoch.