Problems in reinforcement learning: bugs, parameter tuning, and training period

Question

I am currently training a reinforcement learning agent to play the game 2048, using a simple neural network with 100 hidden units. I am using the DQN algorithm (i.e. Q-learning with replay memory), but with a 2-layer neural network instead of a deep neural network.
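
For readers unfamiliar with this setup, the two pieces named above (a replay memory and the Q-learning target) can be sketched roughly as follows. This is only an illustrative outline: the names `ReplayMemory` and `q_learning_targets`, the transition layout, and the discount `gamma=0.99` are assumptions, not taken from the code in question.

```python
import random
from collections import deque

import numpy as np


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A bounded deque drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks up correlations
        # between consecutive steps of the same game.
        batch = random.sample(list(self.buffer), batch_size)
        states, actions, rewards, next_states, dones = zip(*batch)
        return (
            np.array(states),
            np.array(actions),
            np.array(rewards, dtype=np.float64),
            np.array(next_states),
            np.array(dones, dtype=bool),
        )

    def __len__(self):
        return len(self.buffer)


def q_learning_targets(rewards, next_q_values, dones, gamma=0.99):
    """Compute r + gamma * max_a' Q(s', a'), with no bootstrapping on terminal steps."""
    max_next_q = next_q_values.max(axis=1)
    return rewards + gamma * max_next_q * (~dones)
```

The network is then regressed towards these targets for the actions that were actually taken in each sampled transition.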

However, I left it training on my laptop overnight (~7 hours, ~1000 games played, > 100000 steps) and the score does not seem to increase. I suspect three possible causes: a bug in my code, badly tuned parameters, or simply not having trained for long enough.

Is there any method to figure out what is wrong with the code? And what is the best practice to improve the training results?

malreddysid · Accepted Answer · 2016-07-12T13:09:56

I'll address all three of your hypotheses.

If you are using a standard DL framework like Caffe or TensorFlow, the chance of it being a bug is small.
Try decreasing the learning rate. You may have set it too high for the network to converge.
A training time of 100000 steps is not that long. Even for a simple game like Pong, you need to train for around 500000 steps to get good performance, so try training for longer.
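
To see why a learning rate that is too high prevents convergence, here is a toy sketch using plain gradient descent on a one-dimensional quadratic rather than a real network. The function `gradient_descent` and the values `curvature=10.0` and `steps=50` are made up for illustration. For this particular function, the updates only shrink the weight when the learning rate is below 2 / curvature = 0.2.

```python
def gradient_descent(learning_rate, curvature=10.0, w0=1.0, steps=50):
    """Minimise f(w) = 0.5 * curvature * w**2 and return the final |w|."""
    w = w0
    for _ in range(steps):
        grad = curvature * w           # derivative of f at the current w
        w = w - learning_rate * grad   # each step multiplies w by (1 - learning_rate * curvature)
    return abs(w)


# 0.25 is above the stability limit of 0.2, so |w| grows instead of shrinking.
# 0.15 and 0.01 are both below it: the first converges quickly, the second slowly.
for lr in (0.25, 0.15, 0.01):
    print(f"lr={lr}: |w| after 50 steps = {gradient_descent(lr):.3g}")
```

A real Q-network has no single stability limit like this, but the same pattern applies: if the loss or the Q-values blow up or oscillate, try a learning rate several times smaller before concluding that the code is broken.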

Also, 2048 is a fairly complicated game, so maybe your network is not deep enough to learn how to play it. Two layers is not much for such a complicated game. Try increasing the number of hidden layers. Perhaps you can use the network provided here
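
As a rough sketch of what "increasing the number of hidden layers" could look like, here is a small fully connected Q-network in plain NumPy whose depth is set by a list of layer sizes. The layer widths, the log2 tile encoding, and the helper names `init_mlp` and `q_values` are assumptions for illustration; this is not the network linked above.

```python
import numpy as np


def init_mlp(layer_sizes, seed=0):
    """Create (weights, bias) pairs for a fully connected network."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # He-style scaling, a common choice for ReLU layers.
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params


def q_values(params, states):
    """Forward pass: ReLU on hidden layers, linear output with one Q-value per move."""
    x = states
    for w, b in params[:-1]:
        x = np.maximum(0.0, x @ w + b)
    w, b = params[-1]
    return x @ w + b


# 16 board cells in, 4 moves (up, down, left, right) out.
shallow = init_mlp([16, 100, 4])           # one hidden layer of 100 units, as in the question
deeper = init_mlp([16, 256, 256, 128, 4])  # hypothetical deeper alternative

board = np.zeros((1, 16))
board[0, :2] = [1.0, 2.0]  # assumed encoding: log2 of the tile value (tiles 2 and 4)
print(q_values(deeper, board).shape)  # (1, 4)
```

Changing depth this way leaves the rest of the training loop untouched, so the shallow and deeper variants can be compared on the same replay data.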
