I am currently training a reinforcement learning agent using a simple Neural Network with 100 hidden elements to solve 2048 game. I am using DQN's reinforcement learning algorithm (i.e. Q-learning with replay memory), but with 2 layers Neural Network instead of Deep Neural Network.
However, I left it trained on my laptop overnight (~7 hours, ~1000 games played, > 100000 steps) and the score does not seem to increase. I suspect there might be 3 sources of errors in my code: bug, parameters tuned badly, or maybe I just don't wait long enough.
Is there any method to figure out what is wrong with the code? And what is the best practice to improve the training results?