I am interested in the effective learning rate of Adam. Roughly speaking, Adam takes a constant initial learning rate and divides it by the square root of a running average of the squared past gradients of the loss (see here for details). The point of the question is that the effective learning rate has an adaptive part that acts on a constant initial learning rate.
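To make "effective learning rate" concrete, here is a minimal pure-Python sketch for a single scalar parameter. It assumes that the effective learning rate means the factor multiplying the bias-corrected first moment, i.e. `lr / (sqrt(v_hat) + eps)`. The function name `adam_effective_lr` and the default values are hypothetical and chosen only for illustration; this is not a TensorFlow API.

```python
import math

def adam_effective_lr(grads, lr=1e-3, beta_2=0.999, eps=1e-7):
    """Return the effective learning rate at each step for one scalar parameter.

    grads: gradient values observed at steps 1, 2, ...
    Assumption: effective rate = lr / (sqrt(v_hat) + eps).
    """
    v = 0.0  # running average of squared gradients
    rates = []
    for t, g in enumerate(grads, start=1):
        v = beta_2 * v + (1.0 - beta_2) * g * g
        v_hat = v / (1.0 - beta_2 ** t)  # bias correction
        rates.append(lr / (math.sqrt(v_hat) + eps))
    return rates

# With a constant gradient the effective rate stays close to lr / |g|.
rates = adam_effective_lr([2.0, 2.0, 2.0])
print(rates)
```

Note that in a real model this quantity is a tensor with one value per weight, not a single scalar, which is part of why it is not exposed the same way as the constant learning rate.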
Starting from the optimizer definition:
my_optimizer = tf.keras.optimizers.Adam(initial_learning_rate, beta_1 = my_beta_1, beta_2 = my_beta_2)
Using the following lines, we can easily print the constant part of the Adam learning rate:
my_optimizer.learning_rate
my_optimizer.lr
tf.keras.backend.get_value(my_optimizer.lr)
Or we can modify the learning rate value through:
tf.keras.backend.set_value(my_optimizer.lr, my_new_learning_rate)
These expressions work well with fixed learning rate optimizers such as stochastic gradient descent.
There is this question in which zihaozhihao proposed calculating the value of the learning rate directly from the definition of Adam. I was looking for an easier way, like the expressions mentioned above, since, as I said in the question title, I want both to print and to modify the effective learning rate.
My question is: what is the TensorFlow function that gives access to the value of the effective learning rate of Adam?
I want to print it so that I can monitor it, and to modify it so that I can constrain its variations, since Adam can sometimes be unstable (because it is adaptive).
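To illustrate the kind of constraint I have in mind, here is a rough pure-Python sketch of one Adam step for a scalar parameter in which the effective rate is clamped to a fixed interval before it is applied. The function `clipped_adam_step`, the `state` dictionary, and the bounds `min_rate` and `max_rate` are all hypothetical names made up for this sketch; I am not claiming TensorFlow offers anything like this.

```python
import math

def clipped_adam_step(param, grad, state, lr=1e-3, beta_1=0.9, beta_2=0.999,
                      eps=1e-7, min_rate=1e-5, max_rate=1e-2):
    """One Adam step for a scalar parameter with a clamped effective rate.

    state: dict with keys 't', 'm', 'v'; it is updated in place.
    Returns (new_param, effective_rate).
    """
    state["t"] += 1
    t = state["t"]
    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta_1 * state["m"] + (1.0 - beta_1) * grad
    state["v"] = beta_2 * state["v"] + (1.0 - beta_2) * grad * grad
    # Bias-corrected estimates.
    m_hat = state["m"] / (1.0 - beta_1 ** t)
    v_hat = state["v"] / (1.0 - beta_2 ** t)
    # Effective rate, then the constraint (hypothetical bounds).
    rate = lr / (math.sqrt(v_hat) + eps)
    rate = max(min_rate, min(rate, max_rate))
    return param - rate * m_hat, rate

state = {"t": 0, "m": 0.0, "v": 0.0}
# A tiny gradient would give a very large unclamped rate; the clamp caps it.
param, rate = clipped_adam_step(1.0, 1e-6, state)
print(param, rate)
```

Returning the rate alongside the new parameter is what would allow monitoring it at every step.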