
I am working with a code base for Tiny YOLO v2 and am running into the following error while declaring a learning rate schedule. I can see that my steps list ends up the same size as my lrs list, but I am unsure what a good fix is. I have also included my attempt at declaring the values explicitly (with one fewer step than learning rates) and the error that results from it.

Error:

    Traceback (most recent call last):
      File "scripts/train_tiny_yolo.py", line 335, in <module>
        lr = tf.train.piecewise_constant(global_step, steps, lrs)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/training/learning_rate_decay.py", line 147, in piecewise_constant
        name=name)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/training/learning_rate_decay_v2.py", line 166, in piecewise_constant
        "The length of boundaries should be 1 less than the length of values")
    ValueError: The length of boundaries should be 1 less than the length of values

Here is the relevant section from my code:

    base_lr = params.get('learning_rate', 1e-3)
    steps = params.get('steps', [3000, 4000, 5000])

    steps_and_lrs = []
    if steps[0] > 100:
        # Warm-up
        steps_and_lrs += [
            (25, base_lr / 100),
            (50, base_lr / 10)
        ]

    steps_and_lrs += [(step, base_lr * 10**(-i)) for i, step in enumerate(steps)]
    steps, lrs = zip(*steps_and_lrs)

    # Alternative attempt to explicitly declare lr and steps values
    # steps = (50, 20000, 30000, 40000)
    # lrs = (1e-05, 0.0001, 0.001, 0.0001, 1e-05)

    max_iter = steps[-1]
    lr = tf.train.piecewise_constant(global_step, steps, lrs)
    np.set_printoptions(precision=3, suppress=True)

    opt = tf.train.MomentumOptimizer(lr, momentum=0.9)
    grads_and_vars = opt.compute_gradients(loss)
    clip_value = params.get('clip_gradients')

    if clip_value is not None:
        grads_and_vars = [(tf.clip_by_value(g, -clip_value, clip_value), v) for g, v in grads_and_vars]

    train_op = opt.apply_gradients(grads_and_vars,
            global_step=global_step)

    merged = tf.summary.merge_all()

What I have tried:

When I give the values for steps and lr explicitly, I get the following value error:

    Traceback (most recent call last):
      File "scripts/train_tiny_yolo.py", line 363, in <module>
        grads_and_vars = [(tf.clip_by_value(g, -clip_value, clip_value), v) for g, v in grads_and_vars]
      File "scripts/train_tiny_yolo.py", line 363, in <listcomp>
        grads_and_vars = [(tf.clip_by_value(g, -clip_value, clip_value), v) for g, v in grads_and_vars]
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
        return target(*args, **kwargs)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/ops/clip_ops.py", line 69, in clip_by_value
        t = ops.convert_to_tensor(t, name="t")
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1039, in convert_to_tensor
        return convert_to_tensor_v2(value, dtype, preferred_dtype, name)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1097, in convert_to_tensor_v2
        as_ref=False)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1175, in internal_convert_to_tensor
        ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 304, in _constant_tensor_conversion_function
        return constant(v, dtype=dtype, name=name)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 245, in constant
        allow_broadcast=True)
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/constant_op.py", line 283, in _constant_impl
        allow_broadcast=allow_broadcast))
      File "/Users/nivedithakalavakonda/Desktop/python_environments/objectdet_tf1/lib/python3.6/site-packages/tensorflow/python/framework/tensor_util.py", line 454, in make_tensor_proto
        raise ValueError("None values not supported.")

Currently using TensorFlow 1.13.1.

Any help is appreciated. Please let me know if sharing the large code base will be more insightful.


2 Answers


According to your code, steps and lrs are the same size. Please check the example provided [here]: according to the documentation, the number of values in steps (the boundaries) should be one less than the number of values in lrs. Also, please note that there is a bug in this scheduler; you can check it here.
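
For example, with the TF 1.x API the relationship looks like this (a minimal sketch using boundary and learning-rate numbers borrowed from your question for illustration, not a drop-in fix for your script):

    import tensorflow as tf

    global_step = tf.train.get_or_create_global_step()
    boundaries = [3000, 4000]        # 2 boundaries define 3 intervals ...
    lrs = [1e-3, 1e-4, 1e-5]         # ... so 3 learning-rate values are needed
    lr = tf.train.piecewise_constant(global_step, boundaries, lrs)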

If you are using TensorFlow 2.0, below is an example that works. I have not tested this with TF 1.13.

import numpy as np
from tensorflow.python.keras.optimizer_v2 import learning_rate_schedule

n_step_epoch = 100   # steps per epoch
init_lr = 0.01
decay = 0.1

# Parse the milestone epochs out of the decay spec, e.g. 'multistep_15_25_100' -> [15, 25, 100]
decay_type = 'multistep_15_25_100'
milestones = [int(m) for m in decay_type.split('_')[1:]]

# Convert milestone epochs to step boundaries; note len(values) == len(boundaries) + 1
boundaries = np.multiply(milestones, n_step_epoch)
values = [init_lr] + [init_lr * decay**i for i in range(1, len(milestones) + 1)]

learning_rate = learning_rate_schedule.PiecewiseConstantDecay(boundaries.tolist(), values)
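
As a hedged usage sketch, in TF 2.x such a schedule object can be passed directly as the learning rate of a Keras optimizer; the optimizer choice and momentum value below are assumptions meant to mirror the MomentumOptimizer in the question, not something from the original code:

    import tensorflow as tf

    # SGD with momentum, driven by the piecewise-constant schedule defined above.
    opt = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=0.9)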

Hope this helps!


I figured out that the error I get once steps and lrs are no longer the same size comes from some of the gradient values being None. I solved it with the following workaround:

    def ClipIfNotNone(grad, clip_value):
        # Gradients are None for variables that do not contribute to the loss; leave those untouched.
        if grad is None:
            return grad
        return tf.clip_by_value(grad, -clip_value, clip_value)

    if clip_value is not None:
        grads_and_vars = opt.compute_gradients(loss)
        clipped_gradients = [(ClipIfNotNone(g, clip_value), v) for g, v in grads_and_vars]
        train_op = opt.apply_gradients(clipped_gradients, global_step=global_step)
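
An equivalent approach (a common idiom, not something from the original code base) is to drop the None pairs entirely before clipping:

    # Keep only variables that actually received a gradient, then clip and apply.
    grads_and_vars = [(tf.clip_by_value(g, -clip_value, clip_value), v)
                      for g, v in opt.compute_gradients(loss) if g is not None]
    train_op = opt.apply_gradients(grads_and_vars, global_step=global_step)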