TensorFlow : Fine tuning only the Fully Connected Layer for Different learning rates with a single python file

Question

I am running some experiments with transfer learning.
I have a single script. In the first part I train the model on a subset of the MNIST data set and save it successfully. The model architecture consists of two CNN layers and one fully connected layer.

        for epoch in range(1, params['epochs'] + 1):
            shuffle = np.random.permutation(len(y_train))
            x_train, y_train = x_train[shuffle], y_train[shuffle]

            for i in range(0, len(y_train), params['batch_size']):
                x_train_mb = x_train[i:i + params['batch_size']]
                y_train_mb = y_train[i:i + params['batch_size']]

                sess.run(model.optimize, feed_dict={
                    model.input: x_train_mb,
                    model.target: y_train_mb,
                    model.is_task1: True,
                    model.is_train: True,
                    model.learning_rate: temp_learning_rate_source_training})

            valid_acc = classification_batch_evaluation(
                sess, model, model.metrics, params['batch_size'], True,
                x_valid, y=y_valid, stream=True)

            print('valid [{} / {}] valid accuracy: {} learning Rate :{}'.format(
                epoch, params['epochs'] + 1, valid_acc,
                temp_learning_rate_source_training))
            if valid_acc > initial_best_epoch['valid_acc']:
                initial_best_epoch['epoch'] = epoch
                initial_best_epoch['valid_acc'] = valid_acc
                model.save_model(sess, epoch)

            if epoch - initial_best_epoch['epoch'] >= params['patience']:
                print('Early Stopping Epoch: {}\n'.format(epoch))
                logging.info('Early Stopping Epoch: {}\n'.format(epoch))
                break

    print('Initial training done \n', file=f)
    logging.info('Initial training done \n')
    sess.close()

model.restore_model(sess)  # Restores the model after creating it.

Now I want to do transfer learning: keep the architecture the same, transfer the parameters of the CNN layers, and re-initialize the fully connected layer. I then train all three layers again on the limited new data set, using different learning rates and values of "decay_after_epoch", to analyze the results. Because of the large number of combinations, I have written two nested for loops to automate the process:

for temp_learning_rate_target_training in (0.001,0.005,0.01):

        for decay_after_epoch in (3,5,10): 
            learning_rate = temp_learning_rate_target_training
            model.restore_model(sess) ##Restores the model after creating it .
            with open("/home/abhishek/Desktop/{}_{}_{}.txt".format(params["dataset"],params["k"],params["n"])) as f1:
                with open("/home/abhishek/Desktop/{}_{}_{}_{}_{}.txt".format(params["dataset"],params["k"],params["n"],temp_learning_rate_target_training,decay_after_epoch), "w") as f:
                    for x in f1.readlines():
                        f.write(x)
                    print("Target Training Begins",file=f)
                    for epoch in range(1, params['epochs'] + 1):
                        shuffle = np.random.permutation(len(y_train2))
                        x_train2, y_train2 = x_train2[shuffle], y_train2[shuffle]


                        if epoch%decay_after_epoch==0 and epoch <=decay_after_epoch:
                            learning_rate = learning_rate *0.1
                        elif (epoch-decay_after_epoch)%30==0:
                            learning_rate = learning_rate *0.1



                        for i in range(0, len(y_train2), params['batch_size']):
                            x_train_mb, y_train_mb = x_train2[i:i + params['batch_size']], y_train2[i:i + params['batch_size']]
                            sess.run(model.optimize, feed_dict={model.input: x_train_mb, model.target: y_train_mb, model.is_task1: False, model.is_train: True, model.learning_rate: params['learning_rate']})

                        train_acc = classification_batch_evaluation(sess, model, model.metrics, params['batch_size'], False, x_train2, y=y_train2, stream=True)
                        sess.close()

                        print('train [{} / {}] train accuracy: {} learning Rate:{} '.format(epoch, params['epochs'] + 1, train_acc,learning_rate),file=f)
                        print('train [{} / {}] train accuracy: {} learning Rate :{}'.format(epoch, params['epochs'] + 1, train_acc,learning_rate))
                        logging.info('train [{} / {}] train accuracy: {}'.format(epoch, params['epochs'] + 1, train_acc))

                        if train_acc > transfer_best_epoch['train_acc']:
                            transfer_best_epoch['epoch'] = epoch
                            transfer_best_epoch['train_acc'] = train_acc
                            test_acc = classification_batch_evaluation(sess, model, model.metrics, params['batch_size'], False, x_test2, y=y_test2, stream=True)
                            transfer_best_epoch['test_acc'] = test_acc

                        if epoch % params['patience'] == 0:
                            acc_diff = transfer_best_epoch['train_acc'] - es_acc
                            if acc_diff < params['percentage_es'] * es_acc:
                                print('Early Stopping Epoch: {}\n'.format(epoch))
                                logging.info('Early Stopping Epoch: {}\n'.format(epoch))
                                break
                            es_acc = transfer_best_epoch['train_acc']

                    print('Transfer training done \n',file=f)
                    print('TARGET test accuracy: {}'.format(transfer_best_epoch['test_acc']),file=f)

After the first iteration has run with temp_learning_rate_target_training = 0.001 and decay_after_epoch = 3, the model is trained and I have the test accuracy. Call the resulting weights and biases of the three layers S2. When the loop runs again, the parameter model.is_task1: False makes sure that the fully connected layer is re-initialized, but the parameters of the CNN layers are carried over from S2. (I say this because I get exactly the same accuracy logs for every combination of learning rate and decay_after_epoch.) However, I want every iteration to start from the same initial parameters for the CNN layers, S1, i.e. the ones saved at the end of part 1.
I have tried closing the session with sess.close() after each iteration and then restoring the saved model (which was trained in part 1 of the code) with model.restore_model(sess), but it still does not give the expected result. How should I proceed?

Instead of the code of the training loops, you should rather post the code of how you create the model object. — Jindřich

Jindřich · Accepted Answer · 2019-09-20T08:39:38

If you need to tweak your models like this, you need to be aware of what variables you want to train.

Each layer should keep its variables in its own variable scope. You can then get the variables of a given scope with:

tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope='my_scope')

To re-initialize a set of variables, just run their initializers:

sess.run([v.initializer for v in variables_to_reset])
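As a minimal sketch of how the two snippets above fit the question's loop, the example below restores a checkpoint at the start of every run and then resets only the fully connected variables. It assumes the TensorFlow 1.x graph API (imported here through tensorflow.compat.v1) and a tf.train.Saver checkpoint. The scope names "conv" and "fc", the tiny variable shapes, and the fake_training op are hypothetical stand-ins, not the asker's model.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Hypothetical scopes standing in for the CNN layers and the dense layer.
    with tf.variable_scope("conv"):
        conv_w = tf.get_variable(
            "w", shape=[3, 3], initializer=tf.random_normal_initializer())
    with tf.variable_scope("fc"):
        fc_w = tf.get_variable(
            "w", shape=[3, 2], initializer=tf.random_normal_initializer())

    # Collect the variables of one scope and build their initializer ops.
    fc_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope="fc")
    reset_fc = [v.initializer for v in fc_vars]

    # Stand-in for a training step that changes the conv weights.
    fake_training = conv_w.assign_add(tf.ones([3, 3]))

    init_all = tf.global_variables_initializer()
    saver = tf.train.Saver()

ckpt_path = os.path.join(tempfile.mkdtemp(), "part1.ckpt")

with tf.Session(graph=graph) as sess:
    # "Part 1": initialize everything and save the checkpoint (the set S1).
    sess.run(init_all)
    saved_conv, saved_fc = sess.run([conv_w, fc_w])
    saver.save(sess, ckpt_path)

    start_states = []
    for run in range(2):
        # Each run starts from the saved weights, then resets only "fc".
        saver.restore(sess, ckpt_path)
        sess.run(reset_fc)
        start_states.append(sess.run([conv_w, fc_w]))
        sess.run(fake_training)  # modifies conv_w, as real training would
```

The session stays open across runs here; restoring at the top of each run is what brings the conv weights back to their saved values, while running the initializers gives the dense layer fresh random values.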

When you call the optimizer's minimize method (it is hidden inside your model object; this is what gives you the model.optimize op that you run in your training loop), you can specify var_list, the list of variables you want to train. The rest will remain intact.
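The var_list argument can be illustrated with a small sketch, again assuming the TensorFlow 1.x graph API through tensorflow.compat.v1. The two-layer toy network, the scope names "conv" and "fc", and the random data are hypothetical; only the variables under "fc" are passed to minimize, so one training step should change them and leave the others untouched.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, shape=[None, 4])
    y = tf.placeholder(tf.float32, shape=[None, 1])
    lr = tf.placeholder(tf.float32, shape=[])  # learning rate fed per step

    # Hypothetical scopes: "conv" plays the transferred layers, "fc" the new one.
    with tf.variable_scope("conv"):
        w1 = tf.get_variable(
            "w", shape=[4, 3], initializer=tf.random_normal_initializer())
    with tf.variable_scope("fc"):
        w2 = tf.get_variable(
            "w", shape=[3, 1], initializer=tf.random_normal_initializer())

    hidden = tf.tanh(tf.matmul(x, w1))
    loss = tf.reduce_mean(tf.square(tf.matmul(hidden, w2) - y))

    # Only the variables listed in var_list receive gradient updates.
    fc_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope="fc")
    train_fc_only = tf.train.GradientDescentOptimizer(lr).minimize(
        loss, var_list=fc_vars)

    init_all = tf.global_variables_initializer()

rng = np.random.RandomState(0)
x_data = rng.randn(8, 4).astype(np.float32)
y_data = rng.randn(8, 1).astype(np.float32)

with tf.Session(graph=graph) as sess:
    sess.run(init_all)
    w1_before, w2_before = sess.run([w1, w2])
    sess.run(train_fc_only, feed_dict={x: x_data, y: y_data, lr: 0.1})
    w1_after, w2_after = sess.run([w1, w2])
```

The learning rate used in a step is whatever value is fed for the placeholder. In the question's target loop the feed passes params['learning_rate'] rather than the decayed learning_rate variable, which may be worth checking when every combination produces the same log.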
