
I am trying to apply Newton's Method to a gradient descent algorithm with backtracking.

Gradient Descent algorithm: [image]

Gradient Descent with Backtracking algorithm: [image]

Newton's Method: [image]
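
In case the images do not show up: the rule I am trying to implement (and that the code below follows) is backtracking on the step size, starting from a maximum and halving it until the sufficient-decrease condition

$$f\bigl(w - \eta\,\nabla f(w)\bigr) \;\le\; f(w) - \frac{\eta}{2}\,\lVert \nabla f(w)\rVert^{2}$$

holds, followed by the damped Newton update

$$w \;\leftarrow\; w - \eta\, H(w)^{-1}\,\nabla f(w).$$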

import numpy as np
from scipy import optimize as opt

def newton_gd_backtracking(w,itmax,tol):
    # You may set bounds of "learnrate"
    max_learnrate =0.1
    min_learnrate =0.001

    for i in range(itmax):
        grad = opt.rosen_der(w)
        grad2 = (np.linalg.norm(grad))**2
        hess = opt.rosen_hess(w)

        # you have to decide "learnrate"
        learnrate = max_learnrate
        while True:
            f0 = opt.rosen(w)
            f1 = opt.rosen(w - learnrate * grad)
            if f1 <= (f0 - (learnrate/2)*grad2):
                break
            else:
                learnrate /=2;
            if learnrate< min_learnrate:
                learnrate = min_learnrate; break

        # now, Newton's method
        deltaw = - learnrate * np.linalg.inv(hess) * grad 
        w = w + deltaw

        if np.linalg.norm(deltaw) < tol:
            break

    return w, i, learnrate


# You can call the above function by adding a main block

if __name__=="__main__":
    w0 = np.array([0,0])
    
    itmax = 10000; tol = 1.e-5

    w, i, learnrate = newton_gd_backtracking(w0,itmax,tol)
    print('Weight: ', w)
    print('Iterations: ', i)
    print('Learning Rate: ', learnrate)

After running the program, I get these error messages:

TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:/Users/Desfios 5/Desktop/Python/homework submission/Deyeon/GD_BT_Newton/Main_Newton_GD_Backtracking.py", line 43, in <module>
    w, i, learnrate = newton_gd_backtracking(w0,itmax,tol)
  File "c:/Users/Desfios 5/Desktop/Python/homework submission/Deyeon/GD_BT_Newton/Main_Newton_GD_Backtracking.py", line 12, in newton_gd_backtracking
    hess = opt.rosen_hess(w)
  File "C:\Users\Desfios 5\AppData\Roaming\Python\Python38\site-packages\scipy\optimize\optimize.py", line 373, in rosen_hess
    diagonal[0] = 1200 * x[0]**2 - 400 * x1 + 2
ValueError: setting an array element with a sequence.

When I run this without the Hessian, as a normal gradient descent with backtracking, the code works fine. Here is the code that I used for normal gradient descent with backtracking:

import numpy as np
from scipy import optimize as opt

def gd_backtracking(w,itmax,tol):
    # You may set bounds of "learnrate"
    max_learnrate =0.1
    min_learnrate =0.001

    for i in range(itmax):
        grad = opt.rosen_der(w)
        grad2 = (np.linalg.norm(grad))**2

        # you have to decide "learnrate"
        learnrate = max_learnrate
        while True:
            f0 = opt.rosen(w)
            f1 = opt.rosen(w - learnrate * grad)
            if f1 <= (f0 - (learnrate/2)*grad2):
                break
            else:
                learnrate /=2;
            if learnrate< min_learnrate:
                learnrate = min_learnrate; break

        # now, march
        deltaw = - learnrate * grad
        w = w + deltaw

        if np.linalg.norm(deltaw) < tol:
            break

    return w, i, learnrate


# You can call the above function by adding a main block

if __name__=="__main__":
    w0 = np.array([0,0])
    
    itmax = 10000; tol = 1.e-5

    w, i, learnrate = gd_backtracking(w0,itmax,tol)
    print('Weight: ', w)
    print('Iterations: ', i)
    print('Learning Rate: ', learnrate)

Is there something about the Hessian matrix that I don't know about? To my understanding, opt.rosen_hess should yield a 1D array just like opt.rosen_der. Perhaps I'm using opt.rosen_hess in the wrong way. What am I missing here?
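
For reference, one quick way to check what these two helpers actually return is to print the shapes (the comments show the shapes I would expect based on the SciPy documentation):

import numpy as np
from scipy import optimize as opt

w = np.array([0.0, 0.0])
print(opt.rosen_der(w).shape)   # (2,)  -- the gradient is a 1-D array
print(opt.rosen_hess(w).shape)  # (2, 2) -- the Hessian is a 2-D matrix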


1 Answer


After going through my work over and over, I realized my error. In newton_gd_backtracking, I was multiplying the inverse Hessian and the gradient with the elementwise * operator. These aren't scalars, so I should have used a dot (matrix-vector) product. Once I used dot products, I got the desired results.
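
Concretely, the offending line in newton_gd_backtracking becomes something like this (a sketch of the change; np.linalg.solve is an alternative worth considering, since it avoids forming the explicit inverse):

# matrix-vector (dot) product instead of the elementwise "*"
deltaw = -learnrate * np.dot(np.linalg.inv(hess), grad)

# equivalent, without forming the explicit inverse:
# deltaw = -learnrate * np.linalg.solve(hess, grad)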