Python regularized gradient descent for logistic regression

Question

I'm trying to implement gradient descent (GD, not the stochastic variant) for logistic regression in Python 3.x, and I'm having some trouble.

Logistic regression is defined as follows (1): [image: logistic regression formula]

The gradient update formulas are defined as follows (2): [image: gradient descent for logistic regression]
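The formula images did not survive in this copy of the post. The block below is a reconstruction, not the original: the update rule (2) is read off the two update lines in the implementation further down, and the objective (1) is the L2-regularized logistic loss whose negative gradient yields that update. The symbols l (number of objects), x_{i1}, x_{i2} (the two features of object i), and Q (the objective) are naming assumptions.

```latex
% (1) assumed objective: L2-regularized logistic loss
Q(w_1, w_2) = \frac{1}{l} \sum_{i=1}^{l}
  \log\left(1 + \exp\left(-y_i (w_1 x_{i1} + w_2 x_{i2})\right)\right)
  + \frac{C}{2} \left(w_1^2 + w_2^2\right)

% (2) gradient step with step size k, as encoded by the implementation
w_1 := w_1 + k \frac{1}{l} \sum_{i=1}^{l} y_i x_{i1}
  \left(1 - \frac{1}{1 + \exp\left(-y_i (w_1 x_{i1} + w_2 x_{i2})\right)}\right)
  - k C w_1

w_2 := w_2 + k \frac{1}{l} \sum_{i=1}^{l} y_i x_{i2}
  \left(1 - \frac{1}{1 + \exp\left(-y_i (w_1 x_{i1} + w_2 x_{i2})\right)}\right)
  - k C w_2
```

Note that the sum runs over objects, so the factor in parentheses differs from object to object; it is not a single scalar for the whole data set.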

Description of data:

X is an (N x 2) matrix of objects (positive and negative floats)
y is an (N x 1) vector of class labels (-1 or +1)

Task: implement gradient descent 1) with L2 regularization and 2) without regularization. Desired results: the weight vectors. Parameters: regularization rate C=10 for regularized regression and C=0 for unregularized regression; gradient step k=0.1; maximum number of iterations = 10000; tolerance = 1e-5. Note: GD has converged when the distance between the weight vectors from the current and previous steps is less than the tolerance (1e-5).

Here is my implementation, where k is the gradient step and C is the regularization rate.

import numpy as np

def sigmoid(z):
    result = 1./(1. + np.exp(-z))
    return result

def distance(vector1, vector2):
    vector1 = np.array(vector1, dtype='f')    
    vector2 = np.array(vector2, dtype='f')
    return np.linalg.norm(vector1-vector2)

def GD(X, y, C, k=0.1, tolerance=1e-5, max_iter=10000):

    X = np.matrix(X)
    y = np.matrix(y)
    l=len(X)
    w1, w2 = 0., 0.  # weights (look formula (2) in the beginning of question)
    difference = 1.
    iteration = 1

    while(difference > tolerance):

        hypothesis = y*(X*np.matrix([w1, w2]).T)

        w1_updated = w1 + (k/l)*np.sum(y*X[:,0]*(1.-(sigmoid(hypothesis)))) - k*C*w1
        w2_updated = w2 + (k/l)*np.sum(y*X[:,1]*(1.-(sigmoid(hypothesis)))) - k*C*w2

        difference = distance([w1, w2], [w1_updated, w2_updated])
        w1, w2 = w1_updated, w2_updated
        if(iteration >= max_iter):
            break;

        iteration = iteration + 1

    return [w1_updated, w2_updated]  #vector of weights

Respectively:

# call for UNregularized GD: C=0
w = GD(X, y, C=0., k=0.1)

and

# call for regularized GD: C=10
w_reg = GD(X, y, C=10., k=0.1)

Here are the results (weight vectors):

# UNregularized GD
[0.035736331265589463, 0.032464572442830832]

# regularized GD
[5.0979561973044096e-06, 4.6312243707352652e-06]

However, they should be (reference answers for self-checking):

# UNregularized GD
[0.28801877, 0.09179177]

# regularized GD
[0.02855938, 0.02478083]

Can you please tell me what is going wrong here? I have been stuck on this problem for three days in a row and still have no idea.

Thank you in advance.

Sim1bet · Accepted Answer · 2018-02-26T17:05:55

First of all, the sigmoid function should be

def sigmoid(Z):
   A=1/(1+np.exp(-Z))
   return A

Try running it again with this formula. Also, what is l?
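For reference, the sigmoid above computes the same value as the one in the question, and l in the question's code is len(X), the number of objects. A more likely culprit, assuming y was passed in as a 1-D array, is the use of np.matrix: np.matrix(y) then has shape (1, N), and * between matrices is a matrix product. As a result, y*(X*w.T) and y*X[:,0] each collapse to a single 1x1 value (a sum over all objects) instead of one value per object, so the sigmoid is applied to one aggregate number rather than inside the sum. The sketch below is one possible vectorized rewrite using plain arrays and element-wise products; the function name gd_logistic and the variable names are illustrative, not from the original post.

```python
import numpy as np


def sigmoid(z):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))


def gd_logistic(X, y, C, k=0.1, tolerance=1e-5, max_iter=10000):
    """Batch gradient descent for L2-regularized logistic regression.

    X: (N, d) features, y: N labels in {-1, +1}, C: regularization rate,
    k: gradient step. Stops when the step between consecutive weight
    vectors is shorter than `tolerance`, or after `max_iter` iterations.
    """
    X = np.asarray(X, dtype=float)            # plain ndarray, shape (N, d)
    y = np.asarray(y, dtype=float).ravel()    # flatten labels to shape (N,)
    n = X.shape[0]
    w = np.zeros(X.shape[1])

    for _ in range(max_iter):
        margins = y * (X @ w)                 # element-wise product, shape (N,)
        coeff = y * (1.0 - sigmoid(margins))  # one factor per object
        # X.T @ coeff sums y_i * x_ij * (1 - sigmoid(margin_i)) over objects
        w_new = w + (k / n) * (X.T @ coeff) - k * C * w
        step = np.linalg.norm(w_new - w)      # Euclidean distance between steps
        w = w_new
        if step < tolerance:
            break
    return w
```

With the data from the question this would be called as gd_logistic(X, y, C=0.) and gd_logistic(X, y, C=10.). Whether it reproduces the reference weights was not checked here, since the data set is not part of the post.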
