I'm working through Andrew Ng's Coursera course in Python and am currently on Ex2, logistic regression. I'm trying to use SciPy's optimize.minimize, but I can't get it to run correctly. I'll keep the summary of my code as brief as I can while still being thorough. I'm using Python 3. Here is my variable setup; after reading the CSV file with pandas, I move everything to NumPy:

import numpy as np
import pandas as pd
from scipy.optimize import minimize

class Ex2:
    def __init__(self):
        self.pandas_data = pd.read_csv("ex2data1.txt", skipinitialspace=True)
        self.data = self.pandas_data.values

        self.data = np.insert(self.data, 0, 1, axis=1)
        self.x = self.data[:, 0:3]
        self.y = self.data[:, 3:]
        self.theta = np.zeros(shape=(self.x.shape[1]))

x: (100, 3) numpy ndarray

y: (100, 1) numpy ndarray

theta: (3,) numpy ndarray (1-d)

Then I define sigmoid, cost, and gradient functions to pass to SciPy's minimize:

    @staticmethod
    def sigmoid(x):
        return 1/(1 + np.exp(x))

    def cost(self, theta):
        x = self.x
        y = self.y
        m = len(y)
        h = self.sigmoid(x.dot(theta))
        j = (1/m) * ((-y.T.dot(np.log(h))) - ((1-y).T.dot(np.log(1-h))))
        return j[0]

    def grad(self, theta):
        x = self.x
        y = self.y
        theta = np.expand_dims(theta, axis=0)
        m = len(y)
        h = self.sigmoid(x.dot(theta.T))
        grad = (1/m) * (x.T.dot(h-y))
        grad = np.squeeze(grad)
        return grad

Both take theta as a 1-D NumPy ndarray. cost returns a scalar (the cost for the given theta), and grad returns a 1-D NumPy ndarray holding the gradient with respect to theta.
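The expand_dims/squeeze steps in grad are presumably there because y is a column of shape (100, 1), while x.dot(theta) with a 1-D theta has shape (100,). A minimal sketch of the underlying NumPy behaviour, using made-up four-element arrays (the names h, y_col, and the three results are illustrative, not from the exercise):

```python
import numpy as np

h = np.full(4, 0.5)                              # predictions, shape (4,)
y_col = np.array([[0.0], [1.0], [1.0], [0.0]])   # labels as a column, shape (4, 1)

# Subtracting a column from a 1-D array broadcasts silently to a square matrix.
broadcast_diff = h - y_col                       # shape (4, 4), almost never intended

# Two ways to keep the shapes aligned instead:
column_diff = h[:, None] - y_col                 # both operands (4, 1) -> (4, 1)
flat_diff = h - y_col.ravel()                    # both operands (4,)   -> (4,)

print(broadcast_diff.shape, column_diff.shape, flat_diff.shape)
```

Storing y as a 1-D array from the start (for example with ravel) would be one way to avoid the reshaping inside grad, at the cost of changing the setup code.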

When I then run this code:

    def run(self):
        options = {'maxiter': 100}
        print(minimize(self.cost, self.theta, jac=self.grad, options=options))


ex2 = Ex2()
ex2.run()

I get:

fun: 0.69314718055994529
hess_inv: array([[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1]])
jac: array([ -0.1 , -12.00921659, -11.26284221])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 106
nit: 0
njev: 94
status: 2
success: False
x: array([ 0., 0., 0.])

Process finished with exit code 0

That's the gist of what I'm doing. Am I returning something from cost or grad incorrectly? That seems the most likely explanation to me, but I've tried various combinations and formats of return values and nothing seems to work. Any help is greatly appreciated.

Edit: To debug this, I've confirmed, among other things, that cost and grad return what I expect (cost: float, grad: 1-D ndarray). Running both on an initial theta of all zeros gives the same values I get in Octave (which I know to be correct thanks to the code provided with the exercises). However, passing these functions to minimize does not reduce the cost as expected: theta never moves from its initial value.
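A check that does not depend on a reference implementation is to compare grad against a finite-difference estimate of cost, ideally at a non-zero theta. The sketch below does this with scipy.optimize.check_grad for a standard logistic-regression cost and gradient. The data is synthetic (an assumption, standing in for ex2data1.txt), and the module-level names X, y, cost, grad, and theta_test are illustrative rather than the class methods above.

```python
import numpy as np
from scipy.optimize import check_grad

# Synthetic stand-in for the exercise data (assumption: not ex2data1.txt).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])  # intercept + 2 features
y = (rng.random(50) < 0.5).astype(float)                      # 1-D labels in {0, 1}


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cost(theta):
    """Mean logistic (cross-entropy) loss for the given theta."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))


def grad(theta):
    """Analytic gradient of cost with respect to theta."""
    h = sigmoid(X @ theta)
    return X.T @ (h - y) / len(y)


# check_grad returns the 2-norm of (analytic gradient - finite-difference gradient).
theta_test = np.array([0.1, -0.2, 0.3])  # deliberately non-zero
grad_error = check_grad(cost, grad, theta_test)
print(grad_error)
```

A value close to zero suggests that cost and grad agree with each other; a value of the same order as the gradient itself suggests that they do not.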

1 Answer

In case anyone else runs into the same problem: I figured out that my sigmoid function had the wrong sign in the exponent. I should have had

return 1/(1 + np.exp(-x))

but had

return 1/(1 + np.exp(x))

After fixing that, the minimize function converged normally.
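The sign error also explains why the check at an all-zero theta passed: both versions of the sigmoid return 0.5 at zero, so the cost and gradient there are identical. Away from that point, with exp(x) the grad function returns the negative of the true gradient of the cost being computed, so the line search cannot find a descent step. The sketch below illustrates this on synthetic data (an assumption, not ex2data1.txt); make_cost_and_grad, true_theta, and the other module-level names are hypothetical helpers for the illustration.

```python
import numpy as np
from scipy.optimize import check_grad, minimize

# Synthetic, non-separable data (assumption: stands in for ex2data1.txt).
rng = np.random.default_rng(0)
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_theta = np.array([0.5, 1.0, -1.0])  # hypothetical generating parameters
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_theta))).astype(float)


def make_cost_and_grad(sign):
    """sign=-1.0 is the standard sigmoid; sign=+1.0 reproduces the exp(x) version."""

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(sign * z))

    def cost(theta):
        h = sigmoid(X @ theta)
        return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

    def grad(theta):
        h = sigmoid(X @ theta)
        return X.T @ (h - y) / n

    return cost, grad


theta0 = np.zeros(3)
buggy_cost, buggy_grad = make_cost_and_grad(+1.0)
fixed_cost, fixed_grad = make_cost_and_grad(-1.0)

# At theta = 0 both sigmoids give 0.5, so the two versions are indistinguishable.
same_cost_at_zero = bool(np.isclose(buggy_cost(theta0), fixed_cost(theta0)))
same_grad_at_zero = bool(np.allclose(buggy_grad(theta0), fixed_grad(theta0)))

# Finite-difference check: is grad really the gradient of cost?
buggy_mismatch = check_grad(buggy_cost, buggy_grad, theta0)  # large: sign is flipped
fixed_mismatch = check_grad(fixed_cost, fixed_grad, theta0)  # close to zero

# With the corrected sigmoid, minimize can make progress from theta = 0.
result = minimize(fixed_cost, theta0, jac=fixed_grad, options={"maxiter": 100})
print(same_cost_at_zero, same_grad_at_zero, buggy_mismatch, fixed_mismatch)
print(result.x, result.fun)
```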