
I am trying to implement a logistic regression solver in MATLAB, and I am finding the weights by stochastic gradient descent. I am running into a problem where my data seems to produce an infinite cost, and no matter what happens it never goes down... Both of these seem perfectly fine; I can't imagine why my cost function would ALWAYS return infinity.

Here is my training data, where the first column is the class (either 1 or 0) and the next seven columns are the features I am trying to regress on.
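(For reference, assuming that data has been loaded into a matrix called data, one row per example; that matrix name is my assumption, not something from the question, it would be split like this:)

trueClass = data(:, 1);      % first column: class label, 1 or 0
features  = data(:, 2:8);    % remaining seven columns: the features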

I think this has to do with log(0) being -Inf. – MeMyselfAndI
Personally, I prefer to express the weight update as: weightVector = weightVector - learningRate * gradient. That way it is clearer that we move in the direction opposite the gradient, towards a minimum of the cost function. – tashuhka
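(As MeMyselfAndI's comment suggests, the infinite cost comes from log(0). A minimal MATLAB illustration; the eps-clamping at the end is a common workaround, not something from the question:)

p = 1 / (1 + exp(-1000));           % sigmoid saturates to exactly 1.0 in double precision
cost = -(0*log(p) + 1*log(1 - p))   % with true label y = 0 this hits log(0) = -Inf, so cost = Inf
p = min(max(p, eps), 1 - eps);      % clamp the prediction away from exactly 0 and 1
cost = -(0*log(p) + 1*log(1 - p))   % now finite (about 36.04)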

1 Answer


Your gradient has the wrong sign:

gradient = learningRate .* (trueClass(m) - predictedClass) .* transpose([1.0 features(m,:)])

It should be:

gradient = learningRate .* (predictedClass - trueClass(m)) .* transpose([1.0 features(m,:)])

See Andrew Ng's notes for details. The gradient of the cost with respect to the j-th parameter, for a single example, is: ∂J/∂θ_j = (h(x) - y) * x_j (where h(x) is the logistic function, y is the true label, and x is the feature vector).

Otherwise, with the sign flipped, subtracting your "gradient" is actually gradient ascent: every update increases the cost instead of decreasing it. I believe that's why you eventually get an infinite cost; it's a dead loop and you never get out of it.

The update rule should still be:

weightVector = weightVector - gradient
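For completeness, here is a minimal sketch of the whole corrected SGD loop. The variable names follow the question's snippets; the sigmoid helper, the zero initialization, and the epoch count are my assumptions:

sigmoid = @(z) 1 ./ (1 + exp(-z));               % logistic function h(x)
weightVector = zeros(8, 1);                      % bias weight + 7 feature weights (assumed init)
for epoch = 1:100                                % assumed number of passes over the data
    for m = 1:size(features, 1)
        x = transpose([1.0 features(m, :)]);     % prepend the bias term, as in the question
        predictedClass = sigmoid(weightVector' * x);
        gradient = learningRate .* (predictedClass - trueClass(m)) .* x;
        weightVector = weightVector - gradient;  % step against the gradient, toward lower cost
    end
end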