I've been following Andrew Ng's CS229 machine learning course and am now covering logistic regression. The goal is to maximize the log likelihood function and find the values of theta that do so. The lecture notes are here: [http://cs229.stanford.edu/notes/cs229-notes1.ps][1] (pages 16-19). The code below was shown on the course homepage (in MATLAB, though; I converted it to Python).
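For reference, these are the quantities involved as I understand them from the notes and as implemented in the code below (the hypothesis, the log likelihood being maximized, and the batch gradient ascent update):

```latex
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}
\qquad
\ell(\theta) = \sum_{i=1}^{m} \Big[ y^{(i)} \log h_\theta\big(x^{(i)}\big)
  + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta\big(x^{(i)}\big)\big) \Big]
\qquad
\theta := \theta + \alpha \, X^{T}\big(y - h_\theta(X)\big)
```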
I'm applying it to a data set with 100 training examples (a data set given on the Coursera homepage for an introductory machine learning course). The data has two features, which are scores on two exams. The output is 1 if the student was admitted and 0 if the student was not admitted. All of the code is shown below.

The following code causes the likelihood function to converge to a maximum of about -62. The corresponding values of theta are [-0.05560301 0.01081111 0.00088362]. Using these values, when I test a training example like [1, 30.28671077, 43.89499752], whose label should be 0, I obtain 0.576, which makes no sense to me. If I test the hypothesis function with input [1, 10, 10] I obtain 0.515, which again makes no sense. These values should correspond to a lower probability. This has me quite confused.
```python
import numpy as np

def sigmoid(z):
    # stands in for my local sig module, which provides the same function
    return 1.0 / (1.0 + np.exp(-z))

def batchlogreg(X, y):
    max_iterations = 800
    alpha = 0.00001
    (m, n) = np.shape(X)
    # prepend a column of ones for the intercept term
    X = np.insert(X, 0, 1, 1)
    theta = np.zeros(n + 1)
    ll = np.zeros(max_iterations)
    for i in range(max_iterations):
        hx = sigmoid(np.dot(X, theta))
        d = y - hx
        # batch gradient ascent step on the log likelihood
        theta = theta + alpha * np.dot(np.transpose(X), d)
        ll[i] = np.sum(y * np.log(hx) + (1 - y) * np.log(1 - hx))
    return (theta, ll)
```
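For completeness, this is roughly how I call the function and test the hypothesis on the example mentioned above. A minimal sketch: the file name `ex2data1.txt` and its layout (two exam scores per row, then the 0/1 admission label, comma-separated) are assumptions about the Coursera exercise data.

```python
import numpy as np

# Data loading details are assumed; adjust to the actual file layout.
data = np.loadtxt('ex2data1.txt', delimiter=',')
X, y = data[:, :2], data[:, 2]

theta, ll = batchlogreg(X, y)
print(theta, ll[-1])

# Evaluate the hypothesis on a single example (note the leading 1
# for the intercept, matching the column batchlogreg prepends).
x_test = np.array([1, 30.28671077, 43.89499752])
print(sigmoid(np.dot(x_test, theta)))  # I get about 0.576 here
```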
One comment on the question points at the learning rate:

> Why such a small learning rate (`alpha`)? It's usually not that small, so you might not be training your model properly. Try 0.1, 0.001 and so on. - IVlad
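That suggestion is easy to try: sweep a few learning rates and compare the final log likelihood each one reaches. This is a minimal sketch, assuming `batchlogreg` is modified to accept `alpha` as an argument (the version above hard-codes it):

```python
# Compare final log likelihoods across learning rates; assumes
# batchlogreg(X, y, alpha) instead of a hard-coded alpha.
for alpha in (0.1, 0.01, 0.001, 0.00001):
    theta, ll = batchlogreg(X, y, alpha=alpha)
    # With unscaled exam scores (roughly 0-100), large rates can saturate
    # the sigmoid and push the log likelihood to -inf/nan.
    print(f"alpha={alpha}: final ll={ll[-1]}")
```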