I am trying to implement logistic regression from scratch. What confuses me is that the weight starts out as a single random value, but as training proceeds the final result contains multiple weights, one for each data point in the training set. I have no idea what is happening here: the predictions work properly, yet it makes no sense to me to end up with multiple weights for a single feature. I've marked where my confusion arises in the code below.
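For context, my understanding of the batch gradient for a single feature is that it averages over the data points and should therefore be a single scalar:

$$\frac{\partial J}{\partial w} = \frac{1}{N}\sum_{i=1}^{N} x_i\left(\hat{y}_i - y_i\right), \qquad \hat{y}_i = \sigma(w\,x_i),$$

which is why getting an array-valued `weights` surprises me.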
import numpy as np

np.random.seed(100)

class LogisticRegression:

    def sigmoid(self, z):
        return 1 / (1 + np.exp(-z))

    def cost_function(self, X, y, weights):
        # Binary cross-entropy loss
        z = X * weights
        predict_1 = y * np.log(self.sigmoid(z))
        predict_0 = (1 - y) * np.log(1 - self.sigmoid(z))
        return -sum(predict_1 + predict_0) / len(X)

    def fit(self, X, y, epochs=250, lr=0.05):
        loss = []
        weights = np.random.rand()  # Initially weights here is a single number...
        N = len(X)
        for _ in range(epochs):
            # Gradient descent
            y_hat = self.sigmoid(X * weights)
            weights -= lr * X * (y_hat - y) / N  # ...but after this line the number
                                                 # of weights becomes equal to the
                                                 # number of data points...
            # Saving progress
            loss.append(self.cost_function(X, y, weights))
        self.weights = weights
        self.loss = loss
        print('weights:', weights)  # ...which gives a different weight for each
                                    # data point. How can I plot the final logistic
                                    # curve if I end up with multiple final weights?

    def predict(self, X):
        # Predicting with the sigmoid function
        z = X * self.weights
        # Returning binary results
        return [1 if i > 0.5 else 0 for i in self.sigmoid(z)]

# X is the 1-D feature array and y the binary labels (defined elsewhere)
clf = LogisticRegression()
clf.fit(X, y)
clf.predict(X)
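For what it's worth, below is a minimal sketch of what I expected the update (and later the plotting) to look like, assuming the per-point gradient contributions are supposed to be averaged into one scalar. The helper names gradient_step and plot_curve are just mine for illustration, and I'm not certain this is the right fix:

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# My guess at the corrected update: averaging the per-point gradient
# contributions keeps the weight a single scalar throughout training.
def gradient_step(weight, X, y, lr):
    y_hat = sigmoid(X * weight)
    grad = np.sum(X * (y_hat - y)) / len(X)  # scalar gradient for one feature
    return weight - lr * grad

# If the final weight w really is one number, I assume the fitted curve
# could then be plotted against the data like this:
def plot_curve(X, y, w):
    x_grid = np.linspace(X.min(), X.max(), 200)
    plt.scatter(X, y)
    plt.plot(x_grid, sigmoid(x_grid * w))
    plt.show()

Is summing over the data points like this the correct way to get back to a single weight?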