Here's the Jupyter Notebook I used for this practice: https://drive.google.com/file/d/18-OXyvXSit5x0ftiW9bhcqJrO_SE22_S/view?usp=sharing
I was practicing simple Linear Regression with this data set, and here are my parameters:
import numpy as np

# 'data' is the pandas DataFrame loaded from the SAT/GPA data set
sat = np.array(data['SAT'])
gpa = np.array(data['GPA'])

theta_0 = 0.01   # intercept, initial guess
theta_1 = 0.01   # slope, initial guess
alpha = 0.003    # learning rate
cost = 0
m = len(gpa)     # number of training examples
I tried to optimize the cost function calculation by turning the data into NumPy arrays and performing element-wise operations. This is the resulting formula I came up with:
$$J(\theta_0, \theta_1) = \frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)^2$$

where $x^{(i)}$ is the i-th SAT score and $y^{(i)}$ is the i-th GPA.

Cost function
def calculateCost(matrix_x, matrix_y, m):
    global theta_0, theta_1
    # vectorized cost: half the mean of the squared errors over all m examples
    cost = (1 / (2 * m)) * ((theta_0 + (theta_1 * matrix_x) - matrix_y) ** 2).sum()
    return cost
I also tried to do the same for gradient descent (the update rules I believe this implements are written out after the code below).
Gradient Descent
def gradDescent(alpha, matrix_x, matrix_y):
    global theta_0, theta_1, m, cost
    cost = calculateCost(sat, gpa, m)
    while cost > 1:
        # simultaneous update: compute both new values before overwriting the thetas
        temp_0 = theta_0 - alpha * (1 / m) * (theta_0 + theta_1 * matrix_x - matrix_y).sum()
        temp_1 = theta_1 - alpha * (1 / m) * (matrix_x.transpose() * (theta_0 + theta_1 * matrix_x - matrix_y)).sum()
        theta_0 = temp_0
        theta_1 = temp_1
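For reference, these are the update rules I believe the loop above implements (a simultaneous update with learning rate $\alpha$; this is my own transcription of the code, so please point out if it does not match):

$$\theta_0 := \theta_0 - \alpha \, \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)$$
$$\theta_1 := \theta_1 - \alpha \, \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right) x^{(i)}$$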
I am not entirely sure whether both implementations are correct. The implementation returned a cost of 114.89379821428574, and somehow this is what the "descent" looked like when I graphed the costs:
Gradient descent graph:
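The plotting code itself is not shown above; roughly, I recorded the cost at each iteration and plotted it with matplotlib, something like the sketch below (the fixed iteration count here is just for illustration and is not necessarily what my notebook does):

import matplotlib.pyplot as plt

costs = []
for _ in range(1000):                          # fixed number of steps, for the sketch only
    costs.append(calculateCost(sat, gpa, m))   # cost with the current theta_0, theta_1
    # one gradient step (same update as in gradDescent above)
    temp_0 = theta_0 - alpha * (1 / m) * (theta_0 + theta_1 * sat - gpa).sum()
    temp_1 = theta_1 - alpha * (1 / m) * (sat * (theta_0 + theta_1 * sat - gpa)).sum()
    theta_0, theta_1 = temp_0, temp_1

plt.plot(costs)
plt.xlabel('iteration')
plt.ylabel('cost')
plt.show()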
Please let me know whether I have implemented both the cost function and gradient descent correctly, and provide an explanation if possible, as I am still a beginner in multivariable calculus. Thank you.