I implemented gradient descent, and it worked when I tested it on a small sample dataset, but it does not work on the Boston dataset.
Can you check what is wrong with the code? Why am I not getting a correct theta vector?
import numpy as np
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

X = load_boston().data
y = load_boston().target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
X_train1 = np.c_[np.ones((len(X_train), 1)), X_train]
X_test1 = np.c_[np.ones((len(X_test), 1)), X_test]

eta = 0.0001
n_iterations = 100
m = len(X_train1)
tol = 0.00001
theta = np.random.randn(14, 1)

for i in range(n_iterations):
    gradients = 2/m * X_train1.T.dot(X_train1.dot(theta) - y_train)
    if np.linalg.norm(X_train1) < tol:
        break
    theta = theta - (eta * gradients)
After the loop, my weight vector theta has shape (14, 354) instead of the (14, 1) it was initialized with. What am I doing wrong here?
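
For reference, the shape behaviour can be reproduced without the Boston data. The sketch below is only an illustration under assumptions: it uses synthetic data with 354 rows and 13 features as a stand-in for the training split (load_boston was removed in scikit-learn 1.2), and the names rng, true_theta, y_col, bad_residual and bad_gradients, as well as the learning rate of 0.1, are hypothetical choices for this example. It shows that subtracting a 1-D target of shape (354,) from a (354, 1) prediction broadcasts to (354, 354), which would explain a (14, 354) gradient, and it tries one possible fix: reshaping the target to a column vector and checking the tolerance against the gradient norm rather than the norm of X_train1.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the training split (assumption: 354 rows, 13 features).
m, n_features = 354, 13
X_train = rng.normal(size=(m, n_features))
X_train1 = np.c_[np.ones((m, 1)), X_train]        # bias column added -> (354, 14)

# Hypothetical "true" weights, used only to generate a noiseless target.
true_theta = rng.normal(size=(n_features + 1, 1))
y_train = (X_train1 @ true_theta).ravel()         # 1-D target, shape (354,)

theta = rng.normal(size=(n_features + 1, 1))      # (14, 1)

# The shape problem: (354, 1) - (354,) broadcasts to (354, 354).
bad_residual = X_train1.dot(theta) - y_train
bad_gradients = 2 / m * X_train1.T.dot(bad_residual)
print(bad_residual.shape, bad_gradients.shape)

# One possible fix: make the target a column vector so the shapes line up.
y_col = y_train.reshape(-1, 1)                    # (354, 1)

eta = 0.1            # assumed learning rate for this unit-variance synthetic data
n_iterations = 1000
tol = 1e-5

for i in range(n_iterations):
    gradients = 2 / m * X_train1.T.dot(X_train1.dot(theta) - y_col)  # (14, 1)
    # Stop when the gradient is small; the norm of X_train1 never changes.
    if np.linalg.norm(gradients) < tol:
        break
    theta = theta - eta * gradients

print(theta.shape)
```

The synthetic features here have unit variance, so a fairly large step size converges. The real Boston features are on very different scales, so the same loop may need standardized inputs or a different eta to behave well; that part is not tested in this sketch.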