1 vote

I am working on a simple script that tries to find values for my hypothesis. For one I am using gradient descent, and for the other the normal equation. The normal equation gives me the proper results, but my gradient descent does not, and I can't figure out why it isn't working in such a simple case.

Hi, I am trying to understand why my gradient descent does not match the normal equation for linear regression. I am using MATLAB to implement both. Here's what I tried:

So I created a dummy training set as such:

x = {1 2 3}, y = {2 3 4}

so my hypothesis should converge to theta = {1 1}, so I get a simple

h(x) = 1 + x;
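
Just as a sanity check of the target (Xcheck is only a throwaway name here): with the [ones, x] design matrix I use for the normal equation below, theta = [1; 1] reproduces y exactly:

Xcheck = [1 1; 1 2; 1 3];   % column of ones plus x
disp(Xcheck * [1; 1]);      % prints 2, 3, 4 -> matches y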

Here's the test code comparing normal equation and gradient descent:

clear;
disp("gradient descend");
X = [1; 2; 3];
y = [2; 3; 4];
theta = [0 0];
num_iters = 10;
alpha = 0.3;
thetaOut = gradientDescent(X, y, theta, 0.3, 10); % GD -> does not work, why?
disp(thetaOut);

clear;
disp("normal equation");
X = [1 1; 1 2; 1 3];
y = [2;3;4];
Xt = transpose(X);
theta = pinv(Xt*X)*Xt*y; % normal equation -> works!
disp(theta);
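
The pinv line is just the closed-form normal equation (using the pseudoinverse of X'*X):

\theta = (X^\top X)^{-1} X^\top y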

And here is the inner loop of the gradient descent:

samples = length(y);
for epoch = 1:iterations

     hipoth = X * theta;                              % current predictions
     factor = alpha * (1/samples);                    % learning rate scaled by 1/m
     theta = theta - factor * ((hipoth - y)' * X )';  % vectorized gradient step
     %disp(epoch);

end
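
Written out, the update that loop is doing should be the usual vectorized gradient step for squared error,

\theta := \theta - \frac{\alpha}{m}\, X^\top (X\theta - y)

with m the number of samples (the alpha * (1/samples) factor above).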

and the output after 10 iterations:

gradient descend = 1.4284 1.4284 -> wrong
normal equation  = 1.0000 1.0000 -> correct

That does not make sense; it should converge to [1, 1].

Any ideas? Do I have a MATLAB syntax problem?

thank you!

What you posted does not work, please post your real code. hipoth = X*theta is 3x2 and y is 3x1, so there's the first error. Also, MATLAB strings (in disp) use single quotes. – avermaet
@avermaet MATLAB "strings" are not the same as MATLAB 'char arrays'; double quotes have been valid syntax since R2016b. Your X*theta note is more valid, but be aware that implicit expansion has also been valid syntax since R2016b, so it might be that the OP just intended to use the element-wise multiplier .*. – Wolfie
@Wolfie Thanks for the clarification, I was not aware of that. Totally my mistake. – avermaet

2 Answers

0 votes

Gradient descent can solve a lot of different problems. You want to do a linear regression, i.e. find a linear function h(x) = theta_1 * X + theta_2 that best fits your data:

h(X) = Y + error

What the "best" fit is, is debatable. The most common way to define the best fit is to minimize the square of the errors between the fit and the actual data. Assuming that is what you want ...
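
Concretely, the cost being minimized and its two gradients (which D_t1 and D_t2 in the function below compute, with theta_1 as the slope and theta_2 as the intercept) are:

J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \big( Y_i - (\theta_1 X_i + \theta_2) \big)^2

\frac{\partial J}{\partial \theta_1} = -\frac{2}{n} \sum_{i=1}^{n} X_i \big( Y_i - \hat{Y}_i \big), \qquad \frac{\partial J}{\partial \theta_2} = -\frac{2}{n} \sum_{i=1}^{n} \big( Y_i - \hat{Y}_i \big)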

Replace the function with

function [theta] = gradientDescent(X, Y, theta, alpha, num_iters)
    n = length(Y);
    for epoch = 1:num_iters

        Y_pred = theta(1)*X + theta(2);        % predictions: slope*X + intercept
        D_t1 = (-2/n) * X' * (Y - Y_pred);     % gradient w.r.t. the slope theta(1)
        D_t2 = (-2/n) * sum(Y - Y_pred);       % gradient w.r.t. the intercept theta(2)
        theta(1) = theta(1) - alpha * D_t1;
        theta(2) = theta(2) - alpha * D_t2;

    end
end

and change your parameters a bit, e.g.

num_iters = 10000;
alpha = 0.05;

you get the correct answer. I took the code snippet from here, which might also provide a nice starting point for reading up on what is actually happening.
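
For example, a minimal driver with your data (the variable names here are just for illustration):

X = [1; 2; 3];
Y = [2; 3; 4];
thetaOut = gradientDescent(X, Y, [0 0], 0.05, 10000);
disp(thetaOut);   % should approach 1 1, i.e. slope 1 and intercept 1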

0 votes

Your gradient descent is solving a different thing than the normal equation: you are not feeding it the same data. On top of that, you seem to overcomplicate the theta update a bit, but that is not a problem. Minor changes in your code result in the proper output:

function theta=gradientDescent(X,y,theta,alpha,iterations)

samples = length(y);
for epoch = 1:iterations

     hipoth = X * theta;                        % X now includes the column of ones
     factor = alpha * (1/samples);
     theta = theta - factor * X'*(hipoth - y);  % same update, without the extra transposes
     %disp(epoch);

end
end

and the main code:

clear;
X = [1 1; 1 2; 1 3]; % same design matrix (with the column of ones) as the normal equation
y = [2;3;4];
theta = [0 0];
num_iters = 10;
alpha = 0.3;
thetaOut = gradientDescent(X, y, theta', 0.3, 600); % Iterate a bit more, you impatient person!

theta = pinv(X.'*X)*X.'*y; % normal equation -> works!


disp("gradient descend");
disp(thetaOut);
disp("normal equation");
disp(theta);
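
If you want to convince yourself the two now agree, a quick check like this should print something very close to zero:

disp(norm(thetaOut - theta));   % gradient descent vs. normal equation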