0 votes

I am trying to implement a logistic regression algorithm without calling any built-in MATLAB function, and afterwards I call MATLAB's logistic regression function mnrfit so I can cross-check that my algorithm works correctly.

The process I am implementing is as follows. I first make a vector x that holds the input data and a vector y of labels in {0, 1} that holds the corresponding class for every entry of x. I fit a linear regression to these data using gradient descent, and once I have extracted the coefficients I pass the fitted line through the sigmoid function. Later on I make a prediction for x = 10 to find the likelihood of class 1 for that input. Simple as that.
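In symbols (writing α for the code's Alpha and β for its Beta), the prediction I intend to make is

\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \hat{p}(y = 1 \mid x = 10) = \sigma(\alpha \cdot 10 + \beta)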

After that, I call the MATLAB function mnrfit and extract the coefficients for logistic regression. To make the same prediction I call the function mnrval with an argument of 10, since I want to predict for the input x = 10 as before. My results are different and I do not know why.

The two plots, showing the probability density function for each case, are shown at the end.

I also attach the code of the implementation.

% x is the continuous input and y is the category of every output [1 or 0]
x = (1:100)';   % independent variables x(s)
y(1:10)  = 0;    % Dependent variables y(s) -- class 0
y(11:100) = 1;    % Dependent variables y(s) -- class 1
y=y';
y = y(randperm(length(y))); % Random order of y array
x=[ones(length(x),1) x]; % This is done for vectorized code

%% Initialize Linear regression parameters

m = length(y); % number of training examples
% initialize fitting parameters - all zeros
Alpha = 0; % slope
Beta = 0;  % intercept (offset)
% Some gradient descent settings
% iterations must be a big number because we are taking very small steps.
iterations = 100000;
% Learning step must be small because the line must fit the data between 
% [0 and 1]
Learning_step_a = 0.0005;  % step parameter

%% Run Gradient descent 

fprintf('Running Gradient Descent ...\n')
for iter = 1:iterations
% In every iteration evaluate the current linear hypothesis
h= Alpha.*x(:,2)+ Beta.*x(:,1);
% Update line variables
Alpha=Alpha - Learning_step_a * (1/m)* sum((h-y).* x(:,2));
Beta=Beta - Learning_step_a * (1/m) *  sum((h-y).*x(:,1)); 
end

% This is my linear Model
LinearModel=Alpha.*x(:,2)+ Beta.*x(:,1);
% I pass it through a sigmoid !
LogisticRegressionPDF = 1 ./ (1 + exp(-LinearModel));
% Make a prediction for p(y==1|x==10)
Prediction1=LogisticRegressionPDF(10);

%% Confirmation with matlab function mnrfit

B=mnrfit(x(:,2),y+1); % Find Logistic Regression Coefficients
mnrvalPDF = mnrval(B,x(:,2));
% Make a prediction .. p(y==1|x==10)
Prediction2=mnrvalPDF(10,2);

%% Plotting Results 

% Plot Logistic Regression Results ...
figure;
plot(x(:,2),y,'g*');
hold on
plot(x(:,2),LogisticRegressionPDF,'k--');
hold off
title('My Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');

% Plot Logistic Regression Results (mnrfit) ...      
figure,plot(x(:,2),y,'g*');
hold on   
plot(x(:,2),mnrvalPDF(:,2),'--k') 
hold off   
title('mnrval Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');

Why are my plots (as well as the predictions) in each case not the same?

  • the output you extract will differ on every execution, because the order of ones and zeros in the y vector is random (seeding the RNG, as sketched below, makes runs reproducible).
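A minimal sketch of how to make the shuffle reproducible across runs: seed MATLAB's random number generator before calling randperm (rng is a built-in; the seed value 0 is arbitrary):

rng(0);                       % fix the seed so randperm returns the same order every run
y = y(randperm(length(y)));   % the shuffle is now identical on every execution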

[Figure: the two plots, my logistic regression PDF and the mnrval PDF]

Any suggestions please? – Konstantinos Monachopoulos
Your comment did not notify anyone. You could have commented on my answer to let me know you edited the question; as is, I was unaware of the edit until now. You ask why the plots are different. But logistic regression is not the same thing as linear regression composed with a sigmoid. Mathematically, there is no reason to expect the same result from these two procedures. – user3717023
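To make that point concrete: the question's code first solves the least-squares problem and only afterwards applies the sigmoid, whereas logistic regression fits the coefficients with the sigmoid inside the objective, so the fitted (α, β) generally differ:

\min_{\alpha, \beta} \sum_{i=1}^{m} (\alpha x_i + \beta - y_i)^2 \quad \text{(linear regression; sigmoid applied afterwards)}

\max_{\alpha, \beta} \sum_{i=1}^{m} \left[ y_i \log \sigma(\alpha x_i + \beta) + (1 - y_i) \log\big(1 - \sigma(\alpha x_i + \beta)\big) \right] \quad \text{(logistic regression log-likelihood)}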
Yes, I see that my comment did not notify anyone; I am surprised, because it is a simple question. If my question is wrong, someone should tell me that I am asking something incomprehensible. Anyway, can you explain to me why it is not the same (logistic regression vs. a line passed through a sigmoid)? From the examples on the internet, that is what I understood. How can I properly implement logistic regression in non-vectorized MATLAB code? – Konstantinos Monachopoulos

2 Answers

1 vote

I developed my own logistic regression algorithm using the gradient descent method. For "good" training data, my algorithm had no choice but to converge on the same solution as mnrfit. For "less good" training data, my algorithm did not converge to mnrfit's solution. Its coefficients and the associated model could predict the outcome well enough, but not as well as mnrfit. Plotting the residuals revealed that mnrfit's residuals were essentially zero (on the order of 9×10^-200), whereas mine were merely close to zero (around 0.00001). I tried changing the learning rate alpha, the number of steps, and the initial theta guess, but doing so only produced different theta results. When I tweaked these parameters on a good data set, my theta converged closer to mnrfit's.
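A minimal sketch of that residual check, assuming the hand-rolled model's predicted class-1 probabilities sit in a column vector p_mine (a placeholder name, not from the answer), x is the predictor column (without the column of ones), y is the 0/1 label vector, and B is the coefficient vector returned by mnrfit:

% Compare residuals of the two fits
p_mnr    = mnrval(B, x);        % mnrval returns one probability column per category
res_mine = y - p_mine;          % hand-rolled model: y - P(class 1 | x)
res_mnr  = y - p_mnr(:, 2);     % mnrfit model: column 2 is the category encoded as 2 (y == 1)
fprintf('max |residual| -- mine: %g, mnrfit: %g\n', ...
        max(abs(res_mine)), max(abs(res_mnr)));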

1 vote

Thanks a lot for the information, user3779062. Inside the PDF file is all I wanted. I had already implemented stochastic gradient descent, so the only difference I had to make to implement logistic regression was to pass the hypothesis function through a sigmoid inside the loop, and to change the order (as well as the sign) in the theta update rule. The results are the same as mnrval. I ran the code on many examples and the results match most of the time (especially if the data set is good and carries plenty of information about both classes). I attach the final code and one random result from the result set.
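Written out, that update rule is the standard gradient ascent step on the log-likelihood (θ₀ and θ₁ play the roles of Beta and Alpha in the code below, and η is Learning_step_a):

\theta_j \leftarrow \theta_j + \eta \, \frac{1}{m} \sum_{i=1}^{m} \big( y_i - \sigma(\theta^{\top} x_i) \big) \, x_{ij}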

% Machine Learning : Logistic Regression

% Logistic regression works like linear regression, but its output
% specifies the probability of belonging to one category or the other.
% At the beginning we create a well-defined data set that can easily
% be fitted by a sigmoid function.

clear all; close all; clc;

% This example runs many times to compare a lot of results
for examples=1:10:100
clearvars -except examples

%% Create Training Data

% x is the continuous input and y is the category of every output [1 or 0]
x = (1:100)';   % independent variables x(s)
y(1:examples)  = 0;    % Dependent variables y(s) -- class 0
y(examples+1:100) = 1;    % Dependent variables y(s) -- class 1
y=y';
y = y(randperm(length(y))); % Random order of y array
x=[ones(length(x),1) x]; % This is done for vectorized code

%% Initialize logistic regression parameters

m = length(y); % number of training examples
% initialize fitting parameters - all zeros
Alpha = 0; % slope
Beta = 0;  % intercept (offset)
% Some gradient descent settings
% iterations must be a big number because we are taking very small steps.
iterations = 100000;
% Learning step must be small because the line must fit the data between 
% [0 and 1]
Learning_step_a = 0.0005;  % step parameter

%% Run Gradient descent 

fprintf('Running Gradient Descent ...\n')
for iter = 1:iterations

% Linear hypothesis function 
h= Alpha.*x(:,2)+ Beta.*x(:,1);

% Nonlinear hypothesis function (sigmoid)
hx = 1 ./ (1 + exp(-h));

% Update coefficients
Alpha=Alpha + Learning_step_a * (1/m)* sum((y-hx).* x(:,2));
Beta=Beta + Learning_step_a * (1/m) *  sum((y-hx).*x(:,1));

end

% Make a prediction for p(y==1|x==10)
Prediction1=hx(10)

%% Confirmation with matlab function mnrfit

B=mnrfit(x(:,2),y+1); % Find Logistic Regression Coefficients
mnrvalPDF = mnrval(B,x(:,2));
% Make a prediction .. p(y==1|x==10)
Prediction2=mnrvalPDF(10,2)

%% Plotting Results 

% Plot Logistic Regression Results ...
figure;
subplot(1,2,1),plot(x(:,2),y,'g*');
hold on
subplot(1,2,1),plot(x(:,2),hx,'k--');
hold off
title('My Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');

% Plot Logistic Regression Results (mnrfit) ...      
subplot(1,2,2),plot(x(:,2),y,'g*');
hold on   
subplot(1,2,2),plot(x(:,2),mnrvalPDF(:,2),'--k') 
hold off   
title('mnrval Logistic Regression PDF')
xlabel('continuous input');
ylabel('probability density function');
end

Results:

[Figure: side-by-side plots of my logistic regression fit and the mnrval fit]

Thanks a lot!!