9
votes

I've been taking a class on neural networks and don't really understand why I get different accuracy scores from logistic regression and from a two-layer neural network (input layer and output layer), where the output layer uses the sigmoid activation function. From what I've learned, we can use the sigmoid activation function in a neural network to compute a probability, which should be very similar, if not identical, to what logistic regression computes; from there we backpropagate to minimize the error using gradient descent. There is probably an easy explanation, but I don't understand why the accuracy scores vary so much. In this example I'm not using any training or test sets, just simple data to demonstrate what I don't understand.
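To make that concrete: both models compute a sigmoid of a linear combination of the inputs. A minimal NumPy sketch, with made-up weights w and bias b (hypothetical values, purely for illustration):

import numpy as np

def sigmoid(z):
    # Squashes any real number into (0, 1), so it can be read as a probability
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a 2-feature input
w = np.array([0.01, -0.02])
b = -0.5

x = np.array([200, 100])       # one sample
p = sigmoid(np.dot(w, x) + b)  # P(y = 1 | x), the same form logistic regression uses
print(p)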

I am getting 71.4% accuracy for logistic regression. In the example below I just made up numbers for the feature matrix X and the outcome array y. I purposely made the numbers in X higher when the outcome is 1 so that a linear classifier can achieve some accuracy.

import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[200, 100], [320, 90], [150, 60], [170, 20], [169, 75], [190, 65], [212, 132]])
y = np.array([[1], [1], [0], [0], [0], [0], [1]])

clf = LogisticRegression()
clf.fit(X, y)
clf.score(X, y)  # This results in a 71.4% accuracy score for logistic regression

However, when I implement a neural network with no hidden layers, just the sigmoid activation function on the single-node output layer (so two layers in total, input and output), my accuracy score is around 42.9%. Why is this so different from the logistic regression accuracy score, and why is it so low?

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()

# A network with 2 input nodes and a single sigmoid output node (two layers in total)
model.add(Dense(units=1, activation='sigmoid', input_dim=2))
model.summary()
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=12)

model.evaluate(X, y)  # The accuracy score will now show 42.9% for the neural network

1 Answer

10
votes

You're not comparing the same thing. Sklearn's LogisticRegression sets a lot of defaults that you are not matching in your Keras implementation. I actually get accuracies within 1e-8 of each other when accounting for these differences, the main ones being:

Number of Iterations

In Keras this is the epochs argument passed to fit(); you set it to 12. In sklearn it is max_iter, passed to LogisticRegression's __init__(), which defaults to 100.
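For instance, to give both models the same budget, a sketch reusing X, y, and the compiled model from the question (100 simply mirrors sklearn's default):

from sklearn.linear_model import LogisticRegression

# sklearn: the solver stops after at most max_iter iterations (default 100)
lr = LogisticRegression(max_iter=100)
lr.fit(X, y)

# Keras: fit() makes `epochs` full passes over the data, so match the count
model.fit(X, y, epochs=100)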

Optimizer

You are using the adam optimizer in Keras, whereas LogisticRegression uses the liblinear solver (sklearn's term for optimizer) by default.
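To align these, pick the solver/optimizer explicitly on both sides; a sketch (note that Keras's 'sgd' and sklearn's 'sag' are only roughly comparable, not the same algorithm):

from sklearn.linear_model import LogisticRegression

# sklearn: choose the solver instead of the liblinear default
lr = LogisticRegression(solver='sag')

# Keras: choose the optimizer at compile time
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])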

Regularization

Sklearn's LogisticRegression applies L2 regularization by default, while you are not doing any weight regularization in Keras. In sklearn this is controlled by the penalty parameter; in Keras you can regularize the weights via each layer's kernel_regularizer.
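To equalize this, either add an L2 penalty in Keras or weaken sklearn's; a sketch (C is sklearn's inverse regularization strength, so a large C approximates no penalty):

from keras.layers import Dense
from keras.regularizers import l2
from sklearn.linear_model import LogisticRegression

# Keras: L2 penalty on the layer's weights (pass 0. to disable it entirely)
layer = Dense(units=1, activation='sigmoid', kernel_regularizer=l2(0.01), input_shape=(2,))

# sklearn: large C = weak penalty, approximating an unregularized fit
lr = LogisticRegression(penalty='l2', C=1e6)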

These implementations both achieve an accuracy of 0.5714 (57.14%):

import numpy as np

X = np.array([
  [200, 100], 
  [320, 90], 
  [150, 60], 
  [170, 20], 
  [169, 75], 
  [190, 65], 
  [212, 132]
])
y = np.array([[1], [1], [0], [0], [0], [0], [1]])

Logistic Regression

from sklearn.linear_model import LogisticRegression

# 'sag' is stochastic average gradient descent
lr = LogisticRegression(penalty='l2', solver='sag', max_iter=100)

lr.fit(X, y)
lr.score(X, y)
# 0.5714285714285714

Neural Network

from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2

model = Sequential([
  Dense(units=1, activation='sigmoid', kernel_regularizer=l2(0.), input_shape=(2,))
])

model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X, y, epochs=100)
model.evaluate(X, y)
# 0.57142859697341919
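As a sanity check, you can also compare the learned parameters directly; a quick sketch, assuming lr and model have been fit as above:

# sklearn exposes the fitted weights and intercept as attributes
print(lr.coef_, lr.intercept_)

# Keras returns the kernel and bias of the Dense layer
w, b = model.get_weights()
print(w.ravel(), b)

# If the setups really match, the two sets of numbers should be close.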