I have been trying to get the following neural network to act as a simple AND gate, but it does not seem to work. Here is my code:

import numpy as np

def sigmoid(x,derivative=False):
    if(derivative==True):
        return x*(1-x)
    return 1/(1+np.exp(-x))

np.random.seed(1)

weights = np.array([0,0,0])

training = np.array([[[1,1,1],1],
                    [[1,1,0],0],
                    [[1,0,1],0],
                    [[1,0,0],0]])

for iter in xrange(training.shape[0]):
    # forwardPropagation:
    a_layer1 = training[iter][0]
    z_layer2 = np.dot(weights,a_layer1)
    a_layer2 = sigmoid(z_layer2)
    hypothesis_theta = a_layer2

    # backPropagation:
    delta_neuron1_layer2 = a_layer2 - training[iter][1]
    Delta_neuron1_layer2 = np.dot(a_layer2,delta_neuron1_layer2)
    update = Delta_neuron1_layer2/training.shape[0]
    weights = weights-update

x = np.array([1,0,1])

print weights
print sigmoid(np.dot(weights,x))

The program above keeps returning strange values: the test input x = [1,0,1] produces a higher output than [1,1,1] does. The first element of each training/testing input is the bias unit. The code is based on Andrew Ng's Machine Learning course on Coursera: https://www.coursera.org/learn/machine-learning

Thanks in advance for your assistance.

I can bet anything your array shapes are causing unwanted broadcasted operations to take place, causing the learning to get all screwed up. - cs95

1 Answer

A few pointers:

  1. Neural networks need a LOT of data. You cannot pass in a handful of samples and expect the network to learn much.
  2. You are working with lists and 1D arrays instead of 2D arrays. This is risky with numpy because it broadcasts silently whenever the shapes are compatible, so an operation can succeed and still return a result of a shape you did not intend.
  3. You are not using the sigmoid derivative in your backpropagation as you should.
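
To make point 2 concrete, here is a small sketch of the kind of silent broadcast that can derail the error term. The names `predictions`, `labels`, `bad_error` and `good_error` and the numbers are made up for illustration; they are not taken from your code.

```python
import numpy as np

# Hypothetical network outputs as a column vector, shape (4, 1)
predictions = np.array([[0.9], [0.2], [0.1], [0.3]])
# Hypothetical targets as a 1D array, shape (4,)
labels = np.array([1, 0, 0, 0])

# (4, 1) minus (4,) does not raise an error: numpy broadcasts both
# operands to (4, 4), so every prediction is compared with every label.
bad_error = predictions - labels

# Making the label shape explicit keeps the subtraction element-wise.
good_error = predictions - labels.reshape(-1, 1)

print(bad_error.shape)
print(good_error.shape)
```

Checking `.shape` after each step is a cheap way to catch this before it reaches the weight update.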

I've reshaped your arrays and added more training samples.

import numpy as np

def sigmoid(x,derivative=False):
    if(derivative==True):
        return x*(1-x)
    return 1/(1+np.exp(-x))

np.random.seed(1)

weights = np.random.randn(1, 3)

training = np.array([[np.array([0, 0, 0]).reshape(1, -1), 1],
                    [np.array([0,0,1]).reshape(1, -1), 0],
                    [np.array([0,1,0]).reshape(1, -1), 0],
                    [np.array([0,1,1]).reshape(1, -1), 0],
                    [np.array([1, 0, 0]).reshape(1, -1), 1],
                    [np.array([1,0, 1]).reshape(1, -1), 0],
                    [np.array([1,1,0]).reshape(1, -1), 0],
                    [np.array([1,1,1]).reshape(1, -1), 1],

                    ])

for iter in xrange(training.shape[0]):
    # forwardPropagation:
    a_layer1 = training[iter][0]
    z_layer2 = np.dot(weights,a_layer1.reshape(-1, 1))
    a_layer2 = sigmoid(z_layer2)
    hypothesis_theta = a_layer2

    # backPropagation:
    delta_neuron1_layer2 = (a_layer2 - training[iter][1]) * sigmoid(a_layer2, derivative=True)
    Delta_neuron1_layer2 = np.dot(delta_neuron1_layer2, a_layer1)
    update = Delta_neuron1_layer2
    weights = weights - update


x = np.array([0,0, 1])
print sigmoid(np.dot(weights,x.reshape(-1, 1)))

x = np.array([0,1,1])
print sigmoid(np.dot(weights,x.reshape(-1, 1)))

x = np.array([1,1,1])
print sigmoid(np.dot(weights,x.reshape(-1, 1))) 

Output:

[[ 0.34224604]]
[[ 0.19976054]]
[[ 0.52710321]]

It's not clean, and there's certainly room for improvement. But at least you've got something now: the inputs expected to produce 0 give values closer to 0 than the input expected to produce 1.
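
As one possible direction for that improvement, here is a sketch that is not the code above. It makes three assumptions: the bias-first layout from the question, the plain AND truth table as targets, and many passes over the same four rows instead of a single pass. The `learning_rate` and `epochs` values and the batch update are illustrative choices, not tuned settings. It is written to run on Python 3, unlike the Python 2 snippets in this thread.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Columns: bias unit, input 1, input 2 (bias-first, as in the question)
X = np.array([[1, 0, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)
# AND truth table as a column vector, shape (4, 1)
y = np.array([[0], [0], [0], [1]], dtype=float)

np.random.seed(1)
weights = np.random.randn(3, 1)   # explicit 2D shape (3, 1)

learning_rate = 1.0   # assumed value, for illustration only
epochs = 10000        # assumed value, for illustration only

for _ in range(epochs):
    # Forward pass for all four rows at once, result has shape (4, 1)
    a = sigmoid(X.dot(weights))
    # Squared-error delta: error times sigmoid derivative a * (1 - a)
    delta = (a - y) * a * (1 - a)
    # Sum each row's gradient contribution, result has shape (3, 1)
    weights -= learning_rate * X.T.dot(delta)

predictions = sigmoid(X.dot(weights))
print(predictions.round(3))
```

Keeping `X`, `y` and `weights` as explicit 2D arrays means every product and subtraction has one unambiguous shape, which avoids the broadcasting surprises mentioned earlier.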