
I've implemented the following neural network to solve the XOR problem in Python. The network consists of an input layer of 2 neurons, a hidden layer of 2 neurons, and an output layer of 1 neuron. I am using the sigmoid function as the activation function for both the hidden layer and the output layer. Can someone please explain what I have done wrong?

import numpy
import scipy.special

class NeuralNetwork:
    def __init__(self, inputNodes, hiddenNodes, outputNodes, learningRate):
        self.iNodes = inputNodes
        self.hNodes = hiddenNodes
        self.oNodes = outputNodes

        self.wIH = numpy.random.normal(0.0, pow(self.iNodes, -0.5), (self.hNodes, self.iNodes))
        self.wOH = numpy.random.normal(0.0, pow(self.hNodes, -0.5), (self.oNodes, self.hNodes))

        self.lr = learningRate

        self.activationFunction = lambda x: scipy.special.expit(x)

    def train(self, inputList, targetList):
        inputs = numpy.array(inputList, ndmin=2).T
        targets = numpy.array(targetList, ndmin=2).T

        #print(inputs, targets)
        hiddenInputs = numpy.dot(self.wIH, inputs)
        hiddenOutputs = self.activationFunction(hiddenInputs)

        finalInputs = numpy.dot(self.wOH, hiddenOutputs)
        finalOutputs = self.activationFunction(finalInputs)

        outputErrors = targets - finalOutputs

        hiddenErrors = numpy.dot(self.wOH.T, outputErrors)

        self.wOH += self.lr * numpy.dot((outputErrors * finalOutputs * (1.0 - finalOutputs)), numpy.transpose(hiddenOutputs))
        self.wIH += self.lr * numpy.dot((hiddenErrors * hiddenOutputs * (1.0 - hiddenOutputs)), numpy.transpose(inputs))

    def query(self, inputList):
        inputs = numpy.array(inputList, ndmin=2).T

        hiddenInputs = numpy.dot(self.wIH, inputs)
        hiddenOutputs = self.activationFunction(hiddenInputs)

        finalInputs = numpy.dot(self.wOH, hiddenOutputs)
        finalOutputs = self.activationFunction(finalInputs)

        return finalOutputs



nn = NeuralNetwork(2, 2, 1, 0.01)

data = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
epochs = 10

for e in range(epochs):
    for record in data:
        inputs = numpy.asfarray(record[1:])
        targets = record[0]
        #print(targets)
        #print(inputs, targets)
        nn.train(inputs, targets)


print(nn.query([0, 0]))
print(nn.query([1, 0]))
print(nn.query([0, 1]))
print(nn.query([1, 1]))

1 Answer


There are several issues.

  1. I don't think you should be applying the activation function to everything, especially in your query function. I think you have muddled up the idea of neuron-to-neuron weightings (wIH and wOH) with the activation values.

  2. Because of this muddle you have missed the chance to re-use your query function as part of your training. Think of it as feeding activation levels forward to the output, comparing the result with the target output to get an array of errors, and then feeding those errors backwards, using the derivative of the sigmoid function, to adjust the weightings (see the sketch below).
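
For example, here is a minimal sketch of what I mean by point 2, assuming a shared helper I'm calling feedForward (my name, not yours) that both query and train can use; it keeps your wIH/wOH names and sigmoid on both layers:

# inside the NeuralNetwork class
def feedForward(self, inputs):
    # one place for the forward pass, shared by query() and train();
    # inputs must already be a column vector
    hiddenOutputs = self.activationFunction(numpy.dot(self.wIH, inputs))
    finalOutputs = self.activationFunction(numpy.dot(self.wOH, hiddenOutputs))
    return hiddenOutputs, finalOutputs

def train(self, inputList, targetList):
    inputs = numpy.array(inputList, ndmin=2).T
    targets = numpy.array(targetList, ndmin=2).T

    # feed the activations forward to the output
    hiddenOutputs, finalOutputs = self.feedForward(inputs)

    # compare with the target to get the errors, then feed them backwards
    outputErrors = targets - finalOutputs
    hiddenErrors = numpy.dot(self.wOH.T, outputErrors)

    # adjust the weightings using the derivative of the sigmoid, y * (1 - y)
    self.wOH += self.lr * numpy.dot(outputErrors * finalOutputs * (1.0 - finalOutputs), hiddenOutputs.T)
    self.wIH += self.lr * numpy.dot(hiddenErrors * hiddenOutputs * (1.0 - hiddenOutputs), inputs.T)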

I would write the function and its derivative in directly rather than importing from scipy, as they are so simple. Also, it's often recommended to use tanh and its derivative for the hidden layer (I can't remember exactly why; it's probably not needed for this simple net):

import numpy as np

# transfer functions
def sigmoid(x):
  return 1 / (1 + np.exp(-x))

# derivative of sigmoid
def dsigmoid(y):
  return y * (1.0 - y)

# using tanh over logistic sigmoid for the hidden layer is recommended  
def tanh(x):
  return np.tanh(x)

# derivative for tanh sigmoid
def dtanh(y):
  return 1 - y*y
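
If you do switch the hidden layer to tanh, here is a rough sketch (again just illustrative, with made-up names, and assuming inputs and targets are column vectors) of how these functions and their derivatives slot into a single training step:

# hypothetical training step: tanh on the hidden layer, sigmoid on the output;
# updates the weight arrays wIH and wOH in place
def train_step(wIH, wOH, inputs, targets, lr):
  hidden = tanh(np.dot(wIH, inputs))
  output = sigmoid(np.dot(wOH, hidden))

  # compare with the target and feed the errors backwards
  outputErrors = targets - output
  hiddenErrors = np.dot(wOH.T, outputErrors)

  # each layer uses the derivative of its own activation function
  wOH += lr * np.dot(outputErrors * dsigmoid(output), hidden.T)
  wIH += lr * np.dot(hiddenErrors * dtanh(hidden), inputs.T)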

Finally, you might be able to follow what I did a while ago with a neural net using just numpy, here: https://github.com/paddywwoof/Machine-Learning/blob/master/perceptron.py