
I've implemented the following neural network to solve the XOR problem in Python. My network consists of an input layer of 3 neurons, one hidden layer of 2 neurons, and an output layer of 1 neuron. I am using the sigmoid function as the activation for both the hidden layer and the output layer:

import numpy as np

x = np.array([[0,0,1], [0,1,1],[1,0,1],[1,1,1]])
y = np.array([[0,1,1,0]]).T


np.random.seed(1)
weights1 = np.random.random((3,2)) - 1
weights2 = np.random.random((2,1)) - 1


def nonlin(x, deriv=False):
    # sigmoid activation and its derivative
    if deriv:
        return x*(1-x)
    return 1/(1+np.exp(-x))

for iter in xrange(10000):
    # forward pass
    z2 = np.dot(x, weights1)
    a2 = nonlin(z2)
    z3 = np.dot(a2, weights2)
    a3 = nonlin(z3)

    # backpropagation
    error = y - a3
    delta3 = error * nonlin(z3, deriv=True)
    l1error = delta3.dot(weights2.T)
    delta2 = l1error * nonlin(z2, deriv=True)

    # weight updates
    weights2 += np.dot(a2.T, delta3)
    weights1 += np.dot(x.T, delta2)

print(a3)

The backpropagation seems to be correct, but I keep getting this error and all the values become 'nan'. Output:

RuntimeWarning: overflow encountered in exp
return 1/(1+np.exp(-x))

RuntimeWarning: overflow encountered in multiply
return x*(1-x)
[[ nan]
[ nan]
[ nan]
[ nan]]

Could you please help me with this problem? Thank you.

Debug to find out where NaN is being introduced. Once NaN is used in a math equation, everything else involved becomes NaN, and it contaminates your results. Look up what causes NaN to appear in Python maths, and find the point where you're causing it to appear. – Carcigenicate
Oh, just read it again, and it tells you exactly where it's happening already. Your exponentiation is overflowing. Your exponent is probably too huge. Check the value of x when the overflow happens; it's probably huge for some reason. – Carcigenicate
You do not need that many neurons to solve XOR. It can be solved with 2 inputs, 2 neurons in the hidden layer and one in the output layer. – SumNeuron
@SumNeuron I have used 2 neurons in the hidden layer and one in the output layer, like you said. – vardaan
@vardaan You also have an "input layer of 3 neurons". – SumNeuron

1 Answer


You have an issue with exploding weights:

weights1 =  [[ -6.25293101e+194  -2.22527234e+000]
             [  2.24755436e+193  -2.44789058e+000]
             [ -2.40600808e+194  -1.62490517e+000]]

This happens because, when you calculate the deltas for back-propagation, you pass the output of the dot product instead of the output of the activation function. nonlin(x, deriv=True) returns x*(1-x), which equals the sigmoid derivative only when x is already the sigmoid output (since sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z))); passing the raw dot products z3 and z2 gives values that are not derivatives at all and can be arbitrarily large, so the weight updates blow up.
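You can see the difference numerically with a minimal sketch (z and sig_out below are made-up example values, not variables from your code):

import numpy as np

z = np.array([5.0, -20.0, 50.0])   # raw dot products can be arbitrarily large
sig_out = 1 / (1 + np.exp(-z))     # sigmoid outputs always lie in (0, 1)

# correct derivative: a*(1-a), never larger than 0.25
print(sig_out * (1 - sig_out))

# what the original code computes: z*(1-z), unbounded (e.g. -2450 for z = 50)
print(z * (1 - z))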

Correction to your code:

 delta3  = error * nonlin(a3, deriv=True)
 l1error = delta3.dot(weights2.T)
 delta2  = l1error * nonlin(a2, deriv=True)
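
For reference, here is a sketch of the whole script with only those two derivative calls changed; everything else, including the Python 2 xrange and the original weight initialisation, is kept from the question. With the fix the updates stay bounded, so the overflow warnings should go away and the printed outputs should move toward 0, 1, 1, 0, although how close they get within 10000 iterations depends on the random initialisation.

import numpy as np

x = np.array([[0,0,1], [0,1,1], [1,0,1], [1,1,1]])
y = np.array([[0,1,1,0]]).T

np.random.seed(1)
weights1 = np.random.random((3,2)) - 1
weights2 = np.random.random((2,1)) - 1

def nonlin(x, deriv=False):
    if deriv:
        return x*(1-x)              # valid only when x is a sigmoid output
    return 1/(1+np.exp(-x))

for iter in xrange(10000):          # use range() on Python 3
    # forward pass
    z2 = np.dot(x, weights1)
    a2 = nonlin(z2)
    z3 = np.dot(a2, weights2)
    a3 = nonlin(z3)

    # backpropagation: derivatives taken on the activations a3 and a2
    error = y - a3
    delta3 = error * nonlin(a3, deriv=True)
    l1error = delta3.dot(weights2.T)
    delta2 = l1error * nonlin(a2, deriv=True)

    # weight updates
    weights2 += np.dot(a2.T, delta3)
    weights1 += np.dot(x.T, delta2)

print(a3)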