
I'm new to neural networks. I've been trying to implement a two-layer network to learn the XOR function using the backpropagation algorithm. The hidden layer has 2 units and the output layer has 1 unit. All units use the sigmoid activation function.

I'm initializing the weights between -1 and +1, and each unit has a fixed +1 bias.
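Roughly, the initialization looks like this (a sketch; `rng` is just a local java.util.Random, and the actual code on ideone may differ slightly):

java.util.Random rng = new java.util.Random();
for (int i = 0; i < w.length; ++i)
    w[i] = 2 * rng.nextDouble() - 1;  // uniform in [-1, +1)
bias = 1.0;                           // fixed +1 bias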

The problem is that the network learns XOR only in a small fraction of runs when re-initialized from scratch with different random weights. It learns the other Boolean functions (AND, OR) in very few iterations, almost every time, for almost all random initializations. With XOR, however, it fails to converge for many random weight initializations.

I'm using stochastic gradient descent with backpropagation for learning.
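Concretely, for each training example one update step runs in this order, using the methods shown below (a simplified sketch; `hidden`, `out`, and the Example fields `input`/`target` stand in for the actual names in my code, and it assumes the output unit's `w` and `error` fields are readable here):

double[] h = new double[hidden.length];
for (int i = 0; i < hidden.length; ++i)
    h[i] = hidden[i].computeOutput(ex.input);      // forward pass: hidden layer
out.computeOutput(h);                              // forward pass: output unit
out.computeError(ex.target);                       // delta at the output
for (int i = 0; i < hidden.length; ++i)
    hidden[i].computeError(out.w[i], out.error);   // backpropagate using the old weights
out.fixError();                                    // then update the output weights
for (int i = 0; i < hidden.length; ++i)
    hidden[i].fixError();                          // and the hidden weights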

My code: http://ideone.com/IYPW2N

Hidden unit's computeOutput function:

public double computeOutput(double[] input){
    this.input = input;
    // weighted sum of the inputs plus the (fixed) bias
    output = bias + w[0]*input[0] + w[1]*input[1];
    // sigmoid activation
    output = 1/(1 + Math.exp(-output));
    return output;
}

Hidden unit's computeError function:

public double computeError(double w, double outputUnitError){
    // hidden delta = sigmoid'(net) * (weight to output unit) * (output unit's delta)
    error = output*(1 - output)*(outputUnitError*w);
    return error;
}

Hidden unit's fixError function (updating the weights):

public void fixError(){
    // gradient step on each input weight (the bias is left unchanged)
    for(int i = 0; i < input.length; ++i) w[i] += n*error*input[i];
}

Output unit's computeOutput function:

public void computeOutput(double[] input) {
    this.input = input;
    // weighted sum of the hidden-layer outputs plus the (fixed) bias
    output = bias + input[0]*w[0] + input[1]*w[1];
    // sigmoid activation
    output = 1/(1 + Math.exp(-output));
}

Output unit's computeError function:

public void computeError(double t){
    this.t = t;
    // output delta = sigmoid'(net) * (target - output)
    error = output*(1 - output)*(t - output);
}

Output unit's fixError function (updating the weights):

public void fixError() {
    // gradient step on each hidden-to-output weight (bias again unchanged)
    for(int i = 0; i < w.length; ++i) w[i] += n*error*input[i];
}

Training stops as soon as all examples are classified correctly within an iteration; otherwise it stops when the number of iterations exceeds 90,000.

The learning rate is set to 0.05. If the output unit's value is greater than 0.5, it's counted as a +1, otherwise as a 0.
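Putting it together, the training loop is roughly this (a sketch; `trainOne` and `classify` stand in for the actual helper methods, and `target` for the Example field):

int iterations = 0;
while (iterations <= 90000) {
    boolean allCorrect = true;
    for (Example ex : examples) {
        trainOne(ex);                    // one SGD update on this example
        if (classify(ex) != ex.target)   // threshold the output at 0.5
            allCorrect = false;
    }
    ++iterations;
    if (allCorrect) break;               // every example classified correctly
}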

Training Examples:

static Example[] examples = {
        new Example(new double[]{0, 0}, 0),
        new Example(new double[]{0, 1}, 1),
        new Example(new double[]{1, 0}, 1),
        new Example(new double[]{1, 1}, 0)
};

Output from code:

Iterations > 90000, stop...
Displaying outputs for all examples... 
0.018861254512881773
0.7270271284494716
0.5007550527204925
0.5024353957353963

Training complete. No of iterations = 45076
Displaying outputs for all examples... 
0.3944511789979849
0.5033004761575361
0.5008283246200929
0.2865272493546562

Training complete. No of iterations = 39707
Displaying outputs for all examples... 
0.39455754434259843
0.5008762488126696
0.5029579167912538
0.28715696580224176

Iterations > 90000, stop...
Displaying outputs for all examples... 
0.43116164638530535
0.32096730276984053
0.9758219334403757
0.32228953888593287

I've tried several learning rates and increased the number of hidden units, but it still doesn't learn XOR.

Please point out where I'm going wrong, or whether there's a bug in the implementation.

I've checked other threads but didn't find a satisfactory solution to my problem.


1 Answer


You are supposed to learn the bias too, but your code treats the bias as a constant (you should have a weight connected to the bias and update it like the other weights). Without a learnable bias, you will not be able to learn XOR.
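For example, in each unit's fixError you would also update the bias with the same delta rule, treating its input as a constant +1 (a sketch based on your snippets above):

public void fixError() {
    for (int i = 0; i < w.length; ++i) w[i] += n * error * input[i];
    bias += n * error;   // learn the bias: its "input" is always 1
}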