
I have trained an XOR neural network in MATLAB and got these weights:

iw: [-2.162 2.1706; 2.1565 -2.1688]

lw: [-3.9174 -3.9183]

b{1} [2.001; 2.0033]

b{2} [3.8093]

Out of curiosity, I tried to write MATLAB code that computes the output of this network (2 neurons in the hidden layer and 1 in the output layer, with the TANSIG activation function).

Here is the code:

l1w = [-2.162 2.1706; 2.1565 -2.1688];
l2w = [-3.9174 -3.9183];
b1w = [2.001 2.0033];
b2w = [3.8093];

input = [1, 0];

out1 = tansig (input(1)*l1w(1,1) + input(2)*l1w(1,2) + b1w(1));
out2 = tansig (input(1)*l1w(2,1) + input(2)*l1w(2,2) + b1w(2));
out3 = tansig (out1*l2w(1) + out2*l2w(2) + b2w(1))

The problem is that when the input is, say, [1,1], it outputs -0.9989, and for [0,1] it outputs 0.4902, whereas simulating the network generated by MATLAB gives 0.00055875 and 0.99943, respectively.

What am I doing wrong?

Why don't you post the actual code you used to build and train the network? – Amro

2 Answers


I wrote a simple example of an XOR network. I used newpr, which defaults to the tansig transfer function for both the hidden and output layers.

input = [0 0 1 1; 0 1 0 1];               %# each column is an input vector
ouputActual = [0 1 1 0];

net = newpr(input, ouputActual, 2);       %# 1 hidden layer with 2 neurons
net.divideFcn = '';                       %# use the entire input for training

net = init(net);                          %# initialize net
net = train(net, input, ouputActual);     %# train
outputPredicted = sim(net, input);        %# predict

Then we check the result by computing the output ourselves. The important thing to remember is that, by default, inputs and outputs are scaled to the [-1,1] range:

scaledIn = (2*input - 1);           %# from [0,1] to [-1,1]
for i=1:size(input,2)
    in = scaledIn(:,i);             %# i-th input vector
    hidden(1) = tansig( net.IW{1}(1,1)*in(1) + net.IW{1}(1,2)*in(2) + net.b{1}(1) );
    hidden(2) = tansig( net.IW{1}(2,1)*in(1) + net.IW{1}(2,2)*in(2) + net.b{1}(2) );
    out(i) = tansig( hidden(1)*net.LW{2,1}(1) + hidden(2)*net.LW{2,1}(2) + net.b{2} );
end
scaledOut = (out+1)/2;              %# from [-1,1] to [0,1]

Or, expressed more compactly as a matrix product:

scaledIn = (2*input - 1);           %# from [0,1] to [-1,1]
out = tansig( net.LW{2,1} * tansig( net.IW{1}*scaledIn + repmat(net.b{1},1,size(input,2)) ) + repmat(net.b{2},1,size(input,2)) );
scaledOut = (1 + out)/2;            %# from [-1,1] to [0,1]
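
As a rough check, the same scaling can be applied to the weights posted in the question. This is only a sketch: it assumes the trained network used the [0,1] to [-1,1] mapping described above for both inputs and outputs, it uses the built-in tanh in place of tansig (the two are mathematically equivalent) so it runs without the toolbox, and the variable names are made up for illustration.

```matlab
% Weights and biases copied from the question
IW = [-2.162 2.1706; 2.1565 -2.1688];   % hidden-layer weights (2x2)
LW = [-3.9174 -3.9183];                 % output-layer weights (1x2)
b1 = [2.001; 2.0033];                   % hidden-layer biases (column vector)
b2 = 3.8093;                            % output-layer bias

X = [0 0 1 1; 0 1 0 1];                 % each column is an input vector
n = size(X, 2);

Xs = 2*X - 1;                           % assumed input scaling: [0,1] -> [-1,1]
H  = tanh(IW*Xs + repmat(b1, 1, n));    % hidden layer; tanh(x) equals tansig(x)
y  = tanh(LW*H + b2);                   % raw network output, in [-1,1]
ys = (y + 1)/2                          % assumed output scaling: [-1,1] -> [0,1]
```

Under these assumptions, the second column (input [0,1]) and the fourth column (input [1,1]) should come out close to the 0.99943 and 0.00055875 that the question reports from simulating the network.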

You usually don't use a sigmoid on your output layer--are you sure you should have the tansig on out3? And are you sure you are looking at the weights of the appropriately trained network? It looks like you've got a network trained to do XOR on [1,1] [1,-1] [-1,1] and [-1,-1], with +1 meaning "xor" and -1 meaning "same".