I have a problem understanding the difference between an MLP and an SLP.
I know that an MLP has more than one layer (the hidden layers) and that its neurons have a nonlinear activation function, like the logistic function (needed for gradient descent). But I have read that:
"if all neurons in an MLP had a linear activation function, the MLP could be replaced by a single layer of perceptrons, which can only solve linearly separable problems"
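To make the quoted statement concrete, here is a minimal sketch of how I read it. The weights, biases, and function names are hypothetical, chosen only for illustration, and "linear activation" is taken to mean the identity function. Under those assumptions, two stacked linear layers compute the same thing as one layer with W = W2·W1 and b = W2·b1 + b2:

```python
# Hypothetical weights and biases, chosen only for illustration.
W1 = [[1.0, -2.0], [0.5, 3.0]]   # hidden layer: 2 inputs -> 2 units
b1 = [0.5, -1.0]
W2 = [[2.0, -1.0]]               # output layer: 2 hidden units -> 1 output
b2 = [0.25]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def add(u, v):
    """Add two vectors element-wise."""
    return [a + b for a, b in zip(u, v)]


def two_layer_linear(x):
    """Two layers whose activation is the identity (i.e. truly linear)."""
    h = add(matvec(W1, x), b1)
    return add(matvec(W2, h), b2)


# Collapse the two layers into a single one.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)


def single_layer(x):
    """One linear layer that should match two_layer_linear."""
    return add(matvec(W, x), b)


for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(x, two_layer_linear(x), single_layer(x))
```

The two columns printed for each input should agree, which is what the statement claims for activations that are linear in this sense.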
I don't understand why, in the specific case of XOR, which is not linearly separable, the equivalent MLP is a two-layer network in which every neuron has a linear activation function, like the step function. I understand that I need two lines for the separation, but in this case I cannot apply the rule from the previous statement (replacing the MLP with an SLP).
MLP for XOR:
http://s17.postimg.org/c7hwv0s8f/xor.png
In the linked image, the neurons A, B, and C have a linear activation function (like the step function).
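For reference, this is a minimal sketch of the kind of network I mean. The weights and thresholds are hypothetical (the ones in the image may differ), and I am assuming A and B are the hidden neurons and C is the output neuron, each using a step function that outputs 1 when its weighted input is positive and 0 otherwise:

```python
def step(z):
    """Step activation: 1 if the weighted input is positive, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    # Hypothetical weights; the network in the image may use others.
    a = step(x1 + x2 - 0.5)   # neuron A: fires if at least one input is 1
    b = step(x1 + x2 - 1.5)   # neuron B: fires only if both inputs are 1
    c = step(a - b - 0.5)     # neuron C: fires if A fires and B does not
    return c


for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_mlp(x1, x2))
```

With these assumed weights the network outputs 0, 1, 1, 0 for the inputs (0,0), (0,1), (1,0), (1,1), i.e. XOR, even though every neuron uses the step function.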