From Wikipedia:
If a multilayer perceptron has a linear activation function in all neurons, that is, a linear function that maps the weighted inputs to the output of each neuron, then it is easily proved with linear algebra that any number of layers can be reduced to the standard two-layer input-output model (see perceptron).
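If I read that claim in matrix form (my own notation: W1, b1 for the hidden layer, W2, b2 for the output layer), I think it amounts to:

y = W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)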
I have seen a Multilayer Perceptron replaced with a Single Layer Perceptron, and what I understood is that this is possible because a combination of linear functions can itself be expressed as a single linear function, and that this is the only reason. Am I right?
So what does the reduction process look like? I.e., if we had a 3x5x2 MLP, what would the SLP look like? Is the size of the input layer based on the number of parameters used to express the linear function, as in the answer linked above?:
f(x) = a x + b
g(z) = c z + d
g(f(x)) = c (a x + b) + d = ac x + cb + d = (ac) x + (cb + d)
So would it have 4 inputs (a, b, c, d, since it is a combination of two linear functions with different parameters)?
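For reference, here is a quick numpy sketch of the 3x5x2 case I mean (made-up random weights; the layer shapes are my assumption of how the weights would be stored):

import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)  # hidden layer: 3 inputs -> 5 units
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)  # output layer: 5 units -> 2 outputs

x = rng.normal(size=3)
two_layer = W2 @ (W1 @ x + b1) + b2                   # the 3x5x2 MLP with linear activations

W, b = W2 @ W1, W2 @ b1 + b2                          # composed weight (2x3) and bias (2,)
one_layer = W @ x + b                                 # a single 3 -> 2 linear layer

print(np.allclose(two_layer, one_layer))              # prints True

This seems to keep 3 inputs and 2 outputs, which is partly why I am unsure about the 4-inputs reasoning above.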
Thanks in advance!