I recently got introduced to the magical world of neural networks. I started following the book Neural Networks and Deep Learning, which implements a NN to recognize handwritten digits: a 3-layer network (1 input, 1 hidden, and 1 output layer) trained on the MNIST data set.
I just found that the weight matrices of two NNs with the same [784, 30, 10] architecture, trained on the same data set, are very different. The same is true for the biases.
General intuition says that since we are using multiple epochs and shuffling the data at each epoch, the weight matrices of both NNs should converge to similar values. But they turn out to be very different. What could be the reason for this?
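For reference, here is roughly how I set up and trained the two networks. This is a minimal sketch using the network.py and mnist_loader modules from the book; the hyperparameters shown (30 epochs, mini-batch size 10, eta = 3.0) are the book's defaults, and whatever values I actually used were identical for both runs:

    import mnist_loader
    import network  # network.py from the book

    # Load MNIST exactly as the book does.
    training_data, validation_data, test_data = mnist_loader.load_data_wrapper()

    # Two networks with the identical [784, 30, 10] architecture.
    net1 = network.Network([784, 30, 10])
    net2 = network.Network([784, 30, 10])

    # Train both with the same data, epochs, mini-batch size, and eta.
    # (With the Python 3 port, wrap training_data in list() before reusing it.)
    net1.SGD(training_data, 30, 10, 3.0, test_data=test_data)
    net2.SGD(training_data, 30, 10, 3.0, test_data=test_data)

    # Compare the first-layer weight matrices of the two runs.
    print(net1.weights[0])
    print(net2.weights[0])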
Here are the first few weights of NN1:
[array([[-1.2129184 , -0.08418661, -1.58413842, ..., 0.14350188,
1.49436597, -1.71864906],
[ 0.25485346, -0.1795214 , 0.14175609, ..., 0.4222159 ,
1.28005992, -1.17403326],
[ 1.09796094, 0.66119858, 1.12603969, ..., 0.23220572,
-1.66863656, 0.02761243],
...
Here are the first few weights of NN2, which has the same number of layers and was trained using the same training data, epochs, and eta:
[array([[-0.87264811, 0.34475347, -0.04876076, ..., -0.074056 ,
0.10218085, -0.50177084],
[-1.96657944, -0.35619652, 1.10898861, ..., -0.53325862,
-1.52680967, 0.26800431],
[-1.24731848, 0.13278103, -1.70306514, ..., 0.07964225,
-0.88724451, -0.40311485],
...,