I am trying to implement a convolutional neural network in Python. The architecture is as follows: INPUT -> [Convolution -> Sigmoid -> Pooling] -> [Convolution -> Sigmoid -> Pooling] -> Fully Connected Layer -> Hidden Layer -> Output.
Input shape: 28*28
Filter/weight shape for convolutional layer 1: 20*1*5*5
Filter/weight shape for convolutional layer 2: 40*20*5*5
Activation function: sigmoid (1/(1+e^-x))
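The layer sizes implied by these shapes can be traced with a short sketch. It assumes "valid" convolutions with stride 1 and a 2*2 non-overlapping pooling window; the pooling size is not stated above, so `POOL` is an assumption, and `conv_out` and `pool_out` are hypothetical helper names.

```python
def conv_out(size, kernel):
    """Side length after a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Side length after non-overlapping pooling."""
    return size // pool


POOL = 2  # assumption: the pooling window is not given in the post

side = 28  # input is 28*28
shapes = []
for n_filters, kernel in [(20, 5), (40, 5)]:
    side = conv_out(side, kernel)
    shapes.append(("conv", n_filters, side, side))
    side = pool_out(side, POOL)
    shapes.append(("pool", n_filters, side, side))

for shape in shapes:
    print(shape)
```

Under these assumptions, each output value of convolutional layer 2 is a sum over 20*5*5 = 500 weighted inputs.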
Because the filters are large, the dot product in convolutional layer 2 produces values of about 20 or higher, so the sigmoid activation that follows outputs all 1's.
Output at convolutional layer 1:
[ 0.75810452 0.79819809 0.70897314 0.50897858 0.02901152 0.98447587
0.99995668 0.99999814 0.99912627 0.7885211 0.87708188 0.76611807]
...
...
Output at convolutional layer 2:
[ 19.88641441 20.11005634 20.04984707 20.19106394 19.93096274
20.1585536 19.84757161 19.79030395]
...
...
Output after applying sigmoid on convolutional layer 2:
[ 1. 1. 1. 1. 1. 1. 1. 1.]
...
...
[ 1. 1. 1. 0.99999 1. 1. 1. 1.]
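The saturation can be reproduced in isolation. The sketch below applies the sigmoid to the first four convolutional layer 2 values listed above; the names `sigmoid`, `gap` and `gradient` are illustrative and not taken from the original code.

```python
import numpy as np


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + np.exp(-x))


# First four pre-activation values reported for convolutional layer 2.
pre_activations = np.array([19.88641441, 20.11005634, 20.04984707, 20.19106394])

activated = sigmoid(pre_activations)
gap = 1.0 - activated                     # how far each output is from exactly 1
gradient = activated * (1.0 - activated)  # sigmoid derivative at these points

print(activated)
print(gap)
print(gradient)
```

For inputs near 20 the outputs differ from 1 by only about 2e-9, which is why they print as 1., and the derivative is equally small, so gradients flowing back through this layer would be close to zero.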
I have found a similar question on this forum: Neural Network sigmoid function. I did not make the mistakes pointed out in Tim's answer, but I could not figure out what to do about this part of it:
"Finally, even with these changes, a fully-connected neural network with all positive weights will probably still produce all 1's for the output. You can either include negative weights corresponding to inhibitory nodes, or reduce connectivity significantly (e.g. with a 0.1 probability that a node in layer n connects to a node in layer n+1)."
Should I normalize the output after applying sigmoid on convolutional layer 2, or try something else?
EDIT: Input data:
[[ 3. 0. 0. 3. 7. 3. 0. 3. 0. 11. 0. 0.
3. 0. 0. 3. 8. 0. 0. 3. 0. 0. 0. 2.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 1. 5. 0. 12. 0.
16. 0. 0. 4. 0. 2. 8. 3. 0. 4. 8. 0.
0. 0. 0. 0.]
[ 0. 0. 2. 0. 0. 0. 1. 2. 1. 12. 0. 8.
0. 0. 6. 0. 11. 0. 0. 6. 7. 2. 0. 0.
0. 0. 0. 0.]
[ 0. 1. 3. 0. 0. 2. 3. 0. 0. 0. 12. 0.
0. 23. 0. 0. 0. 0. 11. 3. 0. 0. 4. 0.
0. 0. 0. 0.]
[ 0. 1. 1. 0. 0. 2. 0. 0. 6. 0. 25. 27.
136. 135. 188. 89. 84. 25. 0. 0. 3. 1. 0. 0.
0. 0. 0. 0.]
[ 4. 0. 0. 0. 0. 0. 0. 0. 3. 88. 247. 236.
255. 249. 250. 227. 240. 136. 37. 1. 0. 2. 2. 0.
0. 0. 0. 0.]
[ 2. 0. 0. 3. 0. 0. 4. 27. 193. 251. 253. 255.
255. 255. 255. 240. 254. 255. 213. 89. 0. 0. 14. 1.
0. 0. 0. 0.]
[ 0. 0. 0. 6. 0. 0. 18. 56. 246. 255. 253. 243.
251. 255. 245. 255. 255. 254. 255. 231. 119. 7. 0. 5.
0. 0. 0. 0.]
[ 4. 0. 0. 12. 13. 0. 65. 190. 246. 255. 255. 251.
255. 109. 88. 199. 255. 247. 250. 255. 234. 92. 0. 0.
0. 0. 0. 0.]
[ 0. 10. 1. 0. 0. 18. 163. 248. 255. 235. 216. 150.
128. 45. 6. 8. 22. 212. 255. 255. 252. 172. 0. 15.
0. 0. 0. 0.]
[ 0. 1. 4. 5. 0. 0. 187. 255. 254. 94. 57. 7.
1. 0. 6. 0. 0. 139. 242. 255. 255. 218. 62. 0.
0. 0. 0. 0.]
[ 5. 2. 0. 0. 11. 56. 252. 235. 253. 20. 5. 2.
5. 1. 0. 1. 2. 0. 97. 249. 248. 249. 166. 8.
0. 0. 0. 0.]
[ 0. 0. 2. 0. 0. 70. 255. 255. 245. 25. 10. 0.
0. 1. 0. 4. 10. 0. 10. 255. 246. 250. 155. 0.
0. 0. 0. 0.]
[ 2. 0. 7. 12. 0. 87. 226. 255. 184. 0. 3. 0.
10. 5. 0. 0. 0. 9. 0. 183. 251. 255. 222. 15.
0. 0. 0. 0.]
[ 0. 5. 1. 0. 19. 230. 255. 243. 255. 35. 2. 0.
0. 0. 0. 9. 8. 0. 0. 70. 245. 242. 255. 14.
0. 0. 0. 0.]
[ 0. 4. 3. 0. 19. 251. 239. 255. 247. 30. 1. 0.
4. 4. 14. 0. 0. 2. 0. 47. 255. 255. 247. 21.
0. 0. 0. 0.]
[ 6. 0. 2. 2. 0. 173. 247. 252. 250. 28. 10. 0.
0. 8. 0. 0. 0. 8. 0. 67. 249. 255. 255. 12.
0. 0. 0. 0.]
[ 0. 0. 6. 3. 0. 88. 255. 251. 255. 188. 21. 0.
15. 0. 8. 2. 16. 0. 35. 200. 247. 251. 134. 4.
0. 0. 0. 0.]
[ 0. 3. 3. 1. 0. 11. 211. 247. 249. 255. 189. 76.
0. 0. 4. 0. 2. 0. 169. 255. 255. 247. 47. 0.
0. 0. 0. 0.]
[ 0. 6. 0. 0. 2. 0. 59. 205. 255. 240. 255. 182.
41. 56. 28. 33. 42. 239. 246. 251. 238. 157. 0. 1.
0. 0. 0. 0.]
[ 2. 1. 0. 0. 2. 10. 0. 104. 239. 255. 240. 255.
253. 247. 237. 255. 255. 250. 255. 239. 255. 100. 0. 1.
0. 0. 0. 0.]
[ 1. 0. 3. 0. 0. 7. 0. 4. 114. 255. 255. 255.
255. 247. 249. 253. 251. 254. 237. 251. 89. 0. 0. 1.
0. 0. 0. 0.]
[ 0. 0. 9. 0. 0. 1. 13. 0. 14. 167. 255. 246.
253. 255. 255. 254. 242. 255. 244. 61. 0. 19. 0. 1.
0. 0. 0. 0.]
[ 2. 1. 7. 0. 0. 4. 0. 14. 0. 27. 61. 143.
255. 255. 252. 255. 149. 21. 6. 16. 0. 0. 7. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0.]]
Weights for convolutional layer 1:
[[[-0.01216923 -0.00584966 0.04876327 0.04628595 0.05644253]
[-0.03813031 -0.0304277 0.05728934 -0.01358741 -0.02875361]
[ 0.04929296 0.05958448 0.05497736 0.04699187 -0.04964543]
[ 0.01874465 0.05793848 0.03988833 -0.02355133 -0.05672331]
[ 0.03986748 -0.06098319 0.01299825 -0.00239702 -0.01750711]]]
[[[-0.02474246 0.0423619 -0.02130952 0.00718671 0.02677802]
[ 0.04151089 0.04336411 -0.03549197 -0.01935773 0.04035303]
[ 0.01466489 -0.01117737 0.0081063 0.01310948 0.01900553]
[-0.01723775 0.0148552 -0.03563556 -0.04108806 0.01764391]
[ 0.03932499 -0.00911049 0.00443425 -0.0388128 0.01646769]]
...........
...........
Weights at convolutional layer 2:
[[-0.02894977 -0.00163836 0.0416469 -0.00195158 0.03194728]
[ 0.02618844 -0.00961595 -0.03348994 0.04460359 0.03113144]
[ 0.04166139 -0.02487885 0.02173471 -0.00147136 0.00803713]
[ 0.02262536 -0.03310476 -0.00949261 -0.0450313 0.03128755]
[-0.01181284 0.00558957 -0.02410718 0.01706195 0.01151338]]
[[ 0.04118888 -0.01306432 -0.01013332 0.03423443 0.03135569]
[ 0.00471491 0.02169717 0.00583819 -0.02421325 -0.01708062]
[-0.01244262 -0.00934037 0.00605259 -0.03825137 -0.00606101]
[-0.01699741 0.01311037 0.0307442 0.04153474 -0.00470464]
[-0.02592571 -0.01203504 0.04052782 0.03150989 0.02740532]]
.........
.........
The weights were initialized using Xavier initialization:
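import numpy

# n_in and n_out are taken from the 28*28 input and 24*24 output sizes
n_in = 28 * 28
n_out = 24 * 24
w_bound = numpy.sqrt(6. / float(n_in + n_out))
# filters for convolutional layer 2, drawn uniformly from [-w_bound, w_bound)
filters = numpy.random.uniform(-w_bound, w_bound, (40, 20, 5, 5))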
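For comparison, one common convention for Xavier (Glorot) uniform initialization of a convolutional layer derives the fan-in and fan-out from the filter shape rather than from the image size. The sketch below is only an illustration of that convention, not the code used above; `glorot_uniform_conv` is a hypothetical helper, and some implementations additionally divide the fan-out by the pooling area.

```python
import numpy as np


def glorot_uniform_conv(filter_shape, rng):
    """Draw filters uniformly from [-w_bound, w_bound) using filter-based fans.

    filter_shape is (n_filters, n_input_channels, kernel_height, kernel_width).
    """
    n_filters, n_channels, k_h, k_w = filter_shape
    fan_in = n_channels * k_h * k_w   # inputs feeding one output value
    fan_out = n_filters * k_h * k_w   # outputs each input value contributes to
    w_bound = np.sqrt(6.0 / (fan_in + fan_out))
    filters = rng.uniform(-w_bound, w_bound, size=filter_shape)
    return filters, w_bound


rng = np.random.default_rng(0)  # fixed seed only to make the sketch repeatable
filters, w_bound = glorot_uniform_conv((40, 20, 5, 5), rng)
print(filters.shape, w_bound)
```

For the 40*20*5*5 filters this gives a fan-in of 500 and a fan-out of 1000, so the bound is sqrt(6/1500), about 0.063, which is close to the bound of about 0.066 that follows from n_in = 28*28 and n_out = 24*24.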