I am trying to create a very simple neural network: one hidden layer with 2 neurons, for some very simple data with only one feature.
import numpy as np
# One feature; label 1 on the middle segment [11, 20] and 0 elsewhere
X = np.concatenate([np.linspace(0, 10, 100), np.linspace(11, 20, 100), np.linspace(21, 30, 100)])
y = np.concatenate([np.repeat(0, 100), np.repeat(1, 100), np.repeat(0, 100)])
Here is the model:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(2, activation='sigmoid'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(X, y, epochs=1000)
In theory, this model should be able to fit this data. But even after 1000 epochs, the accuracy is still stuck at 0.6667, which is exactly the proportion of the majority class (200 of the 300 samples are labelled 0).
Epoch 999/1000
10/10 [==============================] - 0s 1ms/step - loss: 0.5567 - accuracy: 0.6667
Epoch 1000/1000
10/10 [==============================] - 0s 2ms/step - loss: 0.5566 - accuracy: 0.6667
I think I did something wrong. Could you suggest some modifications?
It seems that there are a lot of local minima and that the initialization can change the final model. That is the case when testing with the nnet package in R: I had to try many seeds, and that is how I found this model (among others).
And this is the structure that I wanted to create with Keras: one hidden layer with 2 neurons, with a sigmoid activation function.
So I am wondering whether Keras has the same problem with initialization. I assumed that nnet in R is not a "perfect" package and that Keras would perform better. If the initialization is that important, does Keras try different initializations? If not, why not? Maybe because, in general, with more data (and more features), it works well enough without testing many initializations?
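To make it concrete, here is a rough sketch of what I mean by testing different initializations by hand (my own assumption that re-running with a different tf.random.set_seed is enough to get different initial weights):

import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense

def train_with_seed(seed, X, y):
    tf.random.set_seed(seed)  # a different seed should give different initial weights
    model = Sequential()
    model.add(Dense(2, activation='sigmoid'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
    model.fit(X, y, epochs=1000, verbose=0)
    return model.evaluate(X, y, verbose=0)[1]  # final training accuracy

# accuracies = [train_with_seed(s, X, y) for s in range(10)]

But that is exactly the manual seed search I had to do with nnet, so I would like to know if Keras is supposed to handle this itself.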
By comparison, with k-means it seems that different initializations are tried automatically.
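For instance, if I understand its n_init parameter correctly, scikit-learn's KMeans re-runs the algorithm with different centroid seeds and keeps the best run:

from sklearn.cluster import KMeans

# n_init re-runs k-means with different centroid seeds and keeps the run
# with the lowest inertia (X needs a 2D shape here, hence the reshape)
km = KMeans(n_clusters=3, n_init=10).fit(X.reshape(-1, 1))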


