I'm experimenting with a model combining a convolutional neural network with a linear model. Here is a simplified version of it:
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, GlobalAveragePooling1D, Dropout, Dense
from tensorflow.keras.experimental import WideDeepModel, LinearModel

num_classes = 1  # binary target (0='NO' or 1='YES')

cnn_model = Sequential()
cnn_model.add(Conv1D(20, 8, padding='same', activation='relu'))
cnn_model.add(GlobalAveragePooling1D())
cnn_model.add(Dropout(0.6))
cnn_model.add(Dense(num_classes, activation='sigmoid'))
linear_model = LinearModel()
combined_model = WideDeepModel(linear_model, cnn_model)
combined_model.compile(optimizer=['sgd', 'adam'],
                       loss=['mse', 'binary_crossentropy'],
                       metrics=['accuracy'])
Performance is very good and everything seemed to be going well until I sorted the predictions by pval. There are predictions >1, even though I'm using a sigmoid activation, which I thought was supposed to squash everything between 0 and 1. The linear model has no activation function (but its inputs are all scaled to 0-1):
pv = combined_model.predict([dplus_test, X_test])
pval = [a[0] for a in pv]
pred = [1 if a > threshold else 0 for a in pval]
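For reference, this is the sigmoid behavior I'm assuming: a quick pure-Python check (independent of Keras) that the logistic function keeps any finite input strictly between 0 and 1, up to floating-point limits:

```python
import math

def sigmoid(x):
    # Standard logistic function: 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

# Even fairly extreme inputs stay strictly inside (0, 1)
for x in [-30.0, -1.0, 0.0, 1.0, 30.0]:
    assert 0.0 < sigmoid(x) < 1.0
```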
true pred pval dplus
1633 1 1 1.002850 15.22404
1326 1 1 1.001444 10.34983
1289 1 1 1.001368 10.03043
1371 1 1 1.000986 10.74037
1188 1 1 1.000707 8.902
I checked the other end of the data, and those predictions are as I expected: always >0.
true pred pval dplus
145 0 0 0.000463 1.81635
383 0 0 0.001023 3.24982
1053 0 0 0.001365 7.22535
This is not a problem so far; nothing crashes and I'm happy with the performance. But I would like to know whether my understanding of the sigmoid activation function is wrong, whether something in the combined model allows values to go above 1, and whether I can trust these results.
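One hypothesis I'm considering (an assumption about the combiner, not something I've confirmed in the Keras source): if WideDeepModel simply adds the linear submodel's output to the CNN's sigmoid output, the sum is no longer bounded by 1. A pure-NumPy sketch of that idea, with made-up numbers:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical per-sample pre-activation values (made-up, for illustration)
dnn_out = sigmoid(np.array([4.0, 2.0, -3.0]))   # CNN head: each value in (0, 1)
linear_out = np.array([0.02, 0.001, -0.01])     # linear head: unbounded output

# If the combiner just sums the two heads with no final activation,
# the combined score can land slightly above 1 (or below 0)
combined = dnn_out + linear_out
print(combined)  # first entry slightly above 1
```

That would be consistent with the values I'm seeing, which only barely exceed 1 (e.g. 1.00285).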