I am using an AlexNet with pretrained weights (from heuritech/convnets-keras) for a classification problem with 8 classes instead of 1000. After initialising the network with Model(input=..., output=...) and loading the pretrained weights, I drop the last two layers, Dense(1000) and Activation('softmax'), and add two layers of my own: Dense(8) and Activation('softmax').
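In other words, the head swap I am after looks roughly like this (just a sketch of my intent, not my actual script below; dropout_out is a made-up name for the output tensor of the second Dropout layer, where the old Dense(1000) was attached):

dense_new = Dense(8, name='dense_new')(dropout_out)                     # new 8-class layer
prediction_new = Activation('softmax', name='softmax_new')(dense_new)   # new softmax on top
model = Model(input=inputs, output=prediction_new)                      # model rebuilt around the new output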
But when I run it, I get the following error:

Error when checking model target: expected softmax to have shape (None, 1000) but got array with shape (32, 8)
I guess the 32 is the batch size from the generator, but I don't understand why the softmax still expects 1000 dimensions from the previous layer.
Can someone help me? I think it has something to do with the output parameter of Model, but that is only semi-wild guessing after trying things and googling around. Thanks!
Code:
import ...
pp = os.path.dirname(os.path.abspath(__file__))
##### Define Model #####
inputs = Input(shape=(3,227,227))
conv_1 = Convolution2D(96, 11, 11, subsample=(4,4), activation='relu', name='conv_1')(inputs)
...
...
...
dense_1 = MaxPooling2D((3, 3), strides=(2,2), name="convpool_5")(conv_5)
dense_1 = Flatten(name="flatten")(dense_1)
dense_1 = Dense(4096, activation='relu', name='dense_1')(dense_1)
dense_2 = Dropout(0.5)(dense_1)
dense_2 = Dense(4096, activation='relu', name='dense_2')(dense_2)
dense_3 = Dropout(0.5)(dense_2)
dense_3 = Dense(1000, name='dense_3')(dense_3)
prediction = Activation("softmax",name="softmax")(dense_3)
model = Model(input=inputs, output=prediction)
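# freeze the first 27 layers so the pretrained weights stay fixed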
for layer in model.layers[:27]:
    print layer.name
    layer.trainable = False
model.load_weights(pp+"/weights/alexnet_weights.h5")
print model.output_shape
print model.layers[-1]
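# pop the last two layers (the softmax Activation and Dense(1000)) off the layer list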
model.layers.pop()
print model.output_shape
model.layers.pop()
print model.layers[-1]
print model.output_shape
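# append a new 8-way softmax layer to the layer list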
model.layers.append(Dense(8, activation='softmax',name='dense_4'))
print model.layers[-1]
##### Get Data #####
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    pp+'/dataset/training',
    target_size=(227,227),
    class_mode='categorical')
validation_generator = test_datagen.flow_from_directory(
    pp+'/dataset/test',
    target_size=(227,227),
    class_mode='categorical')
##### Compile and Fit ####
sgd = SGD(lr=1e-4, decay=1e-6, momentum=0.9, nesterov=True)
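# compile with mse loss on the one-hot targets coming from the generators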
model.compile(optimizer=sgd, loss='mse')
model.fit_generator(
    train_generator,
    samples_per_epoch=500,
    nb_epoch=5,
    validation_data=validation_generator,
    nb_val_samples=150)
model.save_weights('first_try.h5')