
I'm trying to adjust the text classifier neural net in this Keras/TensorFlow tutorial to output multiple categories (more than 2). I think I can change the output layer to use a 'softmax' activation, but I'm not sure how to adjust the input layer.

Tutorial Link: https://www.tensorflow.org/tutorials/keras/basic_text_classification

The tutorial uses movie review data with only two categories, positive or negative, so the model's output layer is a single unit with a 'sigmoid' activation.

I have 16 categories represented using one-hot encoding.

Tutorial Example:

vocab_size = 10000
model = keras.Sequential()
model.add(keras.layers.Embedding(vocab_size, 16))
model.add(keras.layers.GlobalAveragePooling1D())
model.add(keras.layers.Dense(16, activation=tf.nn.relu))
model.add(keras.layers.Dense(1, activation=tf.nn.sigmoid))

My Attempt:

model.add(keras.layers.Embedding(10000, 16))
model.add(keras.layers.GlobalAveragePooling1D())
model.add(keras.layers.Dense(16, activation=tf.nn.relu))
model.add(keras.layers.Dense(16, activation='softmax'))


model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['acc'])

history = model.fit(x_train[:5000],
                y_train[:5000],
                epochs=1,
                batch_size=256,
                validation_data=(x_train[5000:], y_train[5000:]),
                verbose=1)

Error: ValueError: A target array with shape (5000, 1) was passed for an output of shape (None, 16) while using as loss binary_crossentropy. This loss expects targets to have the same shape as the output.

Data Shapes:

x_train[:5000] (5000, 2000)
y_train[:5000] (5000,16)
x_train[5000:] (1934, 2000)
y_train[5000:] (1934,16)

Model Summary:

Model: "sequential_16"

Layer (type)                 Output Shape              Param #   
embedding_15 (Embedding)     (None, None, 16)          160000    
global_average_pooling1d_15  (None, 16)                0         
dense_30 (Dense)             (None, 16)                272       
dense_31 (Dense)             (None, 16)                272

Total params: 160,544

Trainable params: 160,544

Non-trainable params: 0

1 Answer


binary_crossentropy is for binary classification; what you're looking for is categorical_crossentropy. binary_crossentropy expects your y matrix to have shape (n_samples, 1), with values of 0 or 1. categorical_crossentropy expects your y matrix to have shape (n_samples, n_categories), with the correct category labeled 1 and all other categories labeled 0. It sounds like your one-hot encoding is already correct, so you probably just need to change the loss function.

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
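As a sanity check on the shapes, here is a small numpy sketch of what categorical_crossentropy computes over one-hot targets and softmax outputs (the sample values and 4-class setup are toy numbers for illustration, not from your data):

```python
import numpy as np

# One-hot targets: shape (n_samples, n_categories), exactly what
# categorical_crossentropy expects. Here, 3 samples over 4 classes.
y_true = np.array([
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
], dtype=float)

# Softmax outputs from the model: one probability per class,
# each row summing to 1 -- same shape as y_true.
y_pred = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.2, 0.5, 0.2, 0.1],
])

# Categorical cross-entropy: -sum over classes of y_true * log(y_pred),
# averaged over samples. Only the probability assigned to the true
# class contributes, since the other one-hot entries are 0.
cce = -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
print(round(cce, 4))  # 0.4688
```

Because the targets and predictions share the (n_samples, n_categories) shape, the element-wise product works directly; with binary_crossentropy and a (n_samples, 1) target, that product against a (None, 16) output is exactly the shape mismatch in your ValueError.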