3 votes

I am trying to use Keras/TF 2.3.0 to do multilabel classification, where I have 50 features per record and am classifying across five classes. I am getting the following warning, although the model still trains, which confuses me.


>>> model.fit(train_dataset, epochs=5, validation_data=val_dataset)

Epoch 1/5
WARNING:tensorflow:Model was constructed with shape (128, 1, 50) for input Tensor("input_1:0", shape=(128, 1, 50), dtype=float32), but it was called on an input with incompatible shape (None, 50).

WARNING:tensorflow:Model was constructed with shape (128, 1, 50) for input Tensor("input_1:0", shape=(128, 1, 50), dtype=float32), but it was called on an input with incompatible shape (None, 50).

 1/59 [..............................] - ETA: 0s - loss: 0.6996
WARNING:tensorflow:Model was constructed with shape (128, 1, 50) for input Tensor("input_1:0", shape=(128, 1, 50), dtype=float32), but it was called on an input with incompatible shape (None, 50).
59/59 [==============================] - 0s 2ms/step - loss: 0.6941 - val_loss: 0.6935

Epoch 2/5
59/59 [==============================]...

My full code, with random data to reproduce the warning, is below. What am I messing up in my NN architecture (or perhaps in my dfs_to_tfds function?) so that it accepts input records with num_vars features and outputs values distributed among num_classes classes in TF?

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import Input, Dense, Flatten, Conv1D, AveragePooling1D
from tensorflow.keras.models import Model
import tensorflow as tf

# setup example input data and labels
num_rows = 10_000
num_vars = 50
num_classes = 5
data = np.random.rand(num_rows, num_vars)
labels = np.random.rand(num_rows, num_classes)

# convert input data to TF.data datasets
bs = 128
def dfs_to_tfds(features, targets, bs):
  return tf.data.Dataset.from_tensor_slices((features, targets)).batch(bs)

X_train, X_val, y_train, y_val = train_test_split(data, labels)

train_dataset = dfs_to_tfds(X_train, y_train, bs)
val_dataset = dfs_to_tfds(X_val, y_val, bs)

# setup model
inputs = Input(shape = (1, num_vars), batch_size=bs)
h = Dense(units=32, activation='relu')(inputs)
h = Dense(units=32, activation='relu')(h)
h = Dense(units=32, activation='relu')(h)
outputs = Dense(units=num_classes, activation='sigmoid')(h)

model = Model(inputs=inputs, outputs=outputs)

model.compile(optimizer='rmsprop', 
              loss=['binary_crossentropy'], #tf.keras.losses.MSLE
              metrics=None, 
              loss_weights=None, 
              run_eagerly=None)

# train model
model.fit(train_dataset, epochs=5, validation_data=val_dataset)
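
For what it's worth, the dataset itself reports elements of shape (None, 50) rather than the (128, 1, 50) I declared in the Input layer, so I assume the mismatch is somewhere in my model definition:

print(train_dataset.element_spec)
# (TensorSpec(shape=(None, 50), dtype=tf.float64, name=None),
#  TensorSpec(shape=(None, 5), dtype=tf.float64, name=None))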

1 Answer

1 vote

Use

inputs = Input(shape=(num_vars,))

so the model expects flat feature vectors of length num_vars. The warning appears because the Input layer was declared with shape (128, 1, 50) while your batched dataset actually yields tensors of shape (None, 50).

You also don't need to hard-code the batch size into the Input layer. Your dfs_to_tfds function already groups the data into sub-batches with .batch(bs), so model.fit pulls one batch of bs records at a time straight from the dataset:

model.fit(train_dataset, epochs=5, validation_data=val_dataset)

(If you were passing plain NumPy arrays rather than a pre-batched tf.data.Dataset, you would instead set the batch size at fit time with model.fit(..., batch_size=bs).)
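
Putting it together, here is a minimal sketch of the corrected setup, reusing the variable names (num_vars, num_classes, bs) and the datasets from your question:

from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# One record is a flat vector of num_vars features; the batch dimension
# is supplied by the already-batched tf.data.Dataset.
inputs = Input(shape=(num_vars,))
h = Dense(units=32, activation='relu')(inputs)
h = Dense(units=32, activation='relu')(h)
h = Dense(units=32, activation='relu')(h)
outputs = Dense(units=num_classes, activation='sigmoid')(h)  # (None, num_classes)

model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='rmsprop', loss='binary_crossentropy')

# No batch_size argument here: the dataset already yields batches of bs records.
model.fit(train_dataset, epochs=5, validation_data=val_dataset)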