Keras LSTM Multiclass Classification structure

Question

I am a beginner in machine learning and have been trying to use an LSTM to classify samples into 4 classes based on 12 features. I've followed quite a few tutorials but I'm still a bit confused. My dataset has 12 columns I want to use for training, including the label column, whose values correspond to the classes as follows:

0 = Class 1

1 = Class 2

2 = Class 3

3 = Class 4

and this is my code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
import time
# For LSTM model
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout
from keras.callbacks import EarlyStopping
from keras import optimizers

# Load dataset
# Raw string so the backslashes in the Windows path are not parsed as escape sequences
train = pd.read_csv(r"C:\Users\O\Documents\Datasets\FinalDataset2.csv")

train_proccessed = train.iloc[:, 1:13]

scaler = MinMaxScaler(feature_range = (0, 1))
train_scaled = scaler.fit_transform(train_proccessed)

features_set = []
labels = []
for i in range(1, 393763):
    features_set.append(train_scaled[i-1:i, 0])
    labels.append(train_scaled[i, 0])

features_set, labels = np.array(features_set), np.array(labels)

features_set = np.reshape(features_set, (features_set.shape[0], features_set.shape[1], 1))


# Initialize LSTM model
model = Sequential()

model.add(LSTM(512, return_sequences=True,  activation='tanh', input_shape=(features_set.shape[1], 1)))
model.add(Dropout(0.2))
model.add(Dense(4, activation='softmax'))
model.add(LSTM(units=1, activation='sigmoid'))
opt = optimizers.Adam(lr=0.0001)
model.compile(optimizer = opt , loss = 'categorical_crossentropy', metrics = ['accuracy'])

model.fit(features_set, labels, epochs = 100, batch_size = 512)

I am very unsure whether my model is built correctly. Moreover, it only yields very low accuracy (27-28%). Any help would be greatly appreciated!

I've also tried a lot of different hyperparameters, just thought that's worth mentioning — Omar Attia El Sayed Attia
I think its an issue with the way you derive the features_set and labels. Couldn't you just extract all the predictor variables and the target variables out from the dataframe? — yudhiesh
Like I said I'm pretty new to machine learning haha. How would I go about doing that and how would it help? — Omar Attia El Sayed Attia

Maged Maged · Accepted Answer · 2021-01-31T08:22:50

Short Answer:

The last layer should be Dense(4, activation='softmax'), so remove the LSTM(units=1, activation='sigmoid') layer that currently follows it.
Labels must be one-hot encoded, because you are using loss='categorical_crossentropy'.
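As a minimal sketch of what one-hot encoding means for the four classes in the question: each integer label becomes a vector of length 4 with a single 1 at the label's index. The label values below are made up for illustration. Keras also provides keras.utils.to_categorical for the same purpose; plain NumPy is used here only to keep the example dependency-light.

```python
import numpy as np

# Hypothetical integer class labels (0..3), standing in for the real label column
labels = np.array([0, 3, 1, 2, 1])
n_classes = 4

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing the identity matrix with the labels encodes all of them at once.
one_hot = np.eye(n_classes, dtype="float32")[labels]

# argmax along the class axis recovers the original integer labels
recovered = one_hot.argmax(axis=1)

print(one_hot)
print(recovered)
```

An array shaped like one_hot, (n_samples, 4), is what categorical_crossentropy expects as the target for a Dense(4, activation='softmax') output.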

Here are some more notes that may help:

1st Layer

LSTM(512, return_sequences=True,  activation='tanh')

You start with a very large number of LSTM units (512), while your data has only 12 columns.
return_sequences=True is not justified in your case, as you are not stacking another LSTM layer after it.

Model Body

There are no layers between the LSTM and the final Dense().
Add at least one Dense layer.

Output Layer

It could be easier to use sparse_categorical_crossentropy as the loss instead of categorical_crossentropy, so that the labels can be passed as integers; otherwise you need to one-hot encode them.
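Putting these notes together, one possible structure might look like the sketch below. It is not a tuned solution. The random arrays are placeholders for the real dataset, and the shapes (11 feature columns, one time step per sample), the layer sizes (32 and 16), and the default Adam settings are all assumptions made for illustration.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Input, LSTM, Dense, Dropout

# Placeholder data (assumption: 11 feature columns + 1 integer label column,
# with each sample treated as a sequence of a single time step).
n_samples, timesteps, n_features, n_classes = 256, 1, 11, 4
rng = np.random.default_rng(0)
X = rng.random((n_samples, timesteps, n_features)).astype("float32")
y = rng.integers(0, n_classes, size=n_samples)  # integer labels 0..3

model = Sequential([
    Input(shape=(timesteps, n_features)),
    LSTM(32),                                # small layer; returns only the last output
    Dropout(0.2),
    Dense(16, activation="relu"),            # intermediate Dense layer
    Dense(n_classes, activation="softmax"),  # final layer: one probability per class
])

# The sparse loss accepts integer labels directly, so no one-hot encoding is needed
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(X, y, epochs=1, batch_size=64, verbose=0)
probs = model.predict(X[:5], verbose=0)      # shape (5, 4), each row sums to 1
```

With real data, X would be built from the scaled feature columns and y from the label column itself, left unscaled so that it keeps the integer values 0 to 3.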
