I have been developing feedforward neural networks (FNNs) and recurrent neural networks (RNNs) in Keras with structured data of shape [instances, time, features], and the performance of the FNNs and RNNs has been the same (except that the RNNs require more computation time).
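For concreteness, here is a minimal sketch of the array layout I mean (the shapes and values are arbitrary):

import numpy as np

# 1000 instances, 10 time steps, 2 features per time step.
X_rnn = np.zeros((1000, 10, 2))        # RNN input: [instances, time, features]
X_fnn = X_rnn.reshape(1000, 10 * 2)    # FNN input: the same data, flattened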
I have also simulated tabular data (code below) where I expected an RNN to outperform an FNN, because the next value in the series depends on the previous value in the series; however, both architectures predict correctly.
With NLP data, I have seen RNNs outperform FNNs, but not with tabular data. Generally, when would one expect an RNN to outperform an FNN with tabular data? Specifically, could someone post simulation code with tabular data demonstrating an RNN outperforming an FNN?
Thank you! If my simulation code is not ideal for my question, please adapt it or share a better one!
from keras import models
from keras import layers
import numpy as np
import matplotlib.pyplot as plt
Two features were simulated over 10 time steps, where the value of the second feature depends on the values of both features at the prior time step: feature2[t] = 0.5 * feature1[t-1] + 0.5 * feature2[t-1].
## Simulate data.
np.random.seed(20180825)
# Initial time step: both features start from the same random draw in [0.50, 0.69].
X = np.random.randint(50, 70, size = (11000, 1)) / 100
X = np.concatenate((X, X), axis = 1)
for i in range(10):
    # Feature 1 is a fresh random draw; feature 2 is a weighted sum of both
    # features at the previous time step.
    X_next = np.random.randint(50, 70, size = (11000, 1)) / 100
    X = np.concatenate((X, X_next,
                        (0.50 * X[:, -1].reshape(len(X), 1)) +
                        (0.50 * X[:, -2].reshape(len(X), 1))), axis = 1)
print(X.shape)
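# Expected: (11000, 22), i.e. 11 time steps x 2 features.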
## Training and validation data.
split = 10000
# The target is the final value of feature 2 (the last column); the inputs are
# the 10 prior time steps (all but the last two columns).
Y_train = X[:split, -1:].reshape(split, 1)
Y_valid = X[split:, -1:].reshape(len(X) - split, 1)
X_train = X[:split, :-2]
X_valid = X[split:, :-2]
print(X_train.shape)
print(Y_train.shape)
print(X_valid.shape)
print(Y_valid.shape)
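# Expected shapes: (10000, 20), (10000, 1), (1000, 20), (1000, 1).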
FNN:
## FNN model.
# Define model.
network_fnn = models.Sequential()
network_fnn.add(layers.Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))
network_fnn.add(layers.Dense(1, activation = None))
# Compile model.
network_fnn.compile(optimizer = 'adam', loss = 'mean_squared_error')
# Fit model.
history_fnn = network_fnn.fit(X_train, Y_train, epochs = 10, batch_size = 32, verbose = False,
validation_data = (X_valid, Y_valid))
# Predicted vs. actual on the training data.
plt.scatter(Y_train, network_fnn.predict(X_train), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.show()
# Predicted vs. actual on the validation data.
plt.scatter(Y_valid, network_fnn.predict(X_valid), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.show()
LSTM:
## LSTM model.
# Reshape the flattened [instances, time * features] matrix into the 3-D
# [instances, time, features] array that the LSTM layer expects.
X_lstm_train = X_train.reshape(X_train.shape[0], X_train.shape[1] // 2, 2)
X_lstm_valid = X_valid.reshape(X_valid.shape[0], X_valid.shape[1] // 2, 2)
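# Optional sanity check: row-major reshaping keeps each (feature 1, feature 2)
# pair together within a time step.
assert np.allclose(X_lstm_train[0, 0, :], X_train[0, :2])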
# Define model.
network_lstm = models.Sequential()
network_lstm.add(layers.LSTM(64, activation = 'relu', input_shape = (X_lstm_train.shape[1], 2)))
network_lstm.add(layers.Dense(1, activation = None))
# Compile model.
network_lstm.compile(optimizer = 'adam', loss = 'mean_squared_error')
# Fit model.
history_lstm = network_lstm.fit(X_lstm_train, Y_train, epochs = 10, batch_size = 32, verbose = False,
validation_data = (X_lstm_valid, Y_valid))
# Predicted vs. actual on the training data.
plt.scatter(Y_train, network_lstm.predict(X_lstm_train), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.show()
# Predicted vs. actual on the validation data.
plt.scatter(Y_valid, network_lstm.predict(X_lstm_valid), alpha = 0.1)
plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.show()
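Besides the scatterplots, the final validation losses of the two models can be compared directly (a small addition that assumes both fits above have run):

# Final-epoch validation MSE for each model; consistent with the plots,
# the two values come out essentially the same on this data.
print('FNN validation MSE: ', history_fnn.history['val_loss'][-1])
print('LSTM validation MSE:', history_lstm.history['val_loss'][-1])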