I am exploring deep learning methods especially LSTM to predict next word. Suppose, My data set is like this: Each data point consists of 7 features (7 different words)(A-G here) of different length.
Group1 Group2............ Group 38
A B F
E C A
B E G
C D G
C F F
D G G
. . .
. . .
I used one hot encoding as an Input layer. Here is the model
main_input= Input(shape=(None,action_count),name='main_input')
lstm_out= LSTM(units=64,activation='tanh')(main_input)
lstm_out=Dropout(0.2)(lstm_out)
lstm_out=Dense(action_count)(lstm_out)
main_output=Activation('softmax')(lstm_out)
model=Model(inputs=[main_input],outputs=main_output)
print(model.summary())
Using this model. I got an accuracy of about 60%. My question is how can I use embedding layer for my problem. Actually, I do not know much about embedding (why, when and how it works)[I only know one hot vector does not carry much information]. I am wondering if embedding can improve accuracy. If someone can provide me guidance in these regards, it will be greatly beneficial for me. (At least whether uses of embedding is logical or not for my case)