I have sequence data that tells me what color was observed for multiple subjects at different points in time. For example:
ID | Time | Color |
---|---|---|
A | 1 | Blue |
A | 2 | Red |
A | 5 | Red |
B | 3 | Blue |
B | 6 | Green |
C | 1 | Red |
C | 3 | Orange |
I want to predict the most likely color at each of the next 3 time steps, along with the probability of that color appearing. For example, for ID A, I'd like to know the next 3 items (time, color) in the sequence, together with the probability of each predicted color.
I understand that LSTMs are often used to predict this type of sequential data, and that I would feed in a 3d array like
input =[
[[1,1], [2,2], [5,2]], #blue at t=1, red at t=2, red at t=5 for ID A
[[0,0], [3,1], [6,3]], #nothing for first entry, blue at t=3, green at t=6 for ID B
[[0,0], [1,2], [3,4]] #nothing for first entry, red at t=1, orange at t=3 for ID C
]
after mapping the colors to numbers (Blue -> 1, Red -> 2, Green -> 3, Orange -> 4, etc.). My understanding is that, by default, the LSTM just predicts the next item in each sequence, so for example
output = [
[[7, 2]], #next item is most likely red at t=7
[[9, 2]], #next item is most likely red at t=9
[[6, 2]] #next item is most likely red at t=6
]
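To make the encoding concrete, here is a minimal sketch of how the table could be turned into the padded input array shown earlier. The names `COLOR_TO_ID`, `rows`, and `encode` are my own for illustration, and reserving 0 for padding is an assumption on my part.

```python
# Hypothetical mapping from color names to integer IDs; 0 is reserved for padding.
COLOR_TO_ID = {"Blue": 1, "Red": 2, "Green": 3, "Orange": 4}

# The observations from the table, as (ID, time, color) tuples.
rows = [
    ("A", 1, "Blue"), ("A", 2, "Red"), ("A", 5, "Red"),
    ("B", 3, "Blue"), ("B", 6, "Green"),
    ("C", 1, "Red"), ("C", 3, "Orange"),
]


def encode(rows, color_to_id):
    """Group rows by ID and left-pad each sequence with [0, 0] to equal length."""
    sequences = {}
    for subject, time, color in rows:
        sequences.setdefault(subject, []).append([time, color_to_id[color]])

    max_len = max(len(seq) for seq in sequences.values())
    padded = []
    for subject in sorted(sequences):
        seq = sorted(sequences[subject])  # order each subject's items by time
        padding = [[0, 0] for _ in range(max_len - len(seq))]
        padded.append(padding + seq)
    return padded


encoded = encode(rows, COLOR_TO_ID)
print(encoded)
```

The result is a list of shape (subjects, timesteps, features) = (3, 3, 2), which is the 3D layout I understand an LSTM expects.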
Is it possible to modify the output of my LSTM so that, instead of predicting only the next occurrence time and color, I get the next 3 times, colors AND the probabilities of those colors appearing? For example, an output like
output = [
[[7, 2, 0.93], [8,2, 0.79], [10,4, 0.67]],
[[9, 2, 0.88], [11,3, 0.70], [14,3, 0.43]],
...
]
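To clarify what I mean by "probability": I am assuming the model could emit, for each of the next 3 steps, a predicted time plus a probability distribution over the colors (for instance from a softmax layer, which is my guess at the architecture, not something I have working). The triples above would then be the most likely color per step and its probability. A sketch of that post-processing, with made-up numbers and a hypothetical helper `top_predictions`:

```python
def top_predictions(times, step_probs):
    """Pair each predicted time with the most likely color ID and its probability.

    step_probs[i][k] is the assumed probability of color ID k + 1 at future step i.
    """
    result = []
    for t, probs in zip(times, step_probs):
        best = max(range(len(probs)), key=lambda k: probs[k])
        result.append([t, best + 1, probs[best]])  # color IDs start at 1
    return result


# Made-up values for ID A, chosen only to reproduce the desired output format.
times = [7, 8, 10]
step_probs = [
    [0.03, 0.93, 0.02, 0.02],  # step 1: Red (ID 2) is most likely
    [0.10, 0.79, 0.06, 0.05],  # step 2: Red (ID 2) is most likely
    [0.08, 0.15, 0.10, 0.67],  # step 3: Orange (ID 4) is most likely
]

print(top_predictions(times, step_probs))
```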
I've tried looking in the Sequential documentation for Keras, but I'm not sure if I've found anything.
Furthermore, I see that there's a TrainX and TrainY typically used for model.fit(), but I'm also not sure what my TrainY would be here.
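To make the TrainX/TrainY question concrete, here is one split I can imagine, though I don't know whether it is the right one: slide a window over each subject's history, use the first `n_in` items as input, and use the following `n_out` items as the target. The helper `make_windows` and the longer `history` below are hypothetical, invented only for illustration.

```python
def make_windows(sequence, n_in, n_out):
    """Return (inputs, targets) built by sliding a window over one sequence."""
    xs, ys = [], []
    for start in range(len(sequence) - n_in - n_out + 1):
        xs.append(sequence[start:start + n_in])
        ys.append(sequence[start + n_in:start + n_in + n_out])
    return xs, ys


# Hypothetical longer history of [time, color_id] pairs for a single subject.
history = [[1, 1], [2, 2], [5, 2], [7, 2], [8, 2], [10, 4], [12, 1]]

train_x, train_y = make_windows(history, n_in=3, n_out=3)
for x, y in zip(train_x, train_y):
    print(x, "->", y)
```

With this split, each TrainY entry would hold the 3 items that follow the corresponding TrainX window.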
Sequential is unrelated to sequences; it is just an interface to stack layers (a better name would have been Model). – runDOSrun