This is a conceptual question about working with time series of varying lengths in a deep learning context:
I have observations of standardized features that arrive at irregular intervals; every individual measurement includes a time-based feature. I flatten each multivariate time series (panel data) into a single continuous feature vector, and then train a deep neural network for a binary classification task on these vectors, which now look like this:
xxxx(T=2)xxxx(T=4)xxxx(T=5)
xxxx(T=1)xxxx(T=2)
xxxx(T=3)
xxxx(T=1)xxxx(T=2)xxxx(T=3)xxxx(T=5)
These vectors are then end-padded with zeros to a common length.
Each "xxxxT" represents an observation where "x"'s are non-temporal features and "T" is a time based feature. My question is whether it can be assumed that the neural network will be able to discriminate the irregular nature of this time series on its own?
Or should I instead zero-pad the time steps at which no observation occurred, so the series look something like this (where "0000" represents a padded, missing observation)?
0000(T=1)xxxx(T=2)0000(T=3)xxxx(T=4)xxxx(T=5)
xxxx(T=1)xxxx(T=2)0000(T=3)0000(T=4)0000(T=5)
0000(T=1)0000(T=2)xxxx(T=3)0000(T=4)0000(T=5)
xxxx(T=1)xxxx(T=2)xxxx(T=3)0000(T=4)xxxx(T=5)
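Again a minimal sketch, under the same made-up data layout as above, of this alternative: every series is aligned on a regular grid T = 1..5 and the slots with no observation are zero-filled (the time feature is written into every slot, as in the diagram above):

```python
import numpy as np

series = [
    [(0.1, 0.5, -0.3, 0.9, 2), (0.2, 0.1, 0.4, -0.7, 4), (0.0, 0.3, 0.8, 0.2, 5)],
    [(0.4, -0.2, 0.6, 0.1, 1), (0.9, 0.0, -0.5, 0.3, 2)],
    [(0.7, 0.2, 0.1, -0.1, 3)],
]

n_x = 4                                   # non-temporal features per observation
t_max = 5                                 # length of the common time grid
aligned = np.zeros((len(series), t_max * (n_x + 1)))

for i, s in enumerate(series):
    # Write the time feature into every slot, matching the 0000(T=t) picture.
    aligned[i, n_x::n_x + 1] = np.arange(1, t_max + 1)
    for obs in s:
        *x, t = obs                       # split features from the time stamp
        slot = (int(t) - 1) * (n_x + 1)   # offset of time step t in the vector
        aligned[i, slot:slot + n_x] = x   # fill in the observed features

print(aligned.shape)                      # (3, 25)
```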
I have actually done this already and compared the results of both approaches. I just wanted to see if anyone could shed some light on how a deep neural network "interprets" these two representations.