As indicated in the title, I am wondering whether Dynamic Time Warping (DTW) can be used to calculate the distance between two time series with missing values.
Let's say the two time series are daily temperatures from two weather stations, are of equal length (e.g. 365 days), and have missing values on different days.
If this is possible, is the dtw package in R able to handle the missing values? I didn't find a parameter that could be set in dtw(), like na.rm = TRUE.
Thanks a lot!
Thanks, thelatemail, for the suggestion. Below is a simplified example of the two time series, where each time series contains only 52 elements and the missing values are set to NA.
TS1 = c(-3.26433, -5.09096, NA, -8.4158, -5.85485, -3.49234, -7.64666, -4.90124, NA, -4.68836, -1.38114, 1.55527, 2.81872, 2.44261, 3.57963, 6.19983, 7.42515, 8.41524, 6.32686, 10.0144, 9.53251, 13.4781, 12.3585, 10.6706, 10.2647, 16.6848, 16.4855, 20.1482, NA, 21.5734, 20.3946, 20.8824, 18.0325, 18.5813, 17.5453, 16.3315, 14.3068, 11.3164, 9.96398, 5.53102, 9.55094, 9.05897, 6.81199, 5.20343, 1.63158, -0.661077, -4.33853, -6.53655, NA, -10.8646, 1.11843, 1.23786)
TS2 = c(-5.76852, -10.2207, -11.8465, NA, -1.70019, -3.60319, -5.7718, -3.81106, -5.62284, -3.57516, 0.314511, 0.64058, 0.476162, NA, 4.23757, 5.15417, 7.29422, NA, 1.57376, 9.28236, 8.05182, 13.7175, 9.5453, 10.2417, 9.32423, 18.214, 18.3726, 16.661, 20.6563, 22.2901, 22.1109, 19.129, 15.8615, 16.7817, 17.247, 15.9921, 14.5804, 11.3693, 10.9349, 10.1196, 3.7467, 9.09229, 6.91285, NA, 4.20934, -0.566403, -2.94184, -3.81432, -10.0212, -15.9876, -2.56286, -1.88976)
dtw program? I know there are a number of imputation methods, but even something as simple as TS2[is.na(TS2)] <- sapply(which(is.na(TS2)), function(x) mean(c(TS2[x-1], TS2[x+1]))) could work alright. – thelatemail
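A minimal sketch of the imputation idea from the comment, under a few assumptions: the helper name fill_na_linear is hypothetical (it is not part of the dtw package), the vectors are shortened excerpts of the series above, and linear interpolation is only one of several possible imputation choices. For an isolated NA it gives the same value as the neighbour mean in the comment, but it should also cope with consecutive NAs and NAs at either end, where the one-liner would return NA or fail.

```r
# Hypothetical helper (not part of the dtw package): fill NA values by
# linear interpolation between the nearest observed neighbours.
fill_na_linear <- function(x) {
  ok <- !is.na(x)
  # approx() needs at least two observed points to interpolate.
  stopifnot(sum(ok) >= 2)
  # rule = 2 repeats the nearest observed value for leading/trailing NAs.
  approx(which(ok), x[ok], xout = seq_along(x), rule = 2)$y
}

# Shortened excerpts of the series above, for illustration only.
ts1 <- c(-3.26433, -5.09096, NA, -8.4158, -5.85485, -3.49234)
ts2 <- c(-5.76852, -10.2207, -11.8465, NA, -1.70019, -3.60319)

ts1_filled <- fill_na_linear(ts1)
ts2_filled <- fill_na_linear(ts2)

# Only run the alignment if the dtw package is installed.
if (requireNamespace("dtw", quietly = TRUE)) {
  alignment <- dtw::dtw(ts1_filled, ts2_filled)
  print(alignment$distance)
}
```

Whether interpolated temperatures are acceptable depends on how long the gaps are. With only a few isolated missing days the effect on the DTW distance is probably small, but that is worth checking on the real data.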