I have time-series data for 12 consumers. The data corresponding to the 12 consumers (named a ... l) is:
I want to cluster these consumers to find out which of them have the most similar consumption behavior. For this I found the clustering method pamk, which automatically estimates the number of clusters in the input data.
I assume that I have only two options for calculating the distance between any two time series: Euclidean distance and DTW (dynamic time warping). I tried both and got different clusters. Which one should I rely on, and why?
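To make the difference between the two distance measures concrete, here is a minimal Python sketch (not R, and not the code behind the results in this question). The functions `euclidean` and `dtw` and the three toy profiles `early`, `late` and `flat` are hypothetical names made up for illustration. The DTW shown is the plain unconstrained variant with absolute difference as the local cost, which may differ from the defaults of whichever DTW package is actually used.

```python
import math


def euclidean(a, b):
    """Point-by-point distance; both series must have the same length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs equal-length series")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained dynamic time warping with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Hypothetical consumption profiles: the same peak, shifted by one time step,
# and a flat profile with no peak at all.
early = [0, 0, 5, 0, 0, 0]
late = [0, 0, 0, 5, 0, 0]
flat = [1, 1, 1, 1, 1, 1]

print("Euclidean:", euclidean(early, late), euclidean(early, flat))
print("DTW:      ", dtw(early, late), dtw(early, flat))
```

With these toy profiles, Euclidean distance puts `early` closer to `flat` than to `late`, while DTW aligns the two shifted peaks and rates `early` and `late` as identical. The two measures therefore answer different questions: Euclidean compares consumption at the same time step, whereas DTW compares shape while tolerating shifts in time.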
When I use Euclidean distance, I get the following clusters:
Conclusion: How would you decide which clustering approach is best in this case?
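One possible way to put a number on "best" is an internal validity index such as the average silhouette width, computed once per distance matrix with the cluster labels obtained under that distance. The sketch below is a minimal pure-Python version and rests on assumptions: `mean_silhouette`, `points`, `good` and `bad` are hypothetical names, the four points stand in for real consumers, and the distance matrix would in practice be the Euclidean or DTW matrix. A silhouette score is only one piece of evidence, since each distance measure encodes a different notion of similarity.

```python
def mean_silhouette(dist, labels):
    """Average silhouette width from a square distance matrix and labels."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(labels)


# Toy stand-in for a distance matrix: four points on a line.
points = [0, 1, 10, 11]
dist = [[abs(p - q) for q in points] for p in points]

good = [0, 0, 1, 1]  # groups neighbours together
bad = [0, 1, 0, 1]   # splits neighbours apart

print(mean_silhouette(dist, good), mean_silhouette(dist, bad))
```

Values near 1 indicate compact, well-separated clusters; values near or below 0 indicate that many items sit closer to another cluster than to their own. Here the `good` labelling scores close to 0.9 and the `bad` one scores negative.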
Note: I have also asked the same question on Cross Validated.
2.1k questions related to cluster-analysis, whereas on Cross Validated it is only 1.6k. – Haroon Rashid