3
votes

I'm comparing hourly data measurements recorded for 5 years (2007-2011) where the number of measurements in each year is as follows:

2007 = 8760 measurements;
2008 = 8784 measurements; <-- leap year
2009 = 8760 measurements;
2010 = 8760 measurements;
2011 = 8760 measurements;

What is the best method for comparing the time series against each other? Is it better to add an additional 24 measurements (of NaNs) for February 29th to the non-leap years? Or is it more efficient to interpolate the data onto the same time frame (where time is given in decimal day of year)?

1
What comparison are you trying to make? The data sets don't correspond, so you probably aren't going to be comparing like with like. - walkytalky

1 Answer

2
votes

That depends entirely on the kind of data you are measuring. If it's natural-world stuff like weather data, you probably care more about matching solstice to solstice and equinox to equinox. If it's financial market data, you may want to line up calendars and possibly exclude the leap day entirely.
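For the calendar-alignment case, here is a minimal Python sketch of the two options: dropping the leap day from 2008, or padding the other years with 24 NaNs as the question suggests. The function names `drop_leap_day` and `pad_leap_day` are made up for illustration, and the code assumes each year's series is a gap-free list of hourly values starting at 00:00 on January 1st.

```python
import calendar
import math

HOURS_PER_DAY = 24
# January (31 days) + February 1-28 (28 days) = 59 full days before Feb 29.
LEAP_DAY_START = 59 * HOURS_PER_DAY  # hour index 1416


def drop_leap_day(values, year):
    """Return the hourly values on a 365-day calendar (8760 points)."""
    values = list(values)
    if calendar.isleap(year):
        # Cut out the 24 hours belonging to February 29th.
        return values[:LEAP_DAY_START] + values[LEAP_DAY_START + HOURS_PER_DAY:]
    return values


def pad_leap_day(values, year):
    """Return the hourly values on a 366-day calendar (8784 points).

    Non-leap years get 24 NaNs where February 29th would be.
    """
    values = list(values)
    if calendar.isleap(year):
        return values
    gap = [math.nan] * HOURS_PER_DAY
    return values[:LEAP_DAY_START] + gap + values[LEAP_DAY_START:]
```

Either way, every year ends up the same length, so index `i` refers to the same calendar hour in each series. Padding keeps all of the 2008 data but means any statistic you compute has to ignore NaNs; dropping discards one day of real measurements.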

It's difficult to give more specific advice without more background.