0
votes

I tried to google a question like this: How to transform weekly data for time series analysis in Pandas?

This question is hard to search without results that talk straight about re-sampling data from daily to weekly or something along those lines.

My question is really more to do with source data already in the form of weekly numerical data, but no time or date data like a datetime stamp.

Here is the form: (Please use the vertical bars for logical alignment of each row.)

Unique_Entity(string) | WK1(float64) | WK2(float64) | WK3(float64)| ...
UE1 | 123 | 234 | 345 | ...
UE2 | 456 | 567 | 678 | ...
UE3 | 789 | 890 | 901 | ...
... | ... | ... | ... | ...

Also WK1 is a "dynamic" description to indicate the numerical data is last week, WK2 is two weeks ago, WK3 is three weeks ago, and so on. So next week WK1's data will shift to WK2 and new data will be added to WK1. Hope that makes sense from my description.

With this being the source data format, I'd like to analyze this live data using time series tools provided by pandas and other python modules. A lot of them use an explicit date column to get their claws in for the rest of the analysis.

Wrap-Up Question: How do I transform or prepare my source data so that these tools can be easily used? (Apart from my naive solution below.)

Naive Solution: I could tag the date of the Monday (Or Friday) every week going backwards. (A function that uses today's date to then generate the dates of every Monday (Or Friday) going back.) Then I could point those time series tools to use those dates and re-sample as weeks.

This is assuming I've un-pivoted the horizontal headers so that WK1 will join with last Monday's (Or Friday's) date and so forth.

1
Why not create a sep df,store it in SQL and have a date to week lookup,then all you need to do is merge your df after you've melted/transformed it - Umar.H
Yes, I could and and probably would do something to that effect as part of the "Naive Solution" I posted. I forgot to mention that I already have a dynamic SQL View that tags each {WK1, WK2, ...} column header to a {2019, 2018, ...}, {Q1,Q2,...}, and {M1, M2, ...} set that I could easily also add a column that has a Monday (or Friday) date. But my question is whether that is the best way to do it or if there is a best practices for data of this form. Or is there a data parser that I can use on the incoming source data so it can seamlessly be used in Time Series modules, functions, or tools? - ssurendr

1 Answers

2
votes

Create a DatetimeIndex ending today, with 1 week period in reverse, and assign it to the columns:

df.columns = pd.date_range(end=datetime.date.today(), periods=len(df.columns),
                           freq='1W-MON')[::-1]

It gives:

     2019-06-10  2019-06-03  2019-05-27
UE1         123         234         345
UE2         456         567         678
UE3         789         890         901

Transpose the result if needed.