I'm running into trouble trying to replicate SQL window functions in R, in particular with relation to creating sum totals that specify the number of prior months I want to sum.
While the sqldf package in R allows for data manipulation, it doesn't seem to support window functions.
I have some mock data in R
set.seed(10)
data_1 <- data.table(Cust_ID = c(1,1,1,1,2,2,2,2,3,3,3,3),Month=c(4,3,2,1,4,3,2,1,4,3,2,1),
StatusCode=LETTERS[4:6],SalesValue=round(runif(12,50,1500)))
Cust_ID Month StatusCode SalesValue
1 4 D 786
1 3 E 495
1 2 F 669
1 1 D 1055
2 4 E 173
2 3 F 377
2 2 D 448
2 1 E 445
3 4 F 943
3 3 D 673
3 2 E 995
3 1 F 873
For each row, I would like to create a cumulative sum of values pertaining to the customer (Cust_ID), for the prior 2 months (not including the current month).
This would mean that for each customer, rows with Months 1 & 2 should be null (given there aren't 2 preceding months), Month 3 should contain summed SalesValue of Months 1 & 2 for that customer, and Month 4 should contain summed Sales Value for Month 2 & 3.
In SQL, I would use syntax similar to the following: SUM(SalesValue) OVER (PARTITION BY Cust_ID ORDER BY MONTH DESC ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING ) as PAST_3Y_SALES
Is there to achieve this in R - ideally using data.table (for efficiency)? Any guidance would be much appreciated.
PS Note: this is mock data, in my 'real' data customers have different data volumes - i.e. some customers have 5 months worth of data, others have >36 months worth of data, etc.