1
votes

I am willing to create a new variable called recency - how recent is the transaction of the customer - which is useful for RFM analysis. The definition is as follows: We observe transaction log of each customer weekly and assign dummy variable called "trans" if the customers made a transaction. Recency variable will equal to the number of the week if she made a transaction at that week, otherwise recency will be equal to the previous recency value. To make it more clear, I have also created a demo data.table for you.

demo<-data.table( cust=rep(c(1:3), 3))
demo[,week:=seq(1,3,1),by=cust]
demo[, trans:=c(1,1,1,0,1,0,1,1,0)]
demo[, rec:=c(1,1,1, 1,2,1,3,3,1)]

I need to calculate "rec" variable which I entered manually in demo data.table. Please also consider that, I can handle it with looping which takes a lot of time. Therefore, I would be grateful if you help me with data.table way. Thanks in advance.

1

1 Answers

1
votes

This works for the example:

demo[, v := cummax(week*trans), by=cust]

   cust week trans rec v
1:    1    1     1   1 1
2:    2    1     1   1 1
3:    3    1     1   1 1
4:    1    2     0   1 1
5:    2    2     1   2 2
6:    3    2     0   1 1
7:    1    3     1   3 3
8:    2    3     1   3 3
9:    3    3     0   1 1

We observe transaction log of each customer weekly and assign dummy variable called "trans" if the customers made a transaction. Recency variable will equal to the number of the week if she made a transaction at that week, otherwise recency will be equal to the previous recency value.

This means taking the cumulative max week, ignoring weeks where there is no transaction. Since weeks are positive numbers, we can treat the no-transaction weeks as zero.