Calculating stock's return by a variable interval in data.table

Question

Here is my data.table

structure(list(index = structure(c(17007, 17008, 17009, 17010, 
17011, 17014, 17015, 17016, 17017), tzone = "UTC", tclass = "Date", class = "Date"), 
    close = c(97.339996, 96.669998, 102.949997, 104.339996, 104.209999, 
    106.050003, 104.480003, 105.790001, 105.870003), daily_return = c(-0.688306993560994, 
    6.49632681279251, 1.35016905342893, -0.124589807344833, 1.76566933850562, 
    -1.48043371578217, 1.25382653367649, 0.0756234041438208, 
    1.5207329313101), group = c(0, 0, 3, 3, 3, 3, 3, 3, 3), qq = c(1, 
    0, 6, 5, 4, 3, 2, 1, 0)), class = c("data.table", "data.frame"
), row.names = c(NA, -9L), .internal.selfref = <pointer: 0x558d2914e9a0>, sorted = "index")

and it looks like as follows -

        index  close daily_return group qq
1: 2016-07-25  97.34   -0.6883070     0  1
2: 2016-07-26  96.67    6.4963268     0  0
3: 2016-07-27 102.95    1.3501691     3  6
4: 2016-07-28 104.34   -0.1245898     3  5
5: 2016-07-29 104.21    1.7656693     3  4
6: 2016-08-01 106.05   -1.4804337     3  3
7: 2016-08-02 104.48    1.2538265     3  2
8: 2016-08-03 105.79    0.0756234     3  1
9: 2016-08-04 105.87    1.5207329     3  0

In this data.table, I need to find the return (from column close) by the interval given in qq column for each group. E.g. in row 3, the look forwarding interval is 6 and the return is to be calculated between column 3 and column 9 (3 + 6 lookforward).

When the group changes, the return is the be calculated between the last element of the first group and the first element of the second group. e.g, as the group changes between row 2 and row3, the return is calculated between these two rows.

Following is the desired result -

        index  close daily_return group qq desired_result
1: 2016-07-25  97.34   -0.6883070     0  1          -0.68
2: 2016-07-26  96.67    6.4963268     0  0           6.49
3: 2016-07-27 102.95    1.3501691     3  6           2.83
4: 2016-07-28 104.34   -0.1245898     3  5           1.46
5: 2016-07-29 104.21    1.7656693     3  4           1.59
6: 2016-08-01 106.05   -1.4804337     3  3          -0.16
7: 2016-08-02 104.48    1.2538265     3  2           1.33
8: 2016-08-03 105.79    0.0756234     3  1           0.07
9: 2016-08-04 105.87    1.5207329     3  0             NA

I am sure there must be an easy way to accomplish this using data.table.

I have tried the following and it works only on specific rows as k is constant.

lead(Delt(close, k = 6, type = "arithmetic") * 100, n = 6)

Is there a way to pass the column qq dynamically in the k variable of Delt command? OR, is there any other solution to this problem?

Mini suggestion, more theoretic, perhaps take log return, since stock distribution is more lognormal (black-scholes) than normal, ergo with log you make your returns normal. — JacobJacox
Understood, since I am using arithmetic for the testing purpose only as it is easy to test. Thanks! — Saurabh
Why is the desired result 2.76 for row 3 instead of (105.87-102.95)/102.95=2.84? — chinsoon12
I have fixed the desired result column in the question above. Thanks for checking. — Saurabh

Uwe Uwe · Accepted Answer · 2019-12-31T13:06:32

Here is an approach which uses data.table syntax without looping:

idx2 <- close[, .I + ifelse(group == shift(group, -1L), qq, 1L)]
close[, return := (close[idx2] - close) / close * 100][]

        index  close daily_return group qq     return
1: 2016-07-25  97.34   -0.6883070     0  1 -0.6883070
2: 2016-07-26  96.67    6.4963268     0  0  6.4963268
3: 2016-07-27 102.95    1.3501691     3  6  2.8363342
4: 2016-07-28 104.34   -0.1245898     3  5  1.4663667
5: 2016-07-29 104.21    1.7656693     3  4  1.5929412
6: 2016-08-01 106.05   -1.4804337     3  3 -0.1697313
7: 2016-08-02 104.48    1.2538265     3  2  1.3303981
8: 2016-08-03 105.79    0.0756234     3  1  0.0756234
9: 2016-08-04 105.87    1.5207329     3  0         NA

idx2 contains the row indices which are computed from the local index .I plus qq except when a new group starts in the next row.

idx2

[1]  2  3  9  9  9  9  9  9 NA

The formula to compute return is taken from help("Delt", "quantmod"):

Arithmetic differences are used by default: Lag = (x2(t) - x1(t-k))/x1(t-k)

Therefore, it is unclear to me why OP's expected result is different. See also chinsoon's comment.

Calculating stock's return by a variable interval in data.table

2 Answers