I was trying to use the following code to do calculate cumulative return ():
retain MIDPRICE CUM_RETURN;
LAG_MIDPRICE = lag(MIDPRICE);
LAG_CUMRETURN = lag(CUM_RETURN);
return_sec = (MIDPRICE - LAG_MIDPRICE) / LAG_MIDPRICE;
if first.symbol then CUM_RETURN = 0;
else CUM_RETURN = return_sec + LAG_CUMRETURN;
However, with the if and else statement, the SAS is skipping a row:
+--------+----------+------+--------------+--------------+-------------+----------+----------------+
| SYMBOL | DATE | time | CUM_RETURN | return_sec | RMIN | one_M | MK_RETURN_RATE |
+--------+----------+------+--------------+--------------+-------------+----------+----------------+
| A | 20130108 | 1 | 0 | | 0.00023571 | 1.90E-11 | 3.130243764 |
| A | 20130108 | 2 | | -0.00117855 | 0.000235988 | 1.90E-11 | 0.000274509 |
| A | 20130108 | 3 | 0.000471976 | 0.000471976 | 0.000235877 | 1.90E-11 | 6.86083E-05 |
| A | 20130108 | 4 | | -0.000471754 | 0.000235988 | 1.90E-11 | 6.86036E-05 |
| A | 20130108 | 5 | -0.000471976 | -0.000943953 | 0.000236211 | 1.90E-11 | 6.85989E-05 |
| A | 20130108 | 6 | | -0.002362112 | 0.000236771 | 1.90E-11 | 0 |
| A | 20130108 | 7 | 0.000711876 | 0.001183852 | 0.000236491 | 1.90E-11 | -0.000137188 |
| A | 20130108 | 8 | | 0.001300698 | 0.000236183 | 1.90E-11 | 0 |
| A | 20130108 | 9 | 0.000711876 | 0 | 0.000236183 | 1.90E-11 | 0 |
| A | 20130108 | 10 | | 0 | 0.000236183 | 1.90E-11 | 0.000137207 |
| A | 20130108 | 11 | 0.000711876 | 0 | 0.000236183 | 1.90E-11 | 0.000137188 |
| A | 20130108 | 12 | | 0.000590458 | 0.000236044 | 1.90E-11 | 6.85848E-05 |
| A | 20130108 | 13 | 0.000711876 | 0 | 0.000236044 | 1.90E-11 | 0 |
| A | 20130108 | 14 | | -0.000118022 | 0.000236072 | 1.90E-11 | -0.0003429 |
| A | 20130108 | 15 | 0.000711876 | 0 | 0.000236072 | 1.90E-11 | -0.000068604 |
+--------+----------+------+--------------+--------------+-------------+----------+----------------+
As you can see, I want CUM_RETURN = return_sec + lag(CUM_RETURN), but it seems that now it is doing CUM_RETURN = return_sec + lag(lag(CUM_RETURN)).
I am aware of the issue that you cannot write lag directly in if and else conditions, that is why i used a LAG variable before the if else condition. But it seems that it is still working in a weird way ...
Besides, if i delete the if statement and do
if first.symbol then CUM_RETURN = 0;
CUM_RETURN = return_sec + LAG_CUMRETURN;
The entire column of CUM_RETURN just becomes empty ...