0
votes

I have a dataset with different observations on the same day. The problem is that sometimes (at weekends or public holidays) I don`t have observations.

I now want to "create" new observations for the missing weekend days which have values of the last available day. Any ideas how to solve this?

My idea was to use the expand command but I don't know how to make sure to tackle all existing observations and keep the "new" date variable.

This is an short extract of my dataset:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str15 id str45 Name float(date2 abn_ret) str52 datecountr
"FRANL0000235190" "AIRBUS "                                       21917     1.2675284 "2020-01-03France"
"FRAFR0000120503" "BOUYGUES "                                     21917     -.6827821 "2020-01-03France"
"FRAFR0000120321" "L'OREAL "                                      21917    -.12522952 "2020-01-03France"
"FRAFR0000131104" "BNP PARIBAS "                                  21917     -.5212415 "2020-01-03France"
"FRAFR0000130809" "SOCIETE GENERALE "                             21917     -.1099271 "2020-01-03France"
"FRAFR0000121667" "ESSILORLUXOTTICA "                             21917    -.09078185 "2020-01-03France"
"FRAFR0000130650" "DASSAULT SYSTEMES "                            21917     1.1218133 "2020-01-03France"
"FRAFR0000120172" "CARREFOUR "                                    21917     .50933045 "2020-01-03France"
"FRAFR0000124141" "VEOLIA ENVIRONNEMENT "                         21917      .8477103 "2020-01-03France"
"FRAFR0000073272" "SAFRAN "                                       21917      1.201697 "2020-01-03France"
"FRAFR0010220475" "ALSTOM "                                       21917      .4248211 "2020-01-03France"
"FRAFR0000121485" "KERING "                                       21917      .9018646 "2020-01-03France"
"FRAFR0000121014" "LVMH "                                         21917     .23757583 "2020-01-03France"
"FRAFR0000051732" "ATOS "                                         21917      .1610538 "2020-01-03France"
"FRAFR0000133308" "ORANGE "                                       21917      .7512742 "2020-01-03France"
"FRAFR0000121329" "THALES SA "                                    21917      1.810375 "2020-01-03France"
"FRAFR0000130577" "PUBLICIS GROUPE "                              21917      .8140349 "2020-01-03France"
"FRAFR0013326246" "UNIBAIL RODAMCO WE STAPLED UNITS "             21917     1.0650123 "2020-01-03France"
"FRAFR0000120644" "DANONE "                                       21917      .9038228 "2020-01-03France"
"FRAFR0000131906" "RENAULT "                                      21917     -.8272043 "2020-01-03France"
"FRAFR0000120578" "SANOFI "                                       21917      .5567737 "2020-01-03France"
"FRAFR0010208488" "ENGIE SA "                                     21917     .28283036 "2020-01-03France"
"FRAFR0000120073" "L AIR LQE.SC.ANYME. POUR L ETUDE ET L EPXTN. " 21917     -.1608524 "2020-01-03France"
"FRAFR0000125007" "SAINT GOBAIN "                                 21917    -1.0007842 "2020-01-03France"
"FRAFR0000125486" "VINCI "                                        21917    -.15338984 "2020-01-03France"
"FRALU1598757687" "ARCELORMITTAL SA "                             21917     -2.524267 "2020-01-03France"
"FRANL0000226223" "STMICROELECTRONICS "                           21917    -.14752163 "2020-01-03France"
"FRAFR0000120693" "PERNOD RICARD"                                 21917      .6558701 "2020-01-03France"
"FRAFR0011981968" "WORLDLINE "                                    21917     1.1493022 "2020-01-03France"
"FRAFR0000121261" "MICHELIN "                                     21917      -.122634 "2020-01-03France"
"FRAFR0000120271" "TOTAL "                                        21917      1.696605 "2020-01-03France"
"FRAFR0000052292" "HERMES INTL. "                                 21917      .8386641 "2020-01-03France"
"FRAFR0000051807" "TELEPERFORMANCE "                              21917      .4268423 "2020-01-03France"
"FRAFR0000121972" "SCHNEIDER ELECTRIC "                           21917     .07140999 "2020-01-03France"
"FRAFR0000121501" "PEUGEOT SA "                                   21917     -1.284413 "2020-01-03France"
"FRAFR0000125338" "CAPGEMINI "                                    21917      .7670811 "2020-01-03France"
"FRAFR0000045072" "CREDIT AGRICOLE "                              21917     .11521439 "2020-01-03France"
"FRAFR0000120628" "AXA "                                          21917      .6657857 "2020-01-03France"
"FRAFR0010307819" "LEGRAND "                                      21917     .13902164 "2020-01-03France"
"FRAFR0000127771" "VIVENDI "                                      21917     .52316546 "2020-01-03France"
""                ""                                              21918             . "2020-01-04France"
""                ""                                              21919             . "2020-01-05France"
"FRAFR0000127771" "VIVENDI "                                      21920    -1.7643152 "2020-01-06France"
"FRALU1598757687" "ARCELORMITTAL SA "                             21920    -1.8255613 "2020-01-06France"
"FRAFR0000121485" "KERING "                                       21920    -1.8539088 "2020-01-06France"
"FRAFR0000130577" "PUBLICIS GROUPE "                              21920      .7106112 "2020-01-06France"
"FRAFR0000133308" "ORANGE "                                       21920     -.8003055 "2020-01-06France"
"FRAFR0000120628" "AXA "                                          21920    -1.4539047 "2020-01-06France"
"FRAFR0000120321" "L'OREAL "                                      21920     -.6811058 "2020-01-06France"
"FRAFR0010307819" "LEGRAND "                                      21920    -.28875658 "2020-01-06France"
"FRAFR0010220475" "ALSTOM "                                       21920      -1.14686 "2020-01-06France"
"FRAFR0000121014" "LVMH "                                         21920    -1.2367407 "2020-01-06France"
"FRAFR0000125007" "SAINT GOBAIN "                                 21920     -.9835446 "2020-01-06France"
"FRAFR0000121972" "SCHNEIDER ELECTRIC "                           21920     -.4302288 "2020-01-06France"
"FRAFR0000051807" "TELEPERFORMANCE "                              21920     -.4520428 "2020-01-06France"
"FRAFR0000120503" "BOUYGUES "                                     21920     -1.765694 "2020-01-06France"
"FRAFR0000125338" "CAPGEMINI "                                    21920    -1.5433754 "2020-01-06France"
"FRAFR0000052292" "HERMES INTL. "                                 21920     -.9977172 "2020-01-06France"
"FRAFR0000131906" "RENAULT "                                      21920   -.014738875 "2020-01-06France"
"FRAFR0000121667" "ESSILORLUXOTTICA "                             21920     -.2962905 "2020-01-06France"
"FRAFR0000120172" "CARREFOUR "                                    21920     -.2928175 "2020-01-06France"
"FRAFR0000130809" "SOCIETE GENERALE "                             21920    -1.0318313 "2020-01-06France"
"FRAFR0000120693" "PERNOD RICARD"                                 21920    -.14599076 "2020-01-06France"
"FRAFR0000073272" "SAFRAN "                                       21920    -1.1589345 "2020-01-06France"
"FRAFR0000130650" "DASSAULT SYSTEMES "                            21920     -.6271888 "2020-01-06France"
"FRANL0000235190" "AIRBUS "                                       21920     -.4630721 "2020-01-06France"
"FRAFR0000125486" "VINCI "                                        21920     -.7578011 "2020-01-06France"
"FRAFR0000121329" "THALES SA "                                    21920      .9807022 "2020-01-06France"
"FRAFR0000121501" "PEUGEOT SA "                                   21920    -1.8891108 "2020-01-06France"
"FRAFR0000120578" "SANOFI "                                       21920      .3342294 "2020-01-06France"
"FRAFR0000120073" "L AIR LQE.SC.ANYME. POUR L ETUDE ET L EPXTN. " 21920     -.9298104 "2020-01-06France"
"FRAFR0000051732" "ATOS "                                         21920    -1.1323103 "2020-01-06France"
"FRAFR0011981968" "WORLDLINE "                                    21920    -.12535486 "2020-01-06France"
"FRAFR0000120271" "TOTAL "                                        21920    -.10319567 "2020-01-06France"
"FRAFR0000121261" "MICHELIN "                                     21920     -.2037876 "2020-01-06France"
"FRAFR0013326246" "UNIBAIL RODAMCO WE STAPLED UNITS "             21920    -2.0658984 "2020-01-06France"
"FRANL0000226223" "STMICROELECTRONICS "                           21920     -2.323978 "2020-01-06France"
"FRAFR0000120644" "DANONE "                                       21920   -.018071305 "2020-01-06France"
"FRAFR0000124141" "VEOLIA ENVIRONNEMENT "                         21920     -.7432131 "2020-01-06France"
"FRAFR0010208488" "ENGIE SA "                                     21920     -.1336275 "2020-01-06France"
"FRAFR0000131104" "BNP PARIBAS "                                  21920     -.7529884 "2020-01-06France"
"FRAFR0000045072" "CREDIT AGRICOLE "                              21920    -1.2134135 "2020-01-06France"
"FRAFR0010307819" "LEGRAND "                                      21921    -.18060634 "2020-01-07France"
"FRANL0000226223" "STMICROELECTRONICS "                           21921      2.992747 "2020-01-07France"
"FRAFR0000121501" "PEUGEOT SA "                                   21921       .955547 "2020-01-07France"
"FRAFR0000120321" "L'OREAL "                                      21921     -.9054412 "2020-01-07France"
"FRAFR0000121329" "THALES SA "                                    21921 -.00028915447 "2020-01-07France"
"FRAFR0000127771" "VIVENDI "                                      21921     .12620287 "2020-01-07France"
"FRAFR0000120271" "TOTAL "                                        21921     -.5725648 "2020-01-07France"
"FRAFR0000120073" "L AIR LQE.SC.ANYME. POUR L ETUDE ET L EPXTN. " 21921    -.11466576 "2020-01-07France"
"FRAFR0000121667" "ESSILORLUXOTTICA "                             21921      .9910967 "2020-01-07France"
"FRANL0000235190" "AIRBUS "                                       21921     -.7487333 "2020-01-07France"
"FRAFR0000051732" "ATOS "                                         21921      4.276569 "2020-01-07France"
"FRAFR0000124141" "VEOLIA ENVIRONNEMENT "                         21921     -.6024734 "2020-01-07France"
"FRAFR0000121972" "SCHNEIDER ELECTRIC "                           21921     -.0850703 "2020-01-07France"
"FRAFR0000120693" "PERNOD RICARD"                                 21921    -.08321261 "2020-01-07France"
"FRAFR0000131104" "BNP PARIBAS "                                  21921    -.04950584 "2020-01-07France"
"FRAFR0000125338" "CAPGEMINI "                                    21921      .4089132 "2020-01-07France"
"FRAFR0011981968" "WORLDLINE "                                    21921     1.6101203 "2020-01-07France"
"FRAFR0000120578" "SANOFI "                                       21921   -.012580408 "2020-01-07France"
end
format %tdCCYY-NN-DD date2
1

1 Answers

0
votes

The best approach to missing data for weekends and public holidays is to create a business calendar so that, in effect, time stops then. In Stata, help bcal.

What you have in mind is problematic:

  1. The sample size is spuriously inflated by adding about 40% to the number of observations.

  2. Any assumption about interpolation is arbitrary and artificial. Last observed value carried forward? Linear interpolation? Cubic interpolation? Spline interpolation?

  3. There are side-effects for any model fit or inference in that variability and dependence structure are partly fudged.

That said, the general pattern for what you want is

egen numid = group(id), label 
tsset numid date2 
tsfill 
bysort numid (date2) : replace abn_ret = abn_ret[_n-1] if missing(abn_ret)