I have a transaction level dataset and I want to collapse and calculate weekly average price. The dataset can be simplified as follows,
clear
input str9 date quantity price id
"01jan2010" 50 70 1
"02jan2010" 60 80 2
"02jan2010" 70 90 3
"04jan2010" 70 95 4
"08jan2010" 60 81 5
"09jan2010" 70 88 6
"12jan2010" 55 87 7
"13jan2010" 52 88 8
end
gen date2=date(date,"DMY")
format date2 %td
drop date
I want to create a variable date3
. For every transaction happened in a week, date3
is the Monday of that week.
Here's the code I have:
sort date2
gen date3=date2 if dow(date2)==1
replace date3=date3[_n-1] if missing(date3)
format date3 %td
However, there are Mondays with no transactions, but the rest of the week has transactions. In those cases, date3
is not the Monday date of that week, but Monday date in the weeks before.
My data becomes the following using the above code:
quantity price id date2 date3
50 70 1 01jan2010
60 80 2 02jan2010
70 90 3 02jan2010
70 95 4 04jan2010 04jan2010
60 81 5 08jan2010 04jan2010
70 88 6 09jan2010 04jan2010
55 87 7 12jan2010 04jan2010
52 88 8 13jan2010 04jan2010
To me, it does not matter if id
=1,2,3 have no date3
. What I am concerned is that id
=7 and id
=8 should have a date3
of 11jan2010. But because there is no transaction on that day, the date becomes 04jan2010. Is there a way to fix this?
(I was thinking of constructing a new dataset with consecutive dates since 01jan2010 and then merge
with the one above, and then drop
if missing quantity of price. But I was wondering if there's a more efficient way).
In addition, I have a weekly index data that reports on every Friday since 01jan2010. If I use wofd
command, Stata will generate 53 weeks in 2010. (Or more precisely, two 2010w52.) How can I get just 52 weeks in Stata?
(I found this http://www.stata.com/statalist/archive/2012-02/msg01030.html but I still cannot figure out how this can help solve my problem. )
merge
is not as bad as you think. I would encourage you to try that path and report any problems. – Roberto Ferrer