0
votes

I have a transaction level dataset and I want to collapse and calculate weekly average price. The dataset can be simplified as follows,

clear 
input str9 date quantity price id
"01jan2010" 50 70 1
"02jan2010" 60 80 2
"02jan2010" 70 90 3
"04jan2010" 70 95 4
"08jan2010" 60 81 5
"09jan2010" 70 88 6
"12jan2010" 55 87 7
"13jan2010" 52 88 8
end
gen date2=date(date,"DMY")
format date2 %td
drop date

I want to create a variable date3. For every transaction happened in a week, date3 is the Monday of that week.

Here's the code I have:

sort date2
gen date3=date2 if dow(date2)==1
replace date3=date3[_n-1] if missing(date3)
format date3 %td

However, there are Mondays with no transactions, but the rest of the week has transactions. In those cases, date3 is not the Monday date of that week, but Monday date in the weeks before.

My data becomes the following using the above code:

quantity    price   id  date2       date3
50          70  1   01jan2010   
60          80  2   02jan2010   
70          90  3   02jan2010   
70          95  4   04jan2010   04jan2010
60          81  5   08jan2010   04jan2010
70          88  6   09jan2010   04jan2010
55          87  7   12jan2010   04jan2010
52          88  8   13jan2010   04jan2010

To me, it does not matter if id =1,2,3 have no date3. What I am concerned is that id=7 and id=8 should have a date3 of 11jan2010. But because there is no transaction on that day, the date becomes 04jan2010. Is there a way to fix this?

(I was thinking of constructing a new dataset with consecutive dates since 01jan2010 and then merge with the one above, and then drop if missing quantity of price. But I was wondering if there's a more efficient way).


In addition, I have a weekly index data that reports on every Friday since 01jan2010. If I use wofd command, Stata will generate 53 weeks in 2010. (Or more precisely, two 2010w52.) How can I get just 52 weeks in Stata?

(I found this http://www.stata.com/statalist/archive/2012-02/msg01030.html but I still cannot figure out how this can help solve my problem. )

1
When working with weeks, Stata tip 68: Week assumptions by Nick Cox, is a good read.Roberto Ferrer
merge is not as bad as you think. I would encourage you to try that path and report any problems.Roberto Ferrer
In fact, you should report the complete attempt (i.e. include the code).Roberto Ferrer
Thank you @robertoferrer, this is really helpful!X. Shi

1 Answers

3
votes

Your weeks start on Mondays. Everything you need follows from using dow() to exploit the fact that in every one of your weeks, the day of week function dow() yields 1, 2, 3, 4, 5, 6, 0 for the days from Monday to Sunday.

The present or previous Monday for daily dates daily is just

 gen Monday = cond(dow(daily) == 0, daily - 6, daily - dow(daily) + 1) 

The branch is like this. If it's a Sunday, the previous Monday was 6 days ago. Otherwise, the Monday that starts the week was today if it's Monday and dow() yields 1, yesterday if it's Tuesday and 2, and so forth. Here the variable Monday is just the dates of Mondays that define the weeks.

Important detail: There are no assumptions here about dates being complete in the data or even in order.

Small note: Arbitrary names like date2 and date3 mean nothing much. Use evocative names in your questions (and your practice).

There was a sequel to the article mentioned by Robert Ferrer. search week, sj in Stata to get the references.

Do not use Stata's weeks and in particular do not use the wofd() function (not a command), as they can't help you. Stata's weeks will not map on to your weeks. The article mentioned by Robert Ferrer really is worthwhile reading to understand this (even though I wrote it).

(This is all explained in the Statalist threads you link to.)