0
votes

I am working with a dataset that has purchases per date (called ItemNum) on multiple dates across 2800 individuals. Each Item is given its own line, so if an individual has purchased two items on a date, that date will appear twice. I don't care how many items were purchased on a date (with each date representing one trip), but rather the mean number of trips made across the 2800 individuals (For about 18230 lines of data). My data looks like this:

+---+----------+-------+---------------------- ---+
|ID |    Date  |ItemNum|        ItemDescript      |
| 1 |01/22/2010|   1   |Description of the item   |
| 1 |01/22/2010|   2   |Description of other item |
| 1 |07/19/2013|   1   |                          | 
| 2 |06/04/2012|   1   |                          |
| 2 |02/02/2013|   1   |                          |
| 2 |11/13/2013|   1   |                          |
+---+----------+-------+---------------------- ---+

In the above table, person 1 made two trips and three item purchases (because two dates are shown), person 2 made three trips. I am interested in the average number of trips across all people, but first I need to collapse it down to unique dates. So I know I need to collapse on the date, but when I do

collapse (mean) ItemNum (first) Date, by(ID)

it just takes the first date that the ID shows up, not the first occurrence of each unique date.

The next issue is that once it's collapsed, I need to take the mean of the count of the dates, not the date itself, which is also where I seem to be getting tripped up.

2
what are you trying to compute exactly? your post is rather confusing.ℕʘʘḆḽḘ
Your new question appears to call for another collapse.Nick Cox

2 Answers

1
votes

Or perhaps something like

clear
input ID str16 dt ItemNum
1 "01/22/2010"  1
1 "01/22/2010"  2
1 "07/19/2013"  1
end
generate Date = daily(dt,"MDY")
egen trip = tag(ID Date)
collapse (sum) trip, by(ID)
summarize trip

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        trip |          1           2           .          2          2

if what you are looking for is found in "Mean" - a single number giving the average number of trips made by the 2800 individuals (1 individual with the limited sample data given).

1
votes

are you trying to do the following?

collapse (mean) ItemNum, by(ID Date) fast