I think this is probably a fairly basic question, but I am just learning SAS so nothing is intuitive.

I am trying to find the average number of observations per day for my dataset. The date is stored in the variable DOI in yyyy-mm-dd format. Each observation has a unique id variable, ID.

I am just trying to make a descriptive report to say what the average number of observations per day was in the time period of my dataset, what the max and min number of observations per day were, but I can't find an easy solution. Here is what I tried:

proc means data=have;
  var id;  * somehow I want to indicate the NUMBER of unique IDs rather than the ID itself;
  by doi;  * I do not want the average for each individual date, but the average observations per day;
run;

Obviously this didn't work, as it treated each ID as a number rather than a single observation, but I am unsure how to indicate that. Thank you so much! I assume this is quite simple, but please let me know if I should provide extra clarification.

Why mention the ID variable at all? If there are 10 observations for the same ID value on one day, should that count as 10 observations for that day? Or only one, since they are all for the same value of ID? – Tom

1 Answer

Before you can calculate the AVERAGE number of observations per day you first need to calculate the NUMBER of observations per day.

proc summary data=have nway;
  class doi ;
  output out=counts;
run;

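This will create a dataset with one row per date that has the variables DOI and _FREQ_ (plus the automatic _TYPE_ variable). _FREQ_ holds the number of observations for that date.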

Now you can calculate the MEAN (assuming that is the statistic you want to use for "average") and the MIN and MAX.

proc means data=counts mean min max;
  var _freq_;
run;
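
If you prefer a single step, the same count-then-summarize logic can be sketched in PROC SQL. This is only an alternative illustration, not a replacement for the steps above; the column names n, mean_per_day, min_per_day and max_per_day are made up for the example.

```sas
proc sql;
  /* Inner query: one row per date with its observation count.   */
  /* Outer query: summarize those daily counts.                  */
  select mean(n) as mean_per_day,
         min(n)  as min_per_day,
         max(n)  as max_per_day
  from (select doi, count(*) as n
        from have
        group by doi);
quit;
```

If repeated ID values on the same day should only count once, count(distinct id) in place of count(*) would count unique IDs per day instead of rows.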

BUT if there are days in your interval with no observations then you might need to do more work. The method above will NOT include such days in the denominator when calculating the average.
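
One way to handle that, as a rough sketch: build a dataset with every date in the interval, merge the counts onto it, and set the count to zero for dates that have no match. This assumes DOI is a numeric SAS date (not a character string) and that the interval runs from the first to the last date seen in the data. If your study period is wider, put your own start and end dates in the macro variables. The names first_day, last_day, calendar and counts_full are made up for the example.

```sas
/* Find the first and last date present in the counts. */
proc sql noprint;
  select min(doi) format=8., max(doi) format=8.
    into :first_day trimmed, :last_day trimmed
  from counts;
quit;

/* One row for every calendar day in the interval. */
data calendar;
  do doi = &first_day to &last_day;
    output;
  end;
  format doi yymmdd10.;
run;

/* Merge the counts on; days with no observations get a count of 0. */
data counts_full;
  merge calendar counts(keep=doi _freq_);
  by doi;
  if missing(_freq_) then _freq_ = 0;
run;

/* Same summary as before, now with empty days in the denominator. */
proc means data=counts_full mean min max;
  var _freq_;
run;
```

With this version the MIN will be 0 whenever at least one day in the interval has no observations.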