Grouping SAS Date on Month

Question

I have a date field that displays in a month-year format, but the underlying value is still a SAS date number. Consequently, when I count on this field I get a separate row for each distinct underlying date value, and the rows are not grouped by month as I want them to be.

The data I have looks like this:

data beforehave;
   input ID  $ Activity $ Original_Start_Date;
   datalines;
   12345 Activity1 Oct-13
   12345 Activity1 Oct-13
   12345 Activity1 Nov-16
   12345 Activity2 Nov-16
   12345 Activity2 Nov-16
   23145 Activity1 Sep-15
   23145 Activity2 Sep-15
   23145 Activity2 Sep-15
;
RUN;

However, when I count the combinations by 'Original_Start_Date', I get this:

data beforehave;
   input ID  $ Activity $ Original_Start_Date Count_of_Original_Start_Date;
   datalines;
   12345 Activity1 Oct-13 1
   12345 Activity1 Oct-13 1
   12345 Activity1 Nov-16 1
   12345 Activity2 Nov-16 1
   12345 Activity2 Nov-16 1
   23145 Activity1 Sep-15 1
   23145 Activity2 Sep-15 1
   23145 Activity2 Sep-15 1
;
RUN;

What I want is this:

data beforehave;
   input ID  $ Activity $ Original_Start_Date Count_of_Original_Start_Date;
   datalines;
   12345 Activity1 Oct-13 2
   12345 Activity1 Nov-16 1
   12345 Activity2 Nov-16 2
   23145 Activity1 Sep-15 1
   23145 Activity2 Sep-15 2
;
RUN;

I had thought about converting the field to a character variable; however, it would be really useful to keep it as a date.

All I really want is to group a SAS date number by month.

How are you summarising the data? Procedures such as PROC FREQ and PROC MEANS automatically group by the formatted values, whereas a data step uses the underlying value (unless you use the groupformat option in a by statement). - Longfish

Longfish · Accepted Answer · 2016-08-25T12:29:21

As alluded to in my comment, here are two ways to achieve your goal. The easiest is proc summary, as it automatically groups by the formatted values. The second option is a data step with the groupformat option in the by statement; this requires a proc sort beforehand.

data have;
   input ID  $ Activity $10. Original_Start_Date :date7.;
   format Original_Start_Date monyy5.;
   datalines;
   12345 Activity1 01Oct13
   12345 Activity1 02Oct13
   12345 Activity1 03Nov16
   12345 Activity2 04Nov16
   12345 Activity2 05Nov16
   23145 Activity1 06Sep15
   23145 Activity2 07Sep15
   23145 Activity2 08Sep15
;
RUN;

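/* method 1: proc summary groups class variables by their formatted values */
proc summary data=have nway;
   class id activity original_start_date;
   /* _freq_ holds the number of rows in each group */
   output out=want1 (drop=_type_ rename=(_freq_=Count_of_Original_Start_Date));
run;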

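/* method 2: data step with groupformat; the input must be sorted first */
proc sort data=have;
   by id activity original_start_date;
run;

data want2;
   set have;
   /* groupformat makes first./last. follow the formatted (month) value */
   by id activity original_start_date groupformat;
   if first.original_start_date then Count_of_Original_Start_Date=0;
   Count_of_Original_Start_Date+1;   /* sum statement: the value is retained across rows */
   if last.original_start_date then output;
run;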
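
The comment under the question also mentions proc freq as a procedure that groups by formatted values. A minimal sketch of that route, assuming the have dataset created above; the output dataset name want3 is made up for illustration, and the format statement simply restates the monyy5. format already attached to the variable:

```sas
/* sketch: proc freq counts each ID/Activity/month combination */
proc freq data=have noprint;
   /* the monyy5. format makes all dates in the same month fall into one level */
   format Original_Start_Date monyy5.;
   /* out= writes the counts to a dataset; count and percent are created by proc freq */
   tables ID*Activity*Original_Start_Date
          / out=want3 (drop=percent rename=(count=Count_of_Original_Start_Date));
run;
```

As with the two methods above, Original_Start_Date stays a numeric SAS date in the output, so it can still be used in date calculations or given a different format later.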
