0
votes

I am trying to do a count on the number of births. the data looks this way

ID        date 
101     2016-01-01
101     2016-02-01
101     2016-02-01
102     2015-03-02
102     2016-04-01
103     2016-02-08

So now i want to create a count based on the date the output expected is this way

ID        date         count
101     2016-01-01      1
101     2016-02-01      2
101     2016-02-01      2
102     2015-03-02      1
102     2016-04-01      2
103     2016-02-08      1

I am trying to do it by first and last and also the count from proc sql but I am missing something here.

data temp; 
set temp; 

by ID DATE notsorted; 

if first.date then c=1; 
else c+1; 


if first.ID then m=1; 
else m+1; 

run;
5
Please explain your output table. I don't understand the logic behind your COUNT variable. If it is supposed to be a count by ID and date then your last two rows are wrong. - Jetzler
Hey oh yes, sorry wrongly entered, I corrected it now. - user3658367

5 Answers

2
votes

Another solution with your original approach

data x;
input id : 3. date : ddmmyy10.;
 FORMAT DATE ddmmyy10.; 
datalines;
101 01-01-2016
101 02-01-2016
101 02-01-2016
102 03-02-2015
102 04-01-2016
103 02-08-2016
;

run;

data x; 
set x; 

by ID DATE notsorted; 

if first.ID then c=0; /*reset count every time id changes*/
if first.date then c+1; /*raise count when date changes*/

run;

produces
enter image description here

0
votes

Do you absolutely require to use first?

I would use proc freq to achieve this

data have;
   infile datalines delimiter='09'x; 
   input ID $ date $10. ;
   datalines;
101 2016-01-01
101 2016-02-01
101 2016-02-01
102 2015-03-02
102 2016-04-01
103 2016-02-08
;run;

proc freq DATA=have NOPRINT;
    TABLES ID * date / OUT=want(drop=percent);
run;

creates this:

ID date count
101 2016-01-01  1
101 2016-02-01  2
102 2015-03-02  1
102 2016-04-01  1
103 2016-02-08  1
0
votes

If you want to reproduce COUNT in the datastep you will have to use the double DOW. The dataset is SET twice. First time to count rows by ID and date. Second time to output all rows.

data out;
  do _n_ = 1 by 1 until (last.date); 
    set test ;
    by ID date;
    if first.date then count = 1;
    else count + 1;
  end;
  do _n_ = 1 by 1 until (last.date); 
    set test ;
    by ID date;
    output;
  end;
run;
0
votes

You forget to add RETAIN statement in your data-step.

data temp; 
set temp; 
retain c m 0;

by ID DATE notsorted; 

if first.date then c=1; 
else c+1; 


if first.ID then m=1; 
else m+1; 

run;
0
votes

Okay, I have edited the previous code. Hopefully this will suit your needs. Just make sure your date variable is in numeric or calendar format so that you can sort your table by ID and date first.

data want;
set have;
by id date;

if first.date then count=0;
count+1;

run;