0
votes

a quick question, I have data of the following sort:

Ticker _ Date _ Fem Analyst (dummy 1 if true) ___ Variables of that month like beta

AA _ 01/04/2001 _ 1 ___ 0.61

AA _ 05/04/2001 _ 1 ___ 0.62

AA _ 08/04/2001 _ 1 ___ 0.63

AA _ 01/05/2002 _ 1 ___ 0.7

AA _ 04/05/2002 _ 1 ___ 0.71

AA _ 08/07/2002 _ 0 ___ 0.8

AA _ 07/04/2003 _ 1 ___ 0.4

and so on.. What I want to receive is the following:

Ticker _ Date Number of fem analyst Number of Male Analysts _ Total ___Variables

AA _ 04/2001 3 0 _ 3 ___ 0.63

AA _ 05/2002 2 0 _ 2 ___ 0.71

AA _ 07/2002 0 1 _ 1 ___ 0.8

AA _ 04/2003 1 0 _ 1 ___ 0.4

So a counting algorithm that allows me to count the number of female and male analyst for a certain company per month( using dummy variable gender 0 or 1) and deletes all observations for that month except the most recent one (for instance for 08/04/01 this becomes 04/01 with 0.63 which is the most recent observation for beta for 04/01 for company AA) The example explains it all I guess?

Any ideas?

1
My bad, didn't really thought of that. Went through my questions and checked every valid answer!John

1 Answers

1
votes

You may want something like this:

/* Create the month variable into a string YYYY/MM */
data analysts0;
    set <your data>;
    format month $7.;
    month=cats(year(date),'/',put(month(date),z2.));
run;
/* Sort so you can do the by processing required for counting */
proc sort data=analyst0 out=analyst1;
    /* You need to include the date in the sort so the most recent is last */
    by ticker month date;
run;
/* Count */
data count;
    retain n_fem n_male 0;
    set analyst1;
    by ticker month;
    if first.ticker of first.month then do;
        n_fem=0;
        n_male=0;
    end;
    else do;
       if gender=1 then n_fem+1;
       else if gender=0 then n_male+1;
       else put 'Huh?';
    end;
    /* this outputs only the values you need.*/
    if last.ticker or last.month then output;
run;

This should give you the general idea - I don't have access to SAS right now so I can't check the code. See the documentation for retain and by processing in the data step for more details.