0
votes

Edit : I am working on a solution and have been quite successful so far. I will post it for future reference as soon as I have it completed and fully running.

I am studying stock returns and have been thinking about this problem for some time now. I am still new to SAS, which makes this particularly challenging for me.

I have the following dataset (simplified).

PERMNO DATE       RETURN   
10078  1995JAN02   0.012000
10104  1995JAN02   0.050000
10107  1995JAN02  -0.019000
10078  1995JAN03   0.002000
10104  1995JAN03   0.004000
10107  1995JAN03  -0.009000
...    
10078  1996JAN02  -0.011000  
10104  1996JAN02  -0.020000
10107  1996JAN02   0.010000 
...
10078  2013JAN03   0.050000
10104  2013JAN03   0.012000  
10107  2013JAN03  -0.034000 

PERMNO identifies a stock, DATE identifies a date (yyyymmmdd), RETURN is the daily stock return.

I have simplified this example for only three stocks (10078, 10104, 10107).

My goal is to have daily correlation matrices between the three different stocks with a one year observation window (252 past observations, roughly) outputted to daily files.

Given my dataset, this should start in 1996JAN03 (roughly). So, for the first day it should calculate the correlation coefficient (so PROC CORR) between all the observations between 1995JAN02 and 1996JAN02 of all the different PERMNO. i.e. Then I should be able to output the following matrix

      10078  10104  10107
10078     1    0.1    0.3
10104   0.1      1    0.2
10107   0.3    0.2      1

And for the next day, 1996JAN04, I should be able to output the same matrix using the observations between 1995JAN03 and 1996JAN03. It is important that the final procedure is able to handle non-matching data sets in DATE. i.e. If one stock only has prices after 1995JAN05, then it should only use those dates, and if one of the stocks does not have a record for 1995SET05, it should be able to ignore that observations and use only the remaining "matching" ones.

My question is quite similar to this one Calculating rolling correlations in SAS but my data is not organized as a matrix. I am quite tempted to apply the suggested approach (still stuyding it) but I would have to "convert" my data into a matrix - if that would be the most efficient way, I am willing to do it (but I would need help as I really do not know how to, although I can describe the intended output).

(Feel free to skip the following part if you have a solution) What I have thought to tackle this problem (but have not yet translated to SAS code) is (assuming n stocks):

  1. Count number of unique PERMNO (possible through PROC SQL);
  2. Define a starting date (1996JAN02);
  3. Create a macro that does the following steps (4-8):
  4. For each PERMNO, obtain an list of dates for that PERMNO after the starting date (possible through PROC SQL);
  5. For each DATE in step 4, iterate through the remaining stocks (since the output is a triangular matrix, I could go to the n - 1 - j remaining stocks, where j is the current stock from 1 to n) and create a table with return1 and return2 with matching dates between the (current date - 1) and (current date - 253 observations);
  6. Calculate PROC CORR for the table created in step 5.
  7. Save this output (somehow) to a dataset of that specific day and keep doing this until I have exhausted all days for this Stock after the starting date;
  8. Go on to the next PERMNO.
  9. Write the data sets through PROC EXPORT.

However, this is very much MATLAB-y or VBA-y, which are the languages I am familiar with, and certainly not the most efficient approach in SAS.

I thank you in advance for your help.

1
Have you licensed SAS/ETS? This sounds very much like it would be better off analyzing using ETS procedures (ETS = Time Series).Joe
I am unaware of what ETS is, I use SAS remotely on WRDS which is an academic database. Based on a quick search, I believe I can only "access" SAS.sasstudent
ETS is a SAS module. Try running a PROC EXPAND sample, like at this link and see if it works. Or run PROC SETINIT; RUN; and see if it lists SAS/ETS.Joe
I will try this as soon as I soon as I get back to my computer and report back. Thank you.sasstudent
Good news, I do have ETS! Could you help me further? I have never used it before but I'll search for documentation now.sasstudent

1 Answers

0
votes

Thanks to the pointer of ETS by @Joe and this paper on calculating correlations with PROC EXPAND, I was able not to reach the initial objective but something quite close to it - I can output a data set as follows (some fields taken out for simplification):

     date  permno1  permno2  ret1  ret2   rho
1996JAN02    10078    10104  0.01  0.02  0.34

To learn more about PROC EXPAND, I suggest the following paper.

So first I get a list of all existing permno in the table, then I iterate through them using a macro. Then I extract all the returns from the initial table from two permnos into a two tables (company1 and company2) and do a full join of these. For the dates where neither company has a return the record is deleted. For the dates where only one of the stocks was traded is set to missing for both stocks because that observation is useless to compute the correlation between the two stocks in any day. Then I apply the correlation formula through the PROC EXPAND, by calculating several new fields with the necessary computations (will update this soon with a better explanation). Having all the necessary fields, I compute rho, merge the resulting data set into an all-encompasing data set and loop to the next company.

This code takes roughly 2 minutes per company, given that there are ~970 companies it should take a day to fully run - but it does its job! If you have any comments on this code, I will be more than glad to hear them.

proc sql noprint;
    create table permnolist as
    select distinct permno
    from pricescorrel;
quit;

data permnolist;
    set permnolist;
    by permno;
    permnoid + 1;
run;

%MACRO correlpairs();

    proc sql noprint;
        select distinct max(permnoid)
        into :ttl_id
        from permnolist;
    quit;

    %do ii=1 %to &ttl_id;

        proc sql noprint;
            select distinct permno
            into :permnoa
            from permnolist
            where permnoid = ⅈ
        quit;

        %do jj=&ii %to &ttl_id;

            proc sql noprint;
                select distinct permno
                into :permnob
                from permnolist
                where permnoid = &jj;
            quit;

            proc sql noprint;
                create table company1 as
                select date as olddate, *
                from pricescorrel 
                where permno = &permnoa;
            quit;

            proc sql noprint;
                create table company2 as
                select date as olddate, *
                from pricescorrel 
                where permno = &permnob;
            quit;

            proc sql noprint;
                create table merged as
                select coalesce(a.olddate, b.olddate) as date,
                        a.permno as permno1,
                        b.permno as permno2,
                        a.ret as ret1,
                        b.ret as ret2
                from company1 as a
                    full join company2 as b
                    on a.olddate = b.olddate;
            quit;

            data merged;
                set merged;
                OBS = 1;
                if nmiss(of ret1 -- ret2) > 1 then
                    delete;
                else
                    if nmiss(of ret1 -- ret2) > 0 then
                        do;
                            ret1 = . ;
                            ret2 = . ;
                            OBS = . ;
                        end;
                R12 = ret1 * ret2;
            run;

            proc expand DATA = merged OUT = mergedout METHOD = none;
                convert ret1 = C1SUM / TRANSFORMOUT = (movsum 252);
                convert ret2 = C2SUM / TRANSFORMOUT = (movsum 252);
                convert R12 = C12SUM / TRANSFORMOUT = (movsum 252);
                convert OBS = N / TRANSFORMOUT = (movsum 252);
                convert ret1 = C1CSS / TRANSFORMOUT = (movcss 252);
                convert ret2 = C2CSS / TRANSFORMOUT = (movcss 252);
            run;

            data mergedout(keep=date obs N permno1 permno2 ret1 ret2 rho);
                set mergedout;
                if missing(rho) then
                    delete;
                rho = (C12SUM - (C1SUM * C2SUM)/N) / (sqrt(C1CSS) * sqrt(C2CSS));
                rho = lag(rho);
            run;

            proc datasets nolist;
                append base=mydat
                data=mergedout;
            quit;

            proc datasets library = work nolist;
                delete mergedout;
            run;
        %end;
    %end;
proc export
    data=mydat 
    outfile="mergedout.csv"  
    dbms=csv replace;
run;
%mend;
%correlpairs();