I am working on a research project that requires me to run a linear regression on the stock returns (of thousands of companies) against the market return for every single day between 1993 to 2014.
The data would be similar to (This is dummy data):
| Ticker | Time | Stock Return | Market Return |
|----------|----------|--------------|---------------|
| Facebook | 12:00:01 | 1% | 1.5% |
| Facebook | 12:00:02 | 1.5% | 2% |
| ... | | | |
| Apple | 12:00:01 | -0.5% | 1.5% |
| Apple | 12:00:03 | -0.3% | 2% |
The data volume is pretty huge. There are around 1.5 G of data for each day. There are 21 years of those data that I need to analyze and run regression on.
Regression formula is something similar to
Stock_Return = beta * Market_Return + alpha
where beta and alpha are two coefficients we are estimating. The coefficients are different for every company and every day.
Now, my question is, how to output the beta & alpha for each company and for each day into a data structure?
I was reading the SAS regression documentation, but it seems that the output is rather a text than a data structure.
The code from documentation:
proc reg;
model y=x;
run;
The output from the documentation:
There is no way that I can read over every beta for every company on every single day. There are tens of thousands of them.
Therefore, I was wondering if there is a way to output and extract the betas into a data structure?
I have background in OOP languages (python and java). Therefore the SAS can be really confusing sometimes ...
PROC DS2
which is the (new, under development) OOP version of the SAS language (replacingdata
steps). – Joe