I am trying to run regressions by CompanyID and year, and save the coefficient from each firm-year model in a new variable alongside the existing columns. There is an additional wrinkle: I have panel data for 1990-2010 and want to run each regression using years t to t-4 only (i.e., for 2001, use only the 1998-2001 years of data; for 1990, use only the 1990 data, and so on). I am new to foreach loops and found some earlier code on the web. I have tried to adapt it to my situation, but I have two issues:

  1. The output stays blank.

  2. I have not figured out how to use the rolling four-year data periods.

Here is the code I tried. Any suggestions would be much appreciated.

use paneldata.dta // the dataset I am working in
generate coeff . //empty variable for coefficient
foreach x of local levels {
forval z = 1990/2010
{
capture reg excess_returns excess_market
replace coeff = _b[fyear] & _b[CompanyID] if e(sample) }
}

Below is a short snapshot of what the data look like:

CompanyID  Re_Rf  Rm_Rf  Year
10         2      2      1990
10         3      2      1991
15         3      2      1991
15         4      2      1992
15         5      2      1993
21         4      2      1990
21         4      2      1991
34         3      1      1990
34         3      1      1991
34         4      1      1992
34         2      1      1993
34         3      1      1994
34         4      1      1995
34         2      1      1996

 

Re_Rf = excess_returns 

Rm_Rf = excess_market 

I want to run the following regression:

reg excess_returns excess_market
Cross-posted to Statalist at statalist.org/forums/forum/general-stata-discussion/general/…user4690969

1 Answer

There is a good discussion on Statalist, but I think this answer may help you learn about loops and how Stata syntax works.

The code I would use is as follows:

generate coeff = . //empty variable for coefficient

// put the values of CompanyID into a local macro called levels
qui levelsof CompanyID, local(levels)

foreach co of local levels {
    forval yr = 1994/2010 {

        // run the regression with the condition that fyear is between yr-3
        // and yr (which is what you write in your example)
        // and CompanyID is the current firm in the loop
        qui reg Re_Rf Rm_Rf if fyear <= `yr' & fyear >= `yr'-3 & CompanyID== `co'

        // now replace coeff equal to the coefficient on Rm_Rf with the same 
        // conditions as above, but only for year yr
        replace coeff = _b[Rm_Rf] if fyear == `yr' & CompanyID == `co'
    }
}

This is potentially dangerous if you do not have a balanced panel: reg exits with an error when a firm has no observations in a given window, and that stops the loop. If you are worried about this, you could deal with it using capture, or by changing the fyear loop to something like:

levelsof fyear if CompanyID == `co', local(yr_level)
foreach yr of local yr_level { ...
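
For what it is worth, here is a rough sketch of how the capture and levelsof ideas above might be combined. It is untested against your data and assumes that CompanyID is numeric and that the variables are named Re_Rf, Rm_Rf and fyear as in the code above; the _rc check is my own addition, not something from the Statalist thread.

generate coeff = .    // empty variable for the coefficient
quietly levelsof CompanyID, local(levels)

foreach co of local levels {
    // loop only over the years this firm actually has in the data
    quietly levelsof fyear if CompanyID == `co', local(yr_level)
    foreach yr of local yr_level {
        // capture suppresses the error (and the output) if reg cannot be
        // estimated for this window, so the loop keeps going
        capture regress Re_Rf Rm_Rf if inrange(fyear, `yr' - 3, `yr') & CompanyID == `co'

        // _rc is 0 only if the regression ran without error;
        // otherwise coeff is left missing for this firm-year
        if _rc == 0 {
            quietly replace coeff = _b[Rm_Rf] if fyear == `yr' & CompanyID == `co'
        }
    }
}

Firm-years where the regression fails simply keep a missing coeff. A window in which Rm_Rf does not vary may still run without error (Stata would normally drop the variable as collinear), so you may also want to check e(N) or the window's contents before trusting the stored value.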