I am trying to calculate the rolling standard deviation for some economic variable (let's call it X) over 10 previous years. I work with a panel data set, which contains 500 firms (id1-id500) and 20 years (1995-2015). I have to calculate the standard deviation for each year and each firm starting from the year 2006 on the rolling basis (rolling window over 10 years). I've manged to do this with the following code:
tsset id year
rolling sd_X=r(sd), window(10) keep(year) saving(sd_X, replace): sum X
merge 1:1 id year using "sd_X.dta"
This code works fine, but I have a small problem: X variable contains a lot of missing values and I need more than 3 observations for X variable to calculate standard deviations. My solution was to calculate standard deviations using -rolling- (as I showed before) and than replace all standard deviations with missing values, if there were calculated using less than 3 observations.
I tried to write a loop, which counts all non-missing values in the period of 10 previous years for each firm and year using rolling window, but unfortunately it didn't work well:
gen count1 = 0
forvalues i = 1995/2015 {
forvalue z = 1/500 {
count if year == `i' & id == `z' & X != .
replace count1 = count1[_n-1] + r(N)
}
}
As a result, I've got only missing values for count1. The results, which I want to get, has to look like this:
firm year X count1
z i
id1 1995 m.v. 0
........................
id1 2011 m.v. 0
id1 2012 1 0
id1 2013 1 1
id1 2014 21 2
id1 2015 4 3
........................
id2 1995 m.v. 0
........................
id2 2011 m.v. 0
id2 2012 m.v. 0
id2 2013 m.v. 0
id2 2014 42 1
id2 2015 46 2
and so on...
Does anybody know nice way to do this?