
I am trying to count non-missing values subject to a varying if condition, and then take the maximum for each month.

gen xx1=.
gen xx2=.

forvalues i = 1/12{
bys state year month: replace xx1= 1 if month==`i' & no_monthsreport>=`i'
bys state year month: replace xx2= sum(!missing(xx1))
bys state year month: egen tot_xx3 =max(xx2)
}

I have noticed that egen cannot overwrite an existing variable, so the loop fails on its second pass. Is there a way of doing this without creating more variables?


1 Answer


The immediate answer is that egen does not have a replace option, nor is there a replace-type command corresponding to egen. So you would need to drop or rename any previous result that has the same variable name as the one you want to create with egen.
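As a minimal sketch of that workaround, using the variable names from the question: capture swallows the error that drop would otherwise raise when tot_xx3 does not exist yet, so the same two lines can be rerun safely.

```stata
* Remove any earlier copy of the result; capture ignores the
* "variable not found" error on the first run.
capture drop tot_xx3

* egen can now create the variable afresh.
bysort state year month: egen tot_xx3 = max(xx2)
```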

In this problem, however, egen is not needed anyway, and the loop looks misplaced too. I don't fully understand what you want to do, but I think you want something more like

gen xx1 = .
forvalues i = 1/12 {
    replace xx1 = 1 if month == `i' & no_monthsreport >= `i'
} 
bys state year month: gen xx2 = sum(xx1)
bys state year month: gen tot_xx3 = xx2[_N] 

Notice that

  1. A framework of by: is not needed for the calculation of xx1 as nothing depends on the surrounding group of observations.

  2. The calculation of the running or cumulative sum of xx1 can be done just once.

  3. By construction xx1 is either missing or 1. Hence xx1 is not missing precisely when it is 1. There is no need to fire up the missing() function and then negate it when you can count the 1s directly.

  4. The maximum value of a running sum of 1s is just the last value. (Missings are ignored by sum().)
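The points above can be checked on a small made-up dataset. The five observations below are invented purely for illustration, and check and xx1_alt are hypothetical names that appear in neither the question nor the answer. The loop-free line is only a sketch that assumes month holds integers from 1 to 12.

```stata
clear
* Toy data, invented for illustration only
input state year month no_monthsreport
1 2020 1 3
1 2020 1 0
1 2020 2 1
1 2020 2 5
1 2020 2 2
end

* The code suggested in the answer
gen xx1 = .
forvalues i = 1/12 {
    replace xx1 = 1 if month == `i' & no_monthsreport >= `i'
}
bysort state year month: gen xx2 = sum(xx1)
bysort state year month: gen tot_xx3 = xx2[_N]

* Cross-check: the group total of xx1 should equal the last running sum
egen check = total(xx1), by(state year month)
assert tot_xx3 == check

* Hypothetical loop-free equivalent, assuming month is an integer in 1/12
gen xx1_alt = 1 if inrange(month, 1, 12) & no_monthsreport >= month
assert xx1 == xx1_alt

list, sepby(state year month)
```

Both assert statements should pass silently here. Note that Stata treats missing as larger than any number, so an observation with missing no_monthsreport satisfies the >= condition in either version.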

Whether you want calculations to be done separately by state, year and month is up to you, but that choice is often a source of bugs.