
I have data like this:

year  month  X  Y  weight
2013  1      1  0  1000
2001  12     0  1  2000

I want to create a variable Z based on the X and Y variables, conditional on year. I have two formulas, one for years before 2002 and one for years after. If I use egen with if:

if year > 2002 {
    bysort year month: egen Z = total(x*weight)
}
else {
    bysort year month: egen Z = total(y*weight*0.5)
}

This code is not going to work because, when year < 2002, Stata would report that Z has already been created. Is there any way to achieve this goal?

I used a very crude, brute-force way to solve this problem: I created two variables for Z, namely z and z_2002, and then replaced z with z_2002 if the year is less than 2002.
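A minimal sketch of that workaround, for reference. It assumes lowercase variable names (x, y), and it assumes the cutoff follows the code above, so 2002 itself gets the second formula; z_2002 is only a helper variable.

```stata
clear
input year month x y weight
2013 1  1 0 1000
2001 12 0 1 2000
end

* one variable per formula, each computed for every year-month group
bysort year month: egen z      = total(x * weight)
bysort year month: egen z_2002 = total(y * weight * 0.5)

* use the second formula for 2002 and earlier, then drop the helper
replace z = z_2002 if year <= 2002
drop z_2002

list
```

This gives the right numbers, but it needs an extra variable and an extra pass over the data, which is why it feels inelegant.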

2
The "already created" error is not your only problem. You've incorrectly used the programming ifcmd rather than an if qualifier (help if). See stata.com/support/faqs/programming/… Only the qualifying if can be used to operate on a subset of observations. - Steve Samuels
Let's start again. What exactly do you want? Is it the following: 1. There are two rules one for before 2002, and one for after. 2. Z is the yearly sum of the monthly functions of x or y dependent on the year. Is this correct? - D3L
z is the monthly sum for each month and year. and the formula for z is different depending on the year. - Yan Song
@YanSong It seems you have answered your own question by editing the original question. It is great that you have answered your own question. I am not sure if this still counts as a question. - Francis Smart
@fsmart As I said in the question, my method is not elegant enough. I am hoping someone would have a better approach - Yan Song

2 Answers

0
votes
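clear
input year month x y weight
2013 1  1 0 1000
2001 12 0 1 2000
end

* years after 2002: compute z on a copy of the data and save it to a tempfile
preserve
keep if year > 2002
bysort year month: egen z = total(x*weight)
tempfile t1
save `t1'
restore

* 2002 and earlier: compute z with the other formula, then add back the later years
keep if year <= 2002
bysort year month: egen z = total(y*weight*0.5)
append using `t1'
list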
2
votes

If I understand correctly, this should work.

Compute the products in a first step (conditional on the year) and the sums in a second step.

As other answers already note, there's a difference between the if qualifier and the if programming command. There's a short FAQ on this: http://www.stata.com/support/faqs/programming/if-command-versus-if-qualifier/.
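A small sketch of that difference, using a toy dataset and two hypothetical flag variables (flag_cmd and flag_qual are made up for illustration). The if command checks its condition once, using the first observation, while the if qualifier checks it for every observation.

```stata
clear
input year
2013
2001
end

* if command: the condition is checked once, as year[1] > 2002
if year > 2002 {
    gen byte flag_cmd = 1    // runs for all observations, including 2001
}

* if qualifier: the condition is checked for each observation
gen byte flag_qual = 1 if year > 2002    // missing where year <= 2002

list
```

Here flag_cmd is 1 in both rows because the first row happens to be 2013, whereas flag_qual is 1 only in the 2013 row.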

(I use code provided by @NickCox in a comment to another answer.)

clear all
set more off

*----- example data -----

input year month x    y    weight
2013    1   1    0    1000
2013    1   1    0    800
2013    2   0    1    1200
2013    2   1    0    1400
2001    12  1     0    1500
2001    12  0     1    2000
2001    11  1     1    4000
end

sort year month
list, sepby(year month)

*----- computations -----

gen Z = cond(year > 2002, x * weight, y * weight * 0.5)
bysort year month: egen totZ = total(Z) // already sorted so -by- should be enough

list, sepby(year month)
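
As a possible variation, the two steps could be collapsed into one, since egen's total() accepts an expression and not just a variable name. This is a sketch on the same example data, not something tested beyond it.

```stata
clear
input year month x y weight
2013 1  1 0 1000
2013 1  1 0 800
2013 2  0 1 1200
2013 2  1 0 1400
2001 12 1 0 1500
2001 12 0 1 2000
2001 11 1 1 4000
end

* pick the formula per observation inside total(), then sum within year-month
bysort year month: egen totZ = total(cond(year > 2002, x * weight, y * weight * 0.5))

list, sepby(year month)
```

Keeping the intermediate Z, as in the two-step version, makes the per-observation products easier to inspect. The one-liner simply avoids the extra variable.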