This didn't get much of a response, for several reasons:
You didn't show very much code.
You didn't show data in a form that is especially useful. An image can't be copied and pasted easily into someone's Stata to allow experiment. In fact your image shows variables that are irrelevant and variables that are different versions of each other and so is much more complicated than we need.
You escalated the question to ask the most complicated version of what you want to know.
There is a problem you should have explained better. area
is string and so totals can't be calculated at all and area2
is just arbitrary integers so totals can be calculated but don't make sense. "nothing works" is not informative as a problem report. The only totals that make sense to me are totals of value
.
You need to simplify your problem first and then build up.
The essence seems to be as follows:
* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 country str6 item float year str1 region float value
"A" "barley" 1999 "X" 1
"B" "barley" 1999 "X" 2
"C" "barley" 1999 "Y" 3
"A" "barley" 2000 "X" 4
"B" "barley" 2000 "X" 5
"C" "barley" 2000 "Y" 6
"A" "barley" 2001 "X" 7
"B" "barley" 2001 "X" 8
"C" "barley" 2001 "Y" 9
end
* means by countries: similar variables for other periods
egen mean_9901_c = mean(cond(inrange(year, 1999, 2001), value, .)), by(country item)
* aggregation to regions, but ensure that you don't double count
egen value_region = total(value), by(region item year)
egen tag = tag(region item year)
* means by regions: similar variables for other periods
egen mean_9901_r = mean(cond(tag == 1 & inrange(year, 1999, 2001), value_region, .)), by(region item)
list, sepby(year)
+---------------------------------------------------------------------------------+
| country item year region value mean_9~c value_~n tag mean_9~r |
|---------------------------------------------------------------------------------|
1. | A barley 1999 X 1 4 3 1 9 |
2. | B barley 1999 X 2 5 3 0 9 |
3. | C barley 1999 Y 3 6 3 1 6 |
|---------------------------------------------------------------------------------|
4. | A barley 2000 X 4 4 9 1 9 |
5. | B barley 2000 X 5 5 9 0 9 |
6. | C barley 2000 Y 6 6 6 1 6 |
|---------------------------------------------------------------------------------|
7. | A barley 2001 X 7 4 15 1 9 |
8. | B barley 2001 X 8 5 15 0 9 |
9. | C barley 2001 Y 9 6 9 1 6 |
+---------------------------------------------------------------------------------+
The example shows just one item, but the code should work for several.
The example shows fake data for just three years, but means for other periods can be constructed similarly.
Results are repeated for all observations to which they apply. To see or use results just once, use if
. For example the means over 1999 to 2001 are shown for each of those years (and others) but if year == 1999
would be a way to see results just once.
See also help collapse
, help egen
for its tag()
function and this paper.
What was wrong with your code
Your problems start with
if area=="Russia" & area=="Ukraine"
which selects observations for which it is true that area
is both "Russia" and "Ukraine" in the same observation, which is impossible. You need the |
(or) operator there, not the &
operator, or to approach the problem in another way.
The prefix id
is wrong too. Using by id:
enforces separate calculations for different values of id
and is going to make the combinations of identifiers impossible.