0
votes

I am working with a data set covering multiple countries, variables, and years. It is currently organized wide like so (actually ~30 years and 5 different variables for each country):

country measure yr1995 yr1996 yr1997
USA A 5 4 1
USA B 1 2 1
USA C 0 4 2
UK A 2 4 9
UK B 2 8 4
UK C 2 4 1

What I would like is for the data to be rearranged long like so:

country year A B C
USA 1995 5 1 0
USA 1996 4 2 4
USA 1997 1 1 2
UK 1995 2 2 2
UK 1996 4 8 4
UK 1997 9 4 1

I tried using reshape long yr, i(country) j(year) but get the following error message:

variable id does not uniquely identify the observations
    Your data are currently wide.  You are performing a reshape long.  You specified i(country) and j(year).  In
    the current wide form, variable country should uniquely identify the observations.

I think this is because country is not the only long variable? (measure also is?)

Besides fixing that issue and arranging the years long instead of wide, I don't think this command will accomplish the other task of moving the different variables (A, B, C) into the wide format as column headers.

Will I need to use a separate reshape wide command for that? Or is there some way to expand the command to do both at once?

1

1 Answers

0
votes

It's a double reshape. At least it can be done that way; and, further, that seems essential because years need to be long, not wide, and the measure(s) need to be wide, not long, so there are flavours of both problems.

Economic development data often arrive like this. Indeed the problem has given rise to at least one dedicated short paper in the Stata Journal, but visible to all.

Your data example is helpful, and almost immediately useful, but please read the Stata tag and help dataex (if necessary, install dataex first using ssc install dataex).

See also this FAQ, which includes some hints beyond the Stata help and manual entry.

A search reshape in Stata would have pointed to these resources.

clear
input str3 country str1 measure yr1995 yr1996 yr1997
USA A 5 4 1
USA B 1 2 1
USA C 0 4 2
UK A 2 4 9
UK B 2 8 4
UK C 2 4 1
end 

reshape long yr, i(country measure) j(year) 

reshape wide yr, i(country year) j(measure) string 

rename (yr*) * 

list, sepby(country)

     +----------------------------+
     | country   year   A   B   C |
     |----------------------------|
  1. |      UK   1995   2   2   2 |
  2. |      UK   1996   4   8   4 |
  3. |      UK   1997   9   4   1 |
     |----------------------------|
  4. |     USA   1995   5   1   0 |
  5. |     USA   1996   4   2   4 |
  6. |     USA   1997   1   1   2 |
     +----------------------------+