Stata - Reshaping data from long to wide for analysis

Question

I have data in the following form

firm    month_year    sales    competitor   competitor_location      competitor_branch_1       competitor_branch_2
  1       1_2014    25          XYZ             US                      EEE                       RRR
  1       2_2014    21          XYZ             US                      FFF
  1       2_2014    21          ABC             UK                      GGG
 ...
  21     1_2009    11          LKS            UK                       AAA
  21     1_2009    11          AIS            UK                       BBB
  21     1_2009    11          AJS            US                       CCC
  21     2_2009    12          LKS            UK                       AAA

I still want an entry for every firm at the month_year level but do not want separate rows for other variables, just columns. I am trying to turn it into this format.

firm    month_year    sales    competitor_1   competitor_2     competitor_3        competitor_1_location     competitor_2_location     competitor_3_location            competitor_1_branch_1        competitor_2_branch_1           competitor_3_branch_1           competitor_1_branch_2        competitor_2_branch_2           competitor_3_branch_2                  competitor_1_branch_3        competitor_2_branch_3           competitor_3_branch_3

I thought reshape wide sales competitor competitor_location competitor_branch_1 competitor_branch_2, i(firm) j(month_year)

As you have it, it's not possible. Your j variable must be such that it uniquely identifies observations within firm groups. Firm 21, for example, has several observations with the same date, so the reshape won't work. Your example implies that you're trying to fill in one "cell" of the data matrix with more than one observation. Stata won't accept that. — Roberto Ferrer
So, do the exercise of "manually" doing the reshape and you will see the difficulty. You've put no values in the desired format. The exercise is precisely about that. Simpler, you can do the exercise with the very small database found in help reshape. — Roberto Ferrer
@RobertoFerrer Thank you. I have successfully done reshape with easier examples. What would you suggest for this case? — CJ12
The problem is that the reshape you say you want to do is impossible. You could make it so you have only one observation per firm (not firm date). But the layout of your data, and its usefulness, is something you have to determine according to the goal. I'll post an example with what I've mentioned. Usually, I'd go with data in long form, unless the estimation method forces me to go the other way. — Roberto Ferrer

Roberto Ferrer Roberto Ferrer · Accepted Answer · 2014-08-06T01:48:25

Most of the code is just setting up the example data (however inefficient that may be). The encodes I believe are not necessary, but recommended.

The code gives only one observation per firm (as stated in my comment).

clear all
set more off

*----- example data -----

input ///
firm    str7 month_year    sales    str3 competitor   str3 competitor_location  str3 competitor_branch_1       str3 competitor_branch_2
  1       "1_2014"    25          "XYZ"            "US"                     "EEE"                      "RRR"
  1       "2_2014"    21          "XYZ"            "US"                    "FFF"
  1       "2_2014"    21          "ABC"            "UK"                     "GGG"
  21     "1_2009"    11          "LKS"           "UK"                      "AAA"
  21     "1_2009"    11          "AIS"            "UK"                      "BBB"
  21     "1_2009"    11          "AJS"            "US"                      "CCC"
  21     "2_2009"    12          "LKS"            "UK"                      "AAA"
end

encode competitor, gen(comp)
encode competitor_location, gen(comploc)
encode competitor_branch_1, gen(compbr1)
encode competitor_branch_1, gen(compbr2)

gen date = ym( real(substr(month_year,3,.)), real(substr(month_year,1,1)) )
format date %tm

drop competitor* month*

list

*----- what you want ?? -----

bysort firm: gen j = _n // this sorting is not unique

reshape wide date sales comp comploc compbr1 compbr2, i(firm) j(j)

Stata - Reshaping data from long to wide for analysis

1 Answers