Reshaping Data in Stata

Question

I have a dataset that is composed of research studies. Within some of the studies are multiple data points (DP). My data is structured so that each row is a separate data point. Additionally, I have a separate variable that denotes the specific research article.

I need summary statistics at the level of research studies, not DPs. In other words, I need each row to represent one research study, with its DPs reduced to a count.

I have tried the code below using contract. It works for the list command. However, I also need summary statistics, and I would like to summarize multiple variables and combine the results into one table once the data are organized.

contract study nation
drop _freq study
contract nation
list

EXAMPLE:

Raw Data:

Study	DP	Year	Nation
1	1	2005	Brazil
1	2	2005	Brazil
1	3	2005	Brazil
1	4	2005	France
2	5	2006	Brazil
2	6	2006	Italy
3	7	2010	Brazil
3	8	2010	Canada
4	9	2011	Canada
5	10	2015	Brazil
6	11	2015	Canada

What I need:

Year	f (of studies)
2005	1
2006	1
2010	1
2011	1
2015	2

And I also need a histogram of the above table.

Nation	f (of studies)
Brazil	4
Canada	3
France	1
Italy	1

I have more variables that need the same treatment, and they will need more than frequencies (e.g. mean, sd, var). So any solution also needs to work for summarizing variables.

Nick Cox · Accepted Answer · 2020-12-29T10:00:48

egen will help with summary statistics and graphs. Its tag() function marks exactly one observation in each group with 1 and all the others with 0, which lets you use each country just once.

Note that dataex in Stata is a better way to give a data example, as explained in the Statalist FAQ and under the Stata tag on this site.
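As a rough sketch of how such an example might be produced, assuming the data are already in memory with the variable names used in the question:

```stata
* Install once if the command is not already available
ssc install dataex

* Print the named variables as an -input- block that can be pasted into a post
dataex Study DP Year Nation
```

The output is a block of the kind shown next, which others can paste into Stata to recreate the data exactly.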

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte(Study DP) int Year str6 Nation
1  1 2005 "Brazil"
1  2 2005 "Brazil"
1  3 2005 "Brazil"
1  4 2005 "France"
2  5 2006 "Brazil"
2  6 2006 "Italy" 
3  7 2010 "Brazil"
3  8 2010 "Canada"
4  9 2011 "Canada"
5 10 2015 "Brazil"
6 11 2015 "Canada"
end

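* Tag exactly one observation per nation
egen tag = tag(Nation)

* Count nonmissing values of DP (data points) within each nation
egen count = count(DP), by(Nation)

* Histogram of those counts, using each nation once
histogram count if tag, discrete freq width(1) xla(1/6)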
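
Note that count(DP) counts data points within each nation, whereas the question asks for counts of studies. The same tagging idea can be extended to studies. The following is only a sketch, not part of the original answer: it assumes the example data above are in memory, that Year is constant within each study (as it is in the example), and the names pairtag, nstudies, nationtag and studytag are made up for illustration.

```stata
* Assumes the example data above are in memory.
* pairtag, nstudies, nationtag and studytag are illustrative names.

* Tag one observation per distinct Study-Nation pair
egen pairtag = tag(Study Nation)

* Number of distinct studies in each nation
egen nstudies = total(pairtag), by(Nation)

* Show each nation once with its number of studies
egen nationtag = tag(Nation)
list Nation nstudies if nationtag, noobs

* Tag one observation per study for study-level tables, graphs and statistics
egen studytag = tag(Study)
tabulate Year if studytag
histogram Year if studytag, discrete frequency
tabstat Year if studytag, statistics(n mean sd variance)
```

With the example data this should give 4 studies for Brazil, 3 for Canada and 1 each for France and Italy, matching the table in the question. Restricting any command with if studytag is one way to summarize other study-level variables without changing the dataset; collapse would be an alternative if a reduced dataset is wanted.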
