I have a dataset composed of research studies. Some of the studies contain multiple data points (DPs). The data are structured so that each row is a separate data point, and a separate variable identifies the research article (study) each row belongs to.
I need to obtain summary statistics at the level of the research studies, not the DPs. In other words, I need each row to represent one research study, with the DPs collapsed into counts.
I have tried the code below using `contract`. It works with the `list` command. However, I also need summary statistics, and I would like to get summaries for multiple variables and combine them into one table once the data are organized.
contract study nation
drop _freq study
contract nation
list
EXAMPLE:
Raw Data:
Study | DP | Year | Nation |
---|---|---|---|
1 | 1 | 2005 | Brazil |
1 | 2 | 2005 | Brazil |
1 | 3 | 2005 | Brazil |
1 | 4 | 2005 | France |
2 | 5 | 2006 | Brazil |
2 | 6 | 2006 | Italy |
3 | 7 | 2010 | Brazil |
3 | 8 | 2010 | Canada |
4 | 9 | 2011 | Canada |
5 | 10 | 2015 | Brazil |
6 | 11 | 2015 | Canada |
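
For anyone who wants to reproduce this, the raw data above can be entered roughly as follows. This is only a minimal sketch: the lowercase variable names (`study`, `dp`, `year`, `nation`) and the `str6` storage type for `nation` are assumptions chosen to match the names used in the `contract` code, and my real dataset has more variables than this.

```stata
* Example data only: one row per data point (DP)
* Variable names and the str6 type are assumptions for illustration
clear
input study dp year str6 nation
1  1 2005 "Brazil"
1  2 2005 "Brazil"
1  3 2005 "Brazil"
1  4 2005 "France"
2  5 2006 "Brazil"
2  6 2006 "Italy"
3  7 2010 "Brazil"
3  8 2010 "Canada"
4  9 2011 "Canada"
5 10 2015 "Brazil"
6 11 2015 "Canada"
end

* Check that the data match the table above
list, sepby(study)
```
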
What I need:
Year | f (of studies) |
---|---|
2005 | 1 |
2006 | 1 |
2010 | 1 |
2011 | 1 |
2015 | 2 |
I also need a histogram of the table above.
Nation | f (of studies) |
---|---|
Brazil | 4 |
Canada | 3 |
France | 1 |
Italy | 1 |
I have more variables that need the same treatment, and they will need more than frequencies (e.g. mean, sd, var). So any solution needs to work for summarizing variables at the study level as well, not just for counting.