I'm trying to sum the values across multiple columns (variables) for each row and store the sums as a new column. However, my data (a Stata file) has more than 500 variables, each named with an abbreviation and without any identifiable prefix pattern (also, the first couple of variables are names and IDs), so it is not feasible to type out a varlist inside the rowtotal() function or to use the wildcard method rowtotal(prefix*).
I'm wondering if there's a way to subset Stata data by a range of columns and apply rowtotal() over those columns, in a way like R (e.g., df[, 3:500]), since I know the range of columns that I want to sum over. My data look something like this:
year state aalco aata acdt acpeu
2000 usa 0 0 -1 0
2001 usa 0 0 -1 0
2002 usa 0 0 -1 0
2003 usa 0 0 -1 0
2004 usa 0 0 -1 0
2005 usa 0 0 -1 0
2006 usa 0 0 -1 0
2007 usa 0 0 -1 0
2008 usa 0 0 -1 0
2009 usa 0 0 -1 0
2010 usa 0 0 -1 0
2011 usa 0 0 -1 0
2012 usa 0 0 -1 0
I've attached a link to my data here and hope that someone can give me some hints: https://www.dropbox.com/s/fy5zpmf2tdlf3wx/dyadic_format3.dta?dl=1
I've referenced these posts (here, and here), but they don't quite solve my puzzle.
ds to remove those string variables and convert the "-1" values of the rest of the numeric variables to "0", getting a bit confused with varlist as it still includes the first couple of string variables, therefore causing a type mismatch error. – Chris T.
ds as its notional author. You can combine specifications such as numeric variables that aren't named identifiers. findname is my second go with more functionality and I think better syntax, but I would say that. – Nick Cox
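The idea raised in the comments (select only the numeric variables with ds, leave out the identifiers, then sum) might be sketched as follows. This is a minimal illustration rather than a definitive solution: it rebuilds a two-row copy of the sample shown in the question, assumes year is the only numeric identifier to exclude, and the names total and total2 are made up for the example.

```stata
* Minimal sketch, assuming the layout of the sample in the question
clear
input int year str3 state byte(aalco aata acdt acpeu)
2000 "usa" 0 0 -1 0
2001 "usa" 0 0 -1 0
end

* Collect the numeric variables only, so string variables such as state
* never reach rowtotal() and cannot trigger a type mismatch
ds, has(type numeric)
local numvars `r(varlist)'

* Remove numeric identifiers (assumed here to be just year) from the list
local exclude year
local sumvars : list numvars - exclude

* Convert -1 to 0 in the variables to be summed
foreach v of local sumvars {
    replace `v' = 0 if `v' == -1
}

* Row-wise sum over the selected variables (total is a hypothetical name)
egen total = rowtotal(`sumvars')

* Alternative: a range varlist picks every variable stored between
* aalco and acpeu in dataset order, similar in spirit to df[, 3:500] in R
egen total2 = rowtotal(aalco-acpeu)
```

The range form aalco-acpeu depends on the order in which variables are stored, so it only works when everything between the two endpoints is numeric; the ds route does not rely on ordering.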