I have a large text dataset in Stata that lists information about different research studies (the data are in wide form, one row per study). Originally, I had a dataset with one variable per study (conditions_l) that listed the disease condition studied (cardiovascular disease, lymphoma, etc.). I went through and coded each condition into different categories; for example, the variable code_c represents cancer.
Example code:
gen code_c=0
replace code_c=code_c+1 if regexm(conditions_l, "leukemia")
Now I have a dataset with multiple condition variables per research study (instead of just conditions_l, I have conditions_l1, conditions_l2, etc., through conditions_l212). I want to use a loop to run this on all of the condition variables but haven't been successful so far. So, if any of the condition variables for a given research study contains "lymphoma", I want to replace code_c with code_c+1. How might I change the code to run across multiple variables?
regexm() is a function, not a command. – Nick Cox
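
One possible approach, as a minimal sketch rather than a definitive answer: loop over the numeric suffixes with forvalues and apply the same replace to each condition variable. The toy data and the local macro nvars below are hypothetical, introduced only so the example runs on its own; with the real dataset, skip the clear/input lines and set nvars to 212. It assumes all the condition variables are strings.

```stata
* Hypothetical toy data (3 condition variables instead of 212).
* WARNING: -clear- drops the data in memory; run this in a fresh session.
clear
input str30 conditions_l1 str30 conditions_l2 str30 conditions_l3
"lymphoma" "cardiovascular disease" ""
"asthma" "" ""
"hodgkin lymphoma" "leukemia" "lymphoma"
end

* Number of condition variables; set to 212 for the real dataset.
local nvars 3

* Start at 0, then add 1 for each condition variable that matches.
gen code_c = 0
forvalues i = 1/`nvars' {
    replace code_c = code_c + 1 if regexm(conditions_l`i', "lymphoma")
}

list
```

Note that this adds 1 for every condition variable that matches, so the third toy study ends with code_c equal to 2. If a 0/1 indicator is wanted instead, using replace code_c = 1 with the same if condition inside the loop would give that. regexm() is case-sensitive, so "Lymphoma" would not match unless the variable is first passed through lower().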