
Suppose I have a dataset of variables with the following names (note the stub of x and hm):

x9, xdog, x_99, hma8j, hm40, hm0

I want to develop a programmatic way to provide a list of variable names (which may contain wildcards) and then loop through each variable name to recode all values less than 0 with a missing value (.).

In practice I have many columns and only want to recode some of them. I do not want to use column index or ranges because I do not know them, since my data are large.

My approach involves the following steps:

  1. Create a local macro named myvars containing the variable names with wildcards

    local myvars x* hm*
    
  2. Expand the strings in the variable list to contain the full variable name strings (this should produce the original variable names):

    syntax 'myvars'
    
  3. Loop through list of variable names to set values to missing:

    foreach x of local 'myvars' { 
        replace 'x' = . if 'x' < 0
    }
    

However, I can't figure out how to include the wildcards in the for loop. The above code does not work and produces invalid syntax errors.

I found the following threads on Statalist useful, but they do not provide a solution and the use of stubs does not seem efficient:

Can anyone help me?

2 Answers

3
votes
* "of varlist" expands the wildcards to the matching variable names
foreach x of varlist x* hm* {
    replace `x' = . if `x' < 0
}

from here:

http://www.cpc.unc.edu/research/tools/data_analysis/statatutorial/labor_saving/loops
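To tie this back to the question, here is a minimal sketch of the whole workflow on a toy dataset. The data values and the extra variable `other` are made up for illustration; only the variable names with the x and hm stubs come from the question. The wildcard patterns are kept in a local macro, and `foreach ... of varlist` expands them after the macro is substituted.

```stata
* Toy data: variable names from the question, values invented for illustration
clear
input x9 xdog x_99 hma8j hm40 hm0 other
-1  2 -3  4 -5  6 -7
 1 -2  3 -4  5 -6  7
end

* Keep the wildcard patterns in a local macro
local myvars x* hm*

* The macro is substituted first, then "of varlist" expands x* and hm*
foreach v of varlist `myvars' {
    replace `v' = . if `v' < 0
}

list
```

Variables that match neither pattern (here the hypothetical `other`) are left untouched. This assumes every matched variable is numeric; `replace` with `.` would fail on a string variable.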

2
votes

@timat's answer gives a good basic solution, but does not explain what you are doing wrong.

It appears that you are getting confused on several levels:

How to reference local macros

Open the reference with a left single quote (the backtick, `) and close it with a right single quote ('); do not use two right single quotes:

. local foo = 42

. di `foo'
42
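As a small sketch of the same point using the macro from the question: the contents here are text rather than a number, so the reference is wrapped in double quotes for `display`.

```stata
* Store the wildcard patterns as plain text (no = sign, so nothing is evaluated)
local myvars x* hm*

* Backtick opens the reference, right single quote closes it
display "`myvars'"
```

Without the double quotes, `display` would try to evaluate the substituted text as an expression and complain.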

How best to unpack a wildcard variable list

syntax will do this, but as foreach will do it directly, syntax is redundant for your problem. But even then your syntax example is quite wrong in several ways. As its use is unnecessary, I won't expand on that.
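If you do want the expanded names stored in a macro, for example to reuse the list in several places, one option not mentioned above is `unab`, which unabbreviates a variable list. This is a sketch on a made-up one-row dataset; the macro name `expanded` is arbitrary.

```stata
* Toy data with the variable names from the question (values are invented)
clear
input x9 xdog x_99 hma8j hm40 hm0 other
1 1 1 1 1 1 1
end

* Expand the wildcards and store the full names in the local macro expanded
unab expanded : x* hm*
display "`expanded'"

* The expanded list can then be looped over by macro name
foreach v of local expanded {
    replace `v' = . if `v' < 0
}
```

For a single loop this is more machinery than `foreach ... of varlist`, which remains the simpler choice.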

The difference between a macro name and its contents

foreach x of local `myvars' {

(note the corrected punctuation) is almost never what you need. It will usually be

foreach x of local myvars {
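One caveat worth spelling out with a short sketch: `of local` takes the macro name and loops over the words stored in it exactly as they are, so wildcards are not expanded. The counter `n` below is only there for illustration.

```stata
local myvars x* hm*
local n = 0

* Each element is a literal word from the macro, wildcards included
foreach w of local myvars {
    display "`w'"
    local ++n
}

display "iterations: `n'"
```

The loop runs twice, once for each pattern, not once per matching variable. To loop over the variables themselves, use `foreach ... of varlist` with the macro's contents, or expand the list before looping.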

Column thinking

Stata is not a spreadsheet program. What you call columns are variables in Stata. "Columns" can be your private word, and no harm done, but Stata commands refer to variables by name, and column indexes are not supported directly.

How to find answers

You are (I guess) Googling for answers rather than reading the Stata documentation. There's a lot of the latter and it's hard for beginners to know where to look, but the basic help on foreach and its associated explanations is more relevant than the posts you cite. They are both good (it turns out I wrote both...) but some distance from your problem, and it's hardly surprising that you did not find an answer to your question in either. If you want to master basic Stata, there's no real substitute for reading at least the first half of the User's Guide.