I want to store a list of variables in a macro and then call that macro inside a mi() statement. The original application is for a programme that uses data I cannot bring online for secrecy reasons, and which will include the following statement:

generate u = cond(mi(`vars'),., runiform(0,1))

The issue is that mi() requires comma-separated variable names, but vars is delimited by spaces.

I use the auto dataset and mark to illustrate my problem:

sysuse auto
local myvars foreign price
mark missing if mi(`myvars')

In this example, mi() expects arguments separated by commas, so Stata stops and complains that it cannot find a variable named foreignprice. Is there a utility function that will insert commas between the macro elements?

1 Answer

A direct answer to the question as set is to use the macro extended function subinstr to change spaces to commas:

sysuse auto
local myvars foreign price
local myvars : subinstr local myvars " " ",", all 
mark missing if mi(`myvars')
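
One caveat is that the substitution assumes the macro holds plain variable names separated by single spaces. As a sketch only, assuming the list might contain extra spaces or a range such as price-weight (the names commavars and anymiss are made up for illustration), unab can expand and tidy the list before the commas go in:

sysuse auto, clear
local myvars foreign   price-weight
* expand ranges and wildcards into full names separated by single spaces
unab myvars : `myvars'
* swap each space for a comma
local commavars : subinstr local myvars " " ",", all
display "`commavars'"
generate byte anymiss = mi(`commavars')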

If the aim is to create a marker variable that flags observations with any missing values on specified variables, there are alternatives, most of which need no fiddling with separators in a list. This doesn't purport to be a complete set.

A1.

 regress foreign price 
 gen missing = !e(sample) 

A2.

 egen missing = rowmiss(foreign price) 
 replace missing = missing > 0 

A3.

 local myvars foreign price 
 local myvars : subinstr local myvars " " ",", all 
 gen missing = missing(`myvars') 

A4.

 gen missing = 0 
 quietly foreach v in foreign price { 
     replace missing = 1 if missing(`v') 
 } 

A5.

 mark missing 
 markout missing foreign price 
 replace missing = !missing 
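
 Note that foreign and price have no missing values in the auto data, so each marker above comes out as all zeros there. A rough check that the approaches agree, assuming rep78 (which does have missing values) is swapped in and using the made-up names m2, m4 and m5 for the A2, A4 and A5 markers:

 sysuse auto, clear
 * A2: count missings across the row, then reduce to 0/1
 egen byte m2 = rowmiss(rep78 price)
 replace m2 = m2 > 0
 * A4: loop over the variables
 generate byte m4 = 0
 quietly foreach v in rep78 price {
     replace m4 = 1 if missing(`v')
 }
 * A5: mark and markout, then reverse the sense
 mark m5
 markout m5 rep78 price
 replace m5 = !m5
 * stops with an error if any pair of markers disagrees
 assert m2 == m4 & m4 == m5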

EDIT: The edited question refers to this statement within a program:

 generate u = cond(mi(`vars'),., runiform(0,1))

I wouldn't do that, even with the macro edited to include commas, although the issue is largely one of personal taste. I would use something more like this:

 marksample touse 
 markout `touse' `vars' 
 generate u = runiform(0,1) if `touse' 

It's likely that the indicator variable so produced is needed, or at least useful, somewhere else in the same program.
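
 For context, a minimal sketch of how those lines might sit inside a program. The program name drawu and its options vars() and generate() are hypothetical, not taken from the question, and version 14 is assumed because runiform(a,b) with arguments needs it:

 program define drawu
     version 14
     * vars() holds the variables to check; generate() names the new variable
     syntax [if] [in], Vars(varlist) GENerate(name)
     confirm new variable `generate'
     * touse is 1 for observations satisfying any if/in, 0 otherwise
     marksample touse
     * set touse to 0 wherever any variable in vars() is missing
     markout `touse' `vars'
     generate double `generate' = runiform(0,1) if `touse'
 end

 sysuse auto, clear
 drawu, vars(rep78 price) generate(u)
 count if missing(u)

 The space-separated list goes straight to markout, so no commas are needed anywhere.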