1
votes

I have a dataset with missing values coded "missing". How do I recode these so Stata recognizes them as missing values? When I have numeric missing values, I have been using e.g.:

  mvdecode _all, mv(99=. )

However, when I run this with a character in it, e.g.:

  mvdecode _all, mv("missing"=. )

I get the error "missing is not a valid numlist".

2
Note that "string", not "character", is really the right term here. - Nick Cox

2 Answers

4
votes

mvdecode is for numeric variables only: the banner in the help is "Change numeric values to missing values" (emphasis added). So the error message should make sense: the string "missing" is certainly not a numeric value, so Stata stops you there. It makes no sense to say to Stata that numeric values "missing" should be changed to system missing, as you requested.

As for what you should do, that depends on what you mean in Stata terms by coded "missing".
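If you are not sure which case applies, something like the following may help tell them apart. It uses the auto dataset shipped with Stata purely as a stand-in for your own data, and the local macro names are my own invention.

```stata
* Stand-in data: replace with your own dataset
sysuse auto, clear

* String variables: here "missing" would be a literal string value
ds, has(type string)
local strvars `r(varlist)'

* Numeric variables with a value label attached: here "missing"
* may only be the label text shown for some numeric code
ds, has(vallabel)
local labvars `r(varlist)'

display "string variables:        `strvars'"
display "value-labelled variables: `labvars'"
```

In the auto data this should report make as the only string variable and foreign as the only value-labelled one.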

If you are referring to string variables with literal values "missing" which should just be replaced by the empty string "", then that would be a loop over all string variables:

  ds, has(type string)

  quietly foreach v in `r(varlist)' { 
      replace `v' = "" if `v' == "missing"
  }

If you are referring to numeric variables for which there is a value label "missing", then you need to find out the corresponding numeric value and use that in your call to mvdecode. Use label list to look up the association between values and value labels.
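As a minimal sketch of that second case, assume a made-up variable rating with a made-up value label ratinglbl in which 99 carries the label "missing". Neither name nor the code 99 comes from the question, so substitute whatever label list shows for your data.

```stata
clear
* Hypothetical toy data: 99 is the code labelled "missing"
input byte rating
1
2
99
3
99
end
label define ratinglbl 1 "low" 2 "medium" 3 "high" 99 "missing"
label values rating ratinglbl

* Look up which numeric value is associated with "missing"
label list ratinglbl

* Recode that numeric value to system missing
mvdecode rating, mv(99=.)

* Two observations should now count as missing
count if missing(rating)
```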

2
votes

mvdecode works with numlists, not strings (clearly stated in help mvdecode). The missing value for strings in Stata is denoted by "".

  clear
  set more off

  *----- example dataset -----

  sysuse auto
  keep make mpg
  keep in 1/5

  replace make = "missing" in 2

  list

  *----- what you want -----

  ds, has(type string)

  foreach var in `r(varlist)' {
      replace `var' = "" if `var' == "missing"
  }

  list
  list if missing(make)

You can verify that Stata now recognizes one missing value for the string variable using the missing() function.
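As a further sketch, missing() accepts both string and numeric arguments, so the same check can be used across variable types. The toy variables name and score below are invented for illustration.

```stata
clear
* Hypothetical toy data: one string coded "missing", one numeric system missing
input str10 name float score
"alpha"   1
"missing" 2
"gamma"   .
end

* Turn the literal string "missing" into the empty string
replace name = "" if name == "missing"

* One missing string value, one missing numeric value
count if missing(name)
count if missing(score)

* missing() with several arguments is true if any argument is missing
count if missing(name, score)
```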