Consider some examples.
Variable is numeric but has labels attached. Observations with a label of "n.a." have been replaced with missings.
clear
input ///
x
1
1
2
3
3
end
label define lblx 1 "a" 2 "b" 3 "n.a."
label values x lblx
list
list, nolabel
foreach var of varlist _all {
replace `var' = .a if `var' == 3
}
list, nolabel
count if missing(x)
Variable is string type. Strings were replaced with other strings; the ".a"s are not really missings, as far as Stata is concerned. It's just text with no special meaning.
clear
input ///
str5 x
a
a
b
n.a.
n.a.
end
list
foreach var of varlist _all {
replace `var' = ".a" if `var' == "n.a."
}
list
The following doesn't work and maybe is your case. You have a numeric variable (again with labels) and you are asking Stata to check for string characters. Thus, there is a type mismatch.
clear
input ///
x
1
1
2
3
3
end
label define lblx 1 "a" 2 "b" 3 "n.a."
label values x lblx
list
foreach var of varlist _all {
replace `var' = ".a" if `var' == "n.a."
}
As it is, your code is addressing the variables as if they were string type (because of the quotes used in the loop). If you want to use extended missing values (.a
), then the variable must be numeric. System missings and extended missings only apply to them.
The only missing for string types is a blank (""
).
It seems you have numeric variables due to the error you report. If your variables were all string type, then you wouldn't get a type mismatch error. Nevertheless, you mention
some "n.a." , which I take to be value labels. Underlying the value label, is some numeric value. You can see them running list, nolabel
.
If that is the case, you can replace
the numeric values corresponding to value labels equal to "n.a.", with something like:
clear
*----- example data -----
input ///
x y
1 1
1 4
2 4
3 4
3 2
end
label define lblx 1 "a" 2 "b" 3 "n.a."
label values x lblx
label define lbly 1 "a" 2 "b" 4 "n.a."
label values y lbly
list
list, nolabel
*----- what you want -----
foreach var of varlist _all {
replace `var' = .a if `var' == "n.a.":`:value label `var''
}
// check
list, nolabel
count if missing(x)
count if missing(y)
Notice that for variable x
, the value label n.a.
is mapped to the value 3, while for variable y
, it is mapped to value 4. Thus, for one variable you want to replace values of 3, and for the other, values of 4. The code will take care of that automatically. Additionally, Stata now recognizes the replaced values as missings.
See help labels
, help missing
and https://stackoverflow.com/a/25942520/2077064, which explains more on how the condition within the replace
is working.
If you need to select one specific type of variables, Aspen Chen has already mentioned ds
. Other options can be found at
The Stata Journal (2010)
10, Number 2, pp. 281–296,
Speaking Stata: Finding variables,
by Nick Cox. (Freely available on the web.)
".a"
It does attach special meaning to the numeric value.a
. Although the thread is about how to do it, replacing a string value with an easily explained meaning with another that needs to be explained is not an obviously helpful step. – Nick Cox