0
votes

In Stata, I have a dataset like this:

obs    v2    v3    v4    v5    v6
1      .     3     .     .     1
2      2     .     .     4     5
3      .     7     .     .     .
4      1     .     1     .     4

For each row, how can I find all of the variables that hold a non-missing value (anything other than ".")?

For example, I want to find that:

obs 1 has non-empty values for v3 and v6.

obs 2 has non-empty values for v2, v5, and v6.

obs 3 has non-empty values for v3.

obs 4 has non-empty values for v2, v4, and v6.

Here is pseudocode of one way that is not efficient at all (I want to find a better, faster way):

  1. Create new variables, v2a ... v6a. v2a will take the string value "v2" if v2 is non-missing in that row and an empty string otherwise. Do this for all 'a' variables.
  2. Concatenate all the a variables.

I don't need a new variable per se. If the result were just printed to the screen, that would be great too.

1 Answer

2
votes

This code is not very elegant, but it does the job.

clear
input obs v2 v3 v4 v5 v6
1 . 3 . . 1
2 2 . . 4 5
3 . 7 . . .
4 1 . 1 . 4
end

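* Start from an empty string, then append each variable's name
* in the rows where that variable is non-missing
gen strL nonmiss = ""
foreach var of varlist v2-v6 {
    replace nonmiss = nonmiss + " " + "`var'" if !missing(`var')
}
* Remove the leading space left by the first append
* (strtrim() is the Stata 14+ name; older versions call it trim())
replace nonmiss = strtrim(nonmiss)
list nonmiss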
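
Since the question says screen output alone would be enough, here is a minimal sketch that prints the names row by row without creating any variable. It assumes the same example data and the same variable range v2-v6; the local macro name names is only illustrative. It loops over observations one at a time, so it is likely to be much slower than the replace-based approach above on a large dataset.

```stata
clear
input obs v2 v3 v4 v5 v6
1 . 3 . . 1
2 2 . . 4 5
3 . 7 . . .
4 1 . 1 . 4
end

* Loop over observations; _N is the number of rows in memory
forvalues i = 1/`=_N' {
    * Reset the list of names for this row
    local names
    foreach var of varlist v2-v6 {
        * Explicit subscripting tests the value in row i only
        if !missing(`var'[`i']) local names `names' `var'
    }
    display "obs " obs[`i'] ": `names'"
}
```

Nothing is left behind in the dataset afterwards, which may be preferable when the list is only needed for a quick look.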