3
votes

I have data with IDs which may or may not have all values present. I want to delete ONLY the observations with no data in them; if there are observations with even one value, I want to retain them. For example, if my data set is:

ID val1 val2 val3 val4
1 23 . 24 75
2 . . . .
3 45 45 70 9

I want to drop only ID 2 as it is the only one with no data -- just an ID.

I have tried Statalist and Google but couldn't find anything relevant.

3
A simple way is to use drop if missing(val1-val4) -- Aspen Chen
Sorry Aspen. My example could have been clearer -- the variable names are not linear. -- user2830684
missing() returns 1 if ANY of the arguments evaluates to missing. -- dimitriy
For completeness, note that "rows" and "records" are not Stata-speak; the Stata term is "observations". -- Nick Cox
Thanks. I edited my question to fix this. -- user2830684
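
The point about missing() in the comments can be checked on the question's example data. The following is a minimal sketch, assuming the four variables are named val1 to val4 as in the example; with other names, the condition would need one missing() term per variable.

```stata
clear
input ID val1 val2 val3 val4
1 23 . 24 75
2 . . . .
3 45 45 70 9
end

* missing() with several arguments is 1 if ANY argument is missing,
* so this counts both ID 1 and ID 2
count if missing(val1, val2, val3, val4)

* Requiring EVERY variable to be missing singles out ID 2 only
drop if missing(val1) & missing(val2) & missing(val3) & missing(val4)
list
```

Note also that inside an expression val1-val4 is read as a subtraction, which is missing whenever either operand is missing, so it is not a test that all four values are missing.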

3 Answers

7
votes

This will also work with strings as long as they are empty:

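* Variable names are case-sensitive; the question's identifier is ID
ds ID, not
* Count the nonmissing values in each observation; strok allows string variables
egen num_nonmiss = rownonmiss(`r(varlist)'), strok
drop if num_nonmiss == 0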

This gets a list of variables that are not the id and drops any observations that only have the id.
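
To illustrate the string case, here is a small sketch on made-up data. The variable name and its values are hypothetical and serve only to show that an empty string counts as missing when strok is specified.

```stata
clear
input ID val1 str5 name
1 23 "a"
2 . ""
3 . "b"
end

* Every variable except the identifier
ds ID, not

* strok lets rownonmiss() accept the string variable name;
* an empty string is treated as missing
egen num_nonmiss = rownonmiss(`r(varlist)'), strok
list

* Only ID 2 has nothing but its identifier
drop if num_nonmiss == 0
list
```

ID 3 is kept because name is nonempty, even though val1 is missing.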

4
votes

Brian Albert Monroe is quite correct that anyone using dropmiss (SJ) needs to install it first. As there is interest in varying ways of solving this problem, I will add another.

 foreach v of var val* { 
     qui count if missing(`v') 
     if r(N) == _N local todrop `todrop' `v' 
 }
 if "`todrop'" != "" drop `todrop' 

Although it should be a comment under Brian's answer, I will add a comment here as (a) this format is better suited to showing code and (b) the comment follows from my code above. I agree that unab is a useful command and have often commended it in public. Here, however, it is unnecessary, as Brian's loops could easily start something like

 foreach v of var * { 
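
As a sketch of that equivalence, the two loops below visit the same variables. The local macro name allvars and the toy data are assumptions made for illustration only.

```stata
clear
input ID val1 val2
1 23 .
2 . .
end

* Expand the wildcard into a local macro first, then loop over the macro
unab allvars : *
foreach v of local allvars {
    display "`v'"
}

* Let foreach expand the variable list itself
foreach v of var * {
    display "`v'"
}
```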

UPDATE September 2015: See http://www.statalist.org/forums/forum/general-stata-discussion/general/1308777-missings-now-available-from-ssc-new-program-for-managing-missings for information on missings, considered by the author of both to be an improvement on dropmiss. The syntax to drop observations if and only if all values are missing is missings dropobs.
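
A possible usage sketch follows, assuming missings has been installed from SSC and the variables of interest match val*. The force option is included on the assumption that it is needed when the data in memory have unsaved changes; help missings has the exact rules.

```stata
* One-time installation from SSC (needs an internet connection)
ssc install missings

clear
input ID val1 val2 val3 val4
1 23 . 24 75
2 . . . .
3 45 45 70 9
end

* Restrict the check to val*: ID is never missing, so including it
* would mean that no observation is missing on all variables
missings dropobs val*, force
list
```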

0
votes

Here is another way to do it, which shows how flexible local macros are and needs nothing extra installed in Stata. I rarely see code that uses locals to store commands or logical conditions, though it is often very useful.

    // Loop through all variables to build a useful local
    foreach vname of varlist _all {    

            // We don't want to include ID in our drop condition, so don't execute the remaining code if our loop is currently on ID
            if "`vname'" == "ID" continue  

            // This local stores all the variable names except 'ID' and a logical condition that checks if it is missing
            local dropper "`dropper' `vname' ==. &"     
    }

    // Let's see all the observations which have missing data for all variables except for ID
    // The '1==1' bit is a condition to deal with the last '&' in the `dropper' local, it is of course true.

    list if `dropper' 1==1

    // Now let's drop those observations
    drop if `dropper' 1==1

    // Now check they're all gone
    list if `dropper' 1==1

    // They are.

Now dropmiss may be convenient once you've downloaded and installed it, but if you are writing a do-file to be used by someone else, your code won't work on their machine unless they also have dropmiss installed.

With this approach, if you remove the lines of comments and the two unnecessary list commands, this is a fairly sparse 5 lines of code which will run with Stata out of the box.
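
For reference, a condensed version might look like the sketch below, run on the question's example data. One deliberate change, which is an assumption of this sketch rather than part of the code above: it tests missing(`vname') instead of comparing with ==. so that extended missing values (.a to .z) and empty strings are also treated as missing.

```stata
clear
input ID val1 val2 val3 val4
1 23 . 24 75
2 . . . .
3 45 45 70 9
end

* Build "missing(val1) & missing(val2) & ... &" for every variable but ID
foreach vname of varlist _all {
    if "`vname'" == "ID" continue
    local dropper "`dropper' missing(`vname') &"
}
* The trailing 1==1 closes off the final &
drop if `dropper' 1==1
list
```

If the code is run more than once in the same session or do-file, the local dropper should be cleared first (local dropper with nothing after it), because otherwise the condition is appended to its previous contents.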