Stata: Delete all but one observation with the same value for a variable

Question

I work with a dataset that looks like this:

     name    var1  ...
1    a       1
2    a       1
3    a       1
4    a       2
5    a       2
6    a       3
7    a       1
8    a       1
9    b       1
10   b       1
11   b       2
12   b       2
13   b       3
14   b       3
15   b       3

My problem is that I want to drop all observations with duplicated name/var1 combinations, but only if the duplicates are adjacent (that is, I want to drop observations 2, 3, 5, 8, 10, 12, 14, and 15).

My first thought was to write a while loop that compares var1 for observation i with var1 for observation i+1 and drops one of them if the values are equal, but I can't get it to work in Stata.

Is there an (easy) way to do this?

Nick Cox · Accepted Answer · 2014-07-04T17:28:45

You want to drop observations that are identical to the previous observation on both variables:

drop if name == name[_n-1] & var1 == var1[_n-1]

Note that this is a loop, just a tacit loop, as Stata executes in observation order, comparing the 2nd observation with the 1st, and so on.
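To check this against the example in the question, here is a minimal sketch that rebuilds the 15 observations with input and applies the same drop command. The variable obs is a helper added here only to record the original row numbers; it is not part of the original data. The sketch assumes the dataset is already sorted in the order you care about, since the comparison is purely with the observation directly above.

```stata
* Rebuild the example data from the question (values copied from its listing)
clear
input str1 name byte var1
a 1
a 1
a 1
a 2
a 2
a 3
a 1
a 1
b 1
b 1
b 2
b 2
b 3
b 3
b 3
end

* Helper variable (an assumption for illustration): remember the original row number
generate obs = _n

* Drop each observation that matches the one directly above it on both variables.
* For the first observation, name[_n-1] and var1[_n-1] are missing, so it is never dropped.
drop if name == name[_n-1] & var1 == var1[_n-1]

* Show what is left
list obs name var1, noobs
```

This should leave observations 1, 4, 6, 7, 9, 11, and 13, which matches the set the question asks to keep. Observation 7 survives even though (a, 1) appeared earlier, because its immediate predecessor is (a, 3).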
