Let's say I have the following data:
id disease
1 0
1 1
1 0
2 0
2 1
3 0
4 0
4 0
I would like to remove the duplicate observations in Stata so that each id appears only once. For example, the desired result is:
id disease
1 1
2 1
3 0
4 0
For group id = 1, keep observation 2
For group id = 2, keep observation 2
For group id = 3, keep observation 1 (because it has only one observation)
For group id = 4, keep observation 1 (or any one of its observations)
I am trying the Stata duplicates command:
duplicates tag id if disease==0, generate(info)
drop if info==1
but it does not give the result I need.
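One possible way to express the rule described above (one row per id, preferring a row with disease == 1 when the group has one) is to sort within each id and keep the last row, rather than using duplicates. This is only a sketch, and it assumes disease is coded 0/1 with no missing values, since missing values would sort last and be kept instead.

```stata
* Rebuild the example data from the question
clear
input id disease
1 0
1 1
1 0
2 0
2 1
3 0
4 0
4 0
end

* Within each id, sort by disease so that any disease == 1 row comes last,
* then keep only that last row.
* Assumption: disease is 0/1 with no missing values.
bysort id (disease): keep if _n == _N

list, noobs
```

On the example data this leaves one row per id: disease is 1 for ids 1 and 2, and 0 for ids 3 and 4. The duplicates attempt behaves differently because the if qualifier restricts tagging to the disease == 0 rows: both rows of id 4 get info == 1 and are dropped, while id 2 keeps both of its rows.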