I am trying to update some missing values in a dataset with values from another.

Here is an example in Stata 14.2:

sysuse auto, clear   

// save in order to merge below
save auto, replace 

// create some missing to update
replace length = . if length < 175

// drop one observation so the two datasets are not identical, as in my real data
drop if _n == _N

merge 1:1 make using auto, nogen keep(master match_update) update

The code above keeps only the updated observations (26 of them). The result is exactly the same with keep(match_update) alone.

Why is Stata not keeping all observations in the master dataset?

Note that leaving match_update out of keep() does not help either, as that drops all observations.

My current workaround is to rename the original variables, merge everything, and then replace each original wherever it is missing. However, this defeats the point of the update option, and it is cumbersome when many variables need updating.
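For reference, the workaround described above might look like the sketch below for the single variable in this example. The name length_master is a placeholder chosen for illustration, and keepusing(length) is assumed so that only the one variable is pulled in from the using file.

```stata
sysuse auto, clear
save auto, replace                  // full copy to merge from

replace length = . if length < 175  // create some missing values
drop if _n == _N                    // make the two datasets differ

// hypothetical name: set the master copy aside so it is not overwritten
rename length length_master

// plain merge (no update): bring in length from the using file
merge 1:1 make using auto, keepusing(length) keep(master match) nogenerate

// fill the gaps by hand, then restore the original variable name
replace length_master = length if missing(length_master)
drop length
rename length_master length
```

With many variables this would need a loop over a varlist, which is the cumbersome part.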

Get rid of keep(master match_update) in the last line and it will work. - user8682794
@PearlySpencer Oh, I see. Thanks. What's the logic of the change? Why doesn't my command work? - luchonacho
With this option you are asking merge to only keep the updated observations in the dataset that match the other and drop everything else. - user8682794
But I'm also telling Stata to keep the master observations. Otherwise, what is the meaning of master inside the keep option? - luchonacho
I just had a better look at it and it seems that keep(master match match_update) does what you want. Personally I always prefer to manually drop / keep observations using _merge as it is more transparent and less error prone. - user8682794

1 Answer

Personally, I always prefer to manually drop / keep observations using the _merge variable as it is more transparent and less error prone.

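In keep(), master refers only to observations that appear in the master dataset but not in the using dataset (_merge == 1), not to every observation that came from the master. Matched observations that needed no update are coded match (_merge == 3), so they have to be requested explicitly. The following therefore does what you want: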

merge 1:1 make using auto, nogenerate keep(master match match_update) update

Result                           # of obs.
-----------------------------------------
not matched                             0

matched                                73
    not updated                        47  
    missing updated                    26  
    nonmissing conflict                 0  
-----------------------------------------
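To see where the 47 missing observations go, one option is to run the merge without keep() and inspect _merge directly. This is only a sketch built on the question's own example; the counts in the comments are the ones reported in the question and in the table above.

```stata
sysuse auto, clear
save auto, replace

replace length = . if length < 175
drop if _n == _N

// no keep(): every _merge category stays in memory
merge 1:1 make using auto, update

tabulate _merge

// what keep(master match_update) retains: codes 1 and 4 (26 observations)
count if inlist(_merge, 1, 4)

// adding match (code 3) recovers every master observation (73)
count if inlist(_merge, 1, 3, 4)
```

No observation has _merge == 1 here, because every make in the master also exists in the using file, so keep(master match_update) ends up identical to keep(match_update).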

You can confirm that this gives the same result as dropping observations manually:

sysuse auto, clear   
save auto, replace

replace length = . if length < 175
drop if _n == _N

// merge with update but no keep(), then drop using-only observations by hand
merge 1:1 make using auto, update

drop if _merge == 2
drop _merge
save m1, replace

sysuse auto, clear   
save auto, replace

replace length = . if length < 175 
drop if _n == _N 

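// same merge, this time letting keep() do the filtering
merge 1:1 make using auto, nogen keep(master match match_update) update
save m2, replace

// compare the data in memory (m2) with m1; r(Nsum) is the number of differences
cf _all using m1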

display r(Nsum)
0