Drop all cases with four or more consecutively missing observations

Question

I am trimming a dataset in Stata, and I want to drop all firms with 4 or more consecutive missing observations in the total assets variable. How could I do that?

The data look like this:

I would like to drop all observations of b, even though there is a total assets value for b in year 2000.

Data examples shown as images are helpful but not nearly as helpful as data that can be copied and pasted. Stata advice that carries over to SO is given at statalist.org/forums/help#stata — Nick Cox

Unknown Unknown · Accepted Answer · 2017-11-24T00:28:17

Added in final edit:

The answer from Nick Cox is correct; this one should be disregarded. The first set of code is not relevant to the restated version of the problem that now appears in the initial post, and the second set suffers from the error pointed out by Nick Cox in his comment below.

===============================

Well, let's assume (since you haven't really described your data) that you have a variable ta that reports total assets, and is sometimes missing, and a variable firmID that identifies each firm, and is never missing. Then

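* Count observations with missing ta per firm (total, not consecutive)
bysort firmID: egen num_miss = total(missing(ta))
* Drop every firm with 4 or more missing values of ta
drop if num_miss >= 4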

might do what you want. The function missing(ta) will be 1 if ta is missing and 0 otherwise, and num_miss will contain the count of how many observations of the current firmID have a missing ta.

Added in response to Nick Cox's comment above:

If we additionally assume that you have a variable year that defines the order of "consecutive" observations, and you want to drop all firms that have a run of 4 or more consecutive observations with missing values, then the following might do what you want. Or it might not - I didn't test it on the sample data you didn't provide.

bysort firmID (year): egen num_run = total(_n>=4 & missing(ta[_n-3]) & missing(ta[_n-2]) & missing(ta[_n-1]) & missing(ta[_n]))
drop if num_run>0
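
For anyone who wants to experiment, here is an untested sketch of one alternative under the same assumptions (variables firmID, year, and ta). It builds the length of each run of missing values with generate and replace under by:, and uses egen only to take the maximum per firm. The toy data and the names run and max_run are made up for illustration, and "consecutive" here means consecutive observations in year order, so gaps in the years themselves are not detected.

```stata
* Hypothetical toy data: firm a has scattered gaps, firm b has a run of 4
clear
input str1 firmID year ta
"a" 2000 10
"a" 2001 .
"a" 2002 12
"a" 2003 .
"a" 2004 14
"b" 2000 20
"b" 2001 .
"b" 2002 .
"b" 2003 .
"b" 2004 .
end

* Start each run counter at 1 where ta is missing, 0 otherwise
bysort firmID (year): gen run = missing(ta)

* replace works down the observations in order, so run[_n-1] is already
* updated: a missing value extends the previous run by one
by firmID: replace run = run[_n-1] + 1 if missing(ta) & _n > 1

* Longest run of missing ta within each firm
by firmID: egen max_run = max(run)

* Drop every observation of firms with a run of 4 or more
drop if max_run >= 4
drop run max_run
```

With this toy data, firm a (longest run 1) should be kept and firm b (longest run 4) should be dropped entirely, including its non-missing year 2000.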
