I have a panel data set covering a 20-year period, in which several companies report different financial outputs (e.g. sales, costs). I have over 100,000 observations.

I now want to eliminate firms that have two or fewer observations in the data set (for example, firm A has output only in 2000 and in no other year).

I used:

by fyear: tabulate companyid

This lets me see the firms with fewer than 3 observations, but how can I automatically drop all of them?

1 Answer

by companyid (fyear), sort: drop if _N<3

This leaves the remaining data sorted by companyid and then fyear, so if you really want it sorted by fyear first, follow it with

sort fyear companyid
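
The one-liner works because, under a by prefix, _N is the number of observations in the current by-group (here, one firm). If you would rather look at the counts before dropping anything, a two-step version along the following lines should be equivalent. This is only a sketch: nobs, firsttag and nvalid are made-up variable names, and sales is assumed to be one of the output variables mentioned in the question.

```stata
* Count rows per firm; under by, _N is the number of observations in the group
bysort companyid (fyear): generate nobs = _N

* Optional: flag one row per firm and tabulate how many firms have each count
egen firsttag = tag(companyid)
tabulate nobs if firsttag

* Drop every firm with fewer than 3 rows, then remove the helper variables
drop if nobs < 3
drop nobs firsttag

* Variant (assumption: "sales" is the variable of interest):
* count only rows where sales is nonmissing, in case firms have
* rows for years with no reported output
* egen nvalid = count(sales), by(companyid)
* drop if nvalid < 3
```

Note that _N counts rows, not nonmissing values, so the commented-out variant may be closer to what you want if a firm can appear in a year with its output variables missing.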