After many years away from Stata I am currently editing code which repeatedly does something like this:

egen min = min(x)
egen max = max(x)
generate xn = (x - min) / (max - min)
drop min max

I want to reduce this code to one line, but neither of the two "natural" ways that come to mind works.

gen x_index = (x - min(x)) / (max(x)- min(x))
egen x_index = (x - min(x)) / (max(x)- min(x))

What pieces of the Stata logic am I missing?

1 Answer

The Stata functions max() and min() require two or more arguments and operate rowwise (within each observation, across the arguments) if given a variable as any one of the arguments. They are documented at e.g. help max().
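
As a small illustration of the rowwise behaviour (the variable names a, b, and hi and the toy data are made up for this sketch), max() compares its arguments within each observation rather than scanning down a variable:

```stata
* Toy data: a, b, and hi are hypothetical names used only for illustration
clear
input a b
1 5
7 2
3 3
end

* max() picks the larger argument within each observation (row)
generate hi = max(a, b)
list a b hi
```

Here hi holds 5, 7, and 3, the larger of a and b in each row, not the overall maximum of either variable.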

The egen functions max() and min() can only be used within egen calls. They can be applied to single variables, but using them to calculate a single maximum or minimum is grossly inefficient, because the one result is stored in every observation of a new variable; that is worthwhile only in the exceptional case where the result really must be held in a variable. They are documented, except for these warnings, at help egen.

Neither approach you consider will work without becoming more roundabout. Consider instead:

* summarize leaves the minimum and maximum in r(min) and r(max)
su x, meanonly
gen x_index = (x - r(min)) / (r(max) - r(min))

In some circumstances it might be more efficient to calculate the range just once:

su x, meanonly
* store the range once instead of recomputing it for every observation
scalar range = r(max) - r(min)
gen x_index = (x - r(min)) / range

In a program it would usually be better to give the scalar a temporary name.
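
A minimal sketch of that idea, assuming a wrapper program is wanted: the program name normalize01, its generate() option, and the local names lo and range are all hypothetical, and tempname is used so the scalars cannot clash with existing variables or scalars.

```stata
* Hypothetical helper: rescale a numeric variable to the interval [0, 1]
program define normalize01
    version 13
    syntax varname(numeric), GENerate(name)

    * tempname supplies scalar names that are guaranteed not to be in use
    tempname lo range

    summarize `varlist', meanonly
    scalar `lo' = r(min)
    scalar `range' = r(max) - r(min)

    generate double `generate' = (`varlist' - `lo') / `range'
end
```

It could then be called as, say, normalize01 x, generate(x_index). The temporary scalars are dropped automatically when the program ends.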

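Within an egen call, an egen function can be called only once: the syntax is egen newvar = fcn(arguments), so egen functions cannot be combined inside a larger expression. That is why your second attempt fails.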