To start, I know my problem is similar to this one (the closest to my question that I have found), but there are some differences, hence this new post.
I have a dataset with an identifier and declarations. A declaration is built as the identifier plus a letter: if the identifier is 123456, the declarations would be "123456A", "123456B" and so on.
I would like to select one observation per identifier: the one whose declaration ends with the last letter, which of course is not always the same letter.
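To make the goal concrete, here is a small made-up sample, as a sketch only. The values other than 123456 are hypothetical, and I am assuming here that both variables are character; the dataset name have matches the code further down.

```sas
/* Hypothetical sample data: variable types and lengths are assumptions */
data have;
  length identifier $6 declaration $7;
  input identifier $ declaration $;
  datalines;
123456 123456A
123456 123456C
123456 123456B
654321 654321B
654321 654321A
;
run;
```

For this sample, the wanted result is one row per identifier: 123456C for 123456 and 654321B for 654321.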
I assume I can do that with a proc sort followed by a second one with nodupkey:
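/* Sort so that the last letter comes first within each identifier */
proc sort data=have out=have2;
  by identifier descending declaration;
run;
/* Keep only the first observation of each identifier */
proc sort data=have2 out=want nodupkey;
  by identifier;
run;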
However, since I have a relatively large dataset (tens of millions of observations), I would like to know whether there is a better method, both better suited and faster. Ideally, it would work in a single step.
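For reference, one variant I can think of replaces the second sort with BY-group processing in a data step. This is only a sketch under the same assumptions as above (dataset have, variables identifier and declaration); have_sorted is a name I made up, and I have not measured whether it is any faster on tens of millions of rows.

```sas
/* Ascending sort: the last letter ends up last within each identifier */
proc sort data=have out=have_sorted;
  by identifier declaration;
run;

/* Keep only the last observation of each identifier BY group */
data want;
  set have_sorted;
  by identifier;
  if last.identifier;
run;
```

This still needs one sort, so it is not a true one-step solution.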
Thanks
identifier do you have? – Joe