1
votes

My input dataset looks as follows:

DRUG Table
DRUG_CD PARENT_DRUG_CD
DRG_1   DRG_5
DRG_2   DRG_5
DRG_3   DRG_4
DRG_4   --
DRG_5   DRG_6
DRG_6   --

I need to build a dataset with only those DRUGS (and all the attributes of this DRUGS) who does not have a parent anymore and an extra column (CHD_DRUG_CD) which contains the lowest level of DRUG_CD.

Resulting DRUG Table
DRUG_CD CHD_DRUG_CD
DRG_6   DRG_1
DRG_6   DRG_2
DRG_6   DRG_5
DRG_4   DRG_3
1

1 Answers

2
votes

Parent-child relationships are relatively common in the literature, and you should be able to find plenty of examples out there. Here's a place to start:

http://support.sas.com/kb/25/968.html

The basic concept is that you need to iterate over the dataset to find the child for each parent. You can do that either by creating your own iterator, using POINT= (as in the paper linked) or a DOW loop, or you can do this using other techniques such as a hash table. Hash iterator is probably the fastest solution, although the most complex to understand and to code. Then when you find your child, you identify which one you found (in an array, for example, as in the paper) and return the loop to the beginning to find the next one (or, exit with none found).