I have a dataset analogous to the simplified table below (let's call it "DS_have"):
SurveyID Participant FavoriteColor FavoriteFood SurveyMonth
S101 G92 Blue Pizza Jan
S102 B34 Blue Cake Feb
S103 Z28 Green Cake Feb
S104 V11 Red Cake Feb
S105 P03 Yellow Pizza Mar
S106 A71 Red Pizza Mar
S107 C48 Green Cake Mar
S108 G92 Blue Cake Apr
...
I'd like to create a set of numeric variables that identify the discrete categories/levels of each variable in the dataset above. The result should look like the following dataset ("DS_want"):
SurveyID Participant FavoriteColor FavoriteFood SurveyMonth ColorLevels FoodLevels ParticipantLevels MonthLevels
S101 G92 Blue Pizza Jan 1 1 1 1
S102 B34 Blue Cake Feb 1 2 2 2
S103 Z28 Green Cake Feb 2 2 3 2
S104 V11 Red Cake Feb 3 2 4 2
S105 P03 Yellow Pizza Mar 4 1 5 3
S106 A71 Red Pizza Mar 3 1 6 3
S107 C48 Green Cake Mar 2 2 7 3
S108 G92 Blue Cake Apr 1 1 1 4
...
Essentially, I want to know what syntax I should use to generate unique numerical values for each "level" or category of variables in the DS_Have dataset. Note that I cannot use conditional if/then statements to create the values in the ":Levels" variables for each category, as the number of levels for some variables is in the thousands.
PROC IML
edit with that tag as that may be easier than base SAS... – Joe