0
votes

I'm new to SAS and would like to get help with the question as follows:

1: Sample table shows as below

Time Color Food label
2020 red  Apple A
2019 red Orange A,B
2018 blue Apple A,B
2017 blue  Orange B  
  1. Logic to return label is:

    when color = 'red' then 'A'
    when color = 'blue' then 'B'
    when food = 'orange' then 'B'
    when food = 'apple' then 'A',
    

since for row 2, we have both red and orange then our label should contains both 'A,B', same as row 3.

The requirement is to print out the label for each combination. I know that we can use CASE WHEN statement to define how is our label should be based on color and food. Here we only have 2 kind of color and 2 different food, but what if we like 7 different color and 10 different food, then we would have 7*10 different combinations. I don't want to list all of those combinations by using case when statement.

Is there any convenient way to return the label? Thanks for any ideas!(prefer to achieve it in PROC SQL, but SAS is also welcome)

1
Do you only have two categories or does this need to be expanded to more categories and groups? Ie only food/colour as well? - Reeza
I would recommend creating formats and then using those formats to create the wanted variable. Is the label for ones iwth red and apple, A, A or A? - Reeza
Why are you labeling some of the records with one letter but some with two? Why not always have one letter for the FOOD type and another letter for the COLOR type? - Tom
Would you mind explaining a little bit more how to create formats? I think label for red and apple should be A, since both columns met the criteria for A(or do you think A,A) would be more easier to achieve? - gig.d

1 Answers

3
votes

This looks like a simple application of formats. So define a format that converts COLOR to a code letter and a second one that converts FOOD to a code letter.

proc format ;
   value color 'red'='A' 'blue'='B';
   value food 'Apple'='A' 'Orange'='B' ;
run;

Then use those to convert the actual values of COLOR and FOOD variables into the labels. Either in a data step:

data want;
   set have ;
   length label $5 ;
   label=catx(',',put(color,color.),put(food,food.));
run;

Or an SQL query:

proc sql ;
create table want as 
  select *
       , catx(',',put(color,color.),put(food,food.)) as label length=5
  from have
;
run;

You do not need to re-create the format if the data changes, only if the list of possible values changes.