1
votes

I need help creating this age group variable. In my data, age is measured to 9 decimal places. I can choose the categories myself; I just picked the quartiles. But I keep getting these errors...

"ERROR 388-185: Expecting an arithmetic operator. ERROR 200-322: The symbol is not recognized and will be ignored."

I have tried rounding and changing the le to <= but it still gives the same error... :(

data sta310.hw4;
   set sta310.gbcshort;
   age_cat=.;
   if age le 41.950498302 then age_cat = 1;
   if age > 41.950498302 and le 49.764538386 then age_cat=2;
   if age > 49.764538386 and le 56.696966378 then age_cat=3;
   if age > 56.696966378 then age_cat=4;
run;

4 Answers

0
votes

These things are better off done with PROC FORMAT. In your code, you are missing the variable name after "and", before the comparison operator. Also, you do not need age_cat = . at the beginning. Add your age variable between "and" and the comparison operator, as shown below:

data sta310.hw4;
   set sta310.gbcshort;
   age_cat=.;
   if age le 41.950498302 then age_cat = 1;
   if age > 41.950498302 and age le 49.764538386 then age_cat=2;
   if age > 49.764538386 and age le 56.696966378 then age_cat=3;
   if age > 56.696966378 then age_cat=4;
run;
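
For reference, the PROC FORMAT approach mentioned above could look roughly like this. It is only a sketch: the format name agecat is a made-up name for illustration, and the cut points are the quartiles from the question.

```sas
/* Sketch only: agecat is a hypothetical format name. */
proc format;
   value agecat
      low          -  41.950498302 = '1'   /* includes the upper bound        */
      41.950498302 <- 49.764538386 = '2'   /* "<-" excludes the lower bound   */
      49.764538386 <- 56.696966378 = '3'
      56.696966378 <- high         = '4';
run;

data sta310.hw4;
   set sta310.gbcshort;
   /* PUT applies the format and returns text; INPUT converts it back to a number. */
   /* LOW does not cover missing values, so a missing age is not placed in group 1. */
   age_cat = input(put(age, agecat.), 1.);
run;
```

One difference from the IF version: in an IF comparison a missing age counts as smaller than any number and would land in category 1, whereas the format's LOW range leaves it out.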
0
votes

The "and le" (or "and <=") syntax is incorrect: each comparison needs its own left-hand operand. Such a syntax might be something out of COBOL.

Try this form of a SAS Expression

  • value < variable <= value

Example

data sta310.hw4;
   set sta310.gbcshort;
   age_cat=.;
   if age <= 41.950498302 then age_cat = 1;
   if 41.950498302 < age <= 49.764538386 then age_cat=2;
   if 49.764538386 < age <= 56.696966378 then age_cat=3;
   if 56.696966378 < age then age_cat=4;
run;    

A similar and safer sieve of logic can be accomplished using a select statement.

  select;
    when (age <= 41.950498302) age_cat=1;
    when (age <= 49.764538386) age_cat=2;
    when (age <= 56.696966378) age_cat=3;
    otherwise age_cat=4; 
  end;

The SAS select differs from the C switch statement in that, once a when clause matches and its statement runs, control passes to the statement after the select's end. No break is required, as is often seen in switch/case.

0
votes

The problem was in your if statements with multiple conditions. Also, because age_cat is not really a numeric variable (i.e., you would not want to sum it), I would make it a character variable of length 1, declaring it in a format statement up front (best practice in SAS data management). Finally, I would also suggest reformulating the logic as an if/else construct to make it more efficient, since later conditions are skipped once one matches:

data sta310.hw4;
   set sta310.gbcshort;
   format age_cat $1.; 
   if age <= 41.950498302 then age_cat = "1";
   else if 41.950498302 < age <= 49.764538386 then age_cat= "2";
   else if 49.764538386 < age <= 56.696966378 then age_cat="3";
   else age_cat="4";
run;

Hope this helps,

0
votes

If you're grouping by quartiles, avoid the hard-coding and use PROC RANK with GROUPS=4. The groups will be numbered 0 to 3, but it's the same idea.

   proc rank data=sta310.gbcshort out=sta310.hw4 groups=4;
      var age;
      ranks age_cat;
   run;
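
If the 1 to 4 coding from the question is wanted, the 0 to 3 groups can be shifted in a follow-up step. This is a rough sketch, assuming an intermediate data set work.ranked and a temporary variable age_grp0, both of which are names invented here. The group boundaries PROC RANK picks may also differ slightly from the hard-coded quartiles when ages are tied.

```sas
/* Sketch only: work.ranked and age_grp0 are hypothetical names. */
proc rank data=sta310.gbcshort out=work.ranked groups=4;
   var age;
   ranks age_grp0;          /* quartile group, numbered 0 to 3 */
run;

data sta310.hw4;
   set work.ranked;
   /* Shift 0-3 to 1-4; a missing rank stays missing. */
   if not missing(age_grp0) then age_cat = age_grp0 + 1;
   drop age_grp0;
run;
```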

In your current program, this line/logic is your issue:

if age > 41.950498302 and le 49.764538386 then age_cat=2;

It should be:

 if 41.950498302 < age <= 49.764538386 then age_cat=2;

You should also switch those to IF/ELSE IF rather than separate IF statements. Once a condition matches, SAS skips the remaining ELSE branches instead of checking every IF, which makes the step slightly faster. You won't notice this in your homework, but it is really important to know if you ever work on larger data sets.

if age <= 41.950498302 then age_cat = 1;
else if 41.950498302 < age <= 49.764538386 then age_cat=2;
else if 49.764538386 < age <= 56.696966378 then age_cat=3;
else if 56.696966378 < age then age_cat=4;