0
votes

I am a SAS novice. I am trying to convert character variables to numeric. The code below works for one variable, but I need to convert more than 50 variables, hopefully simultaneously. Would an array solve this problem? If so, how would I write the syntax?

DATA conversion_subset;
SET have;
new_var = input(oldvar,4.); 
drop oldvar; 
rename newvar=oldvar;
RUN;

@Reeza

DATA conversion_subset;
SET have;

Array old_var(*) $ a_20040102--a_20040303 a_302000--a_302202;
* The first list contains 8 variables. The second list contains 7 variables;
Array new_var(15) var1-var15;

Do i=1 to dim(old_var);
 new_var(i) = input(old_var(i),4.); 
End;

*drop a_20040102--a_20040303 a_302000--a_302202; 
*rename var1-var15 = a_20040102--a_20040303 a_302000--a_302202;
 RUN;  

NOTE: Invalid argument to function INPUT at line 64 column 19 (new_var(i) = input(old_var(i),4.)

@Reeza

I am still stuck on this array. Your help would be greatly appreciated. My code:

DATA conversion_subset (DROP= _20040101 _20040201 _20040301);
SET replace_nulls;
Array _char(*) $ _200100--_601600;
Array _num(*) var1-var90;
Do i=1 to dim(_char);
 _num(i) = input(_char(i),4.); 
End;
RUN;

I am receiving the following error: ERROR: Array subscript out of range at line 64 column 6. Line 64 contains the input statement.

2
How was data originally loaded? From flatfiles (csv, xlsx, xml), database connections, with datalines? If so, please provide such code to adjust column types at the source.Parfait
@Reeza. Is there something I need to change in the INPUT statement?Maldini
Try explicitly listing the variables out and see if it works.Reeza

2 Answers

0
votes

Yes, an array solves this issue. You will want a simple way to list the variables so look into SAS variable lists as well. For example if your converting all character variables between first and last you could list them as first_var-character-last_var.

The rename/drop are illustrated in other questions across SO.

DATA conversion_subset;
 SET have;
 Array old_var(50) $ first-character-last;
 Array new_var(50) var1-var50;
 Do i=1 to 50;
     new_var(i) = input(oldvar(i),4.); 
  End;
 RUN;
0
votes

As @Parfait suggests, it would be best to adjust it when you are getting it, rather than after it is already in a SAS data set. However, if you're given the data set and have to convert that, that's what you have to do. You can add a WHERE clause to the PROC SQL to exclude variables that should not be converted. If you do so, they won't be in the final data set unless you add them in the CREATE TABLE's SELECT clause.

PROC CONTENTS DATA=have OUT=havelist NOPRINT ;
RUN ; %* get variable names ;

PROC SQL ;
  SELECT 'INPUT(' || name || ',4.) AS ' || name
    INTO :convert SEPARATED BY ','
    FROM havelist
  ; %* create the select statement ;

  CREATE TABLE conversion_subset AS
    SELECT &convert
    FROM have
  ;
QUIT ;

If excluding variables is an issue and/or you want to use a DATA step, then use the PROC CONTENTS above and follow with:

PROC SQL ;
  SELECT COMPRESS(name || '_n=INPUT(' || name || ',4.)'),
         COMPRESS(name || '_n=' || name),
         COMPRESS(name)
    INTO :convertlst SEPARATED BY ';',
         :renamelst SEPARATED BY ' ',
         :droplst SEPARATED BY ' '
    FROM havelist
  ;
QUIT ;

DATA conversion_subset (RENAME=(&renamelst)) ;
  SET have ;
  &convertlst ;
  DROP &droplst ;
RUN ;

Again, add a where clause to exclude variables that should not be converted. This will automatically preserve any variables that you exclude from conversion with a WHERE in the PROC SQL SELECT.

If you have too many variables, or their names are very long, or adding _n to the end causes a name collision, things can go badly (too much data for a macro variable, illegal field name, one field overwriting another, respectively).