0
votes

Total beginner at SAS. I wanted to drop a list of variables from my inputds. This list itself is present as observations in another dataset. After doing some googling, I found this excellent paper on the topic.

http://www2.sas.com/proceedings/sugi30/028-30.pdf

So i used the following code to make a list in a macro variable:

/*make a list of variables as a macro variable */

data _null_;

length allvars $1000;

retain allvars ' ';

set to_drop end=eof;

allvars = trim(left(allvars))||' '||left(_name_);

if eof then call symput('varlist', allvars);

run;

I am facing three problems now:

1) When I %PUT &VARLIST , the log displays only 31 of the variables whereas my list is actually 2000+ variables.

2) I don't clearly understand what the statement: trim(left(allvars)) || ' ' || left(_name_); is doing. I know trim removes leading spaces and left is to align left character strings but cannot understand the full statement.

3) Then I try to drop it from my inputds using the following code, I get a warning message and the drop doesn't happen:

    data inputds2 (drop = &varlist);
    set inputds;
    run;



WARNING: The variable avg_weighted in the DROP, KEEP, or RENAME list has never been referenced.

NOTE: There were 43662 observations read from the data set WORK.INPUTDS.

NOTE: The data set WORK.INPUTDS2 has 43662 observations and 3465 variables.

In reality my variable name reads like : avg_weighted_minutes_view_3739 avg_weighted_minutes_view_7963 avg_weighted_minutes_view_(XXXX) The last 4 digits are random. The are SAS generated names since my labels contain spaces.

EDIT: Tried using another code which is working partially - it makes a bigger list- around 1000 of the 2000+ variables in the &VARLIST macro variable.

data _null_;

set to_drop;

call symput('varlist',trim(

resolve('&varlist')

)||' '||trim(_name_));

run;

%put &varlist;
1

1 Answers

3
votes

I'll answer in the order provided.

1.) If you have 2,000+ variables that you want to drop, 1,000 characters is not enough to hold the list of variable names, so the list is getting truncated. You'll need to allocate more space to the allvars variable in your null data step, like so:

data _null_;
    length allvars $ 10000; /* You may need even more! */
    retain allvars;
    set to_drop end=eof;
    allvars = trim(left(allvars)) || ' ' || left(_name_);
    if eof then call symput('varlist', allvars);
run;

You may be getting the log message you noted because one of the variable names was cut off when the list reached 1,000 characters in length. Allocating more space will help that.

2.) The trim() function removes trailing blanks from a character string. left() left-aligns the character value within the string, so any leading blanks are pushed to the back. || is the string concatenation operator in SAS. So the full statement is doing this:

  • Removing whitespace from the list so that " x y " becomes "x y".
  • Adding a blank to the end to delimit variable names, so "x y" becomes "x y ".
  • Adding the next variable name to the string, e.g. "x y " becomes "x y z". The last variable name is left-aligned so that you only have one space between variable names.

3.) You're getting that log message because either there's a variable in the dataset to_drop that does not exist in the input dataset (which doesn't make sense if to_drop was created as output from PROC CONTENTS or similar), or the list was truncated as I mentioned. To avoid the issue if to_drop is not output from PROC CONTENTS, you can filter out the variables that aren't in the input dataset like so:

proc sql;
    create table to_drop2 as
    select distinct a._name_
    from to_drop as a
    inner join dictionary.columns as b
    on a._name_ = b.name
    where b.memname = 'INPUTDS';
quit;

But if the issue is that the name is getting cut off, allocating more length will fix that.


Perhaps a better way to go about this is to do it all in one step using PROC SQL. This will make sure that the macro variable varlist is as long as it needs to be, up to the maximum allowable length.

proc sql noprint;
    select distinct a._name_
    into :varlist
    separated by ' '
    from to_drop as a
    inner join dictionary.columns as b
    on a._name_ = b.name
    where b.memname = 'INPUTDS';
quit;

Note that joining on dictionary.columns is unnecessary if to_drop only contains variables in inputds.