0
votes

I need a way to dynamically return the number of variables in the current data step.

Using SAS NOTE 24671: Dynamically determining the number of observations and variables in a SAS data set, I have come up with the following macro.

%macro GetVarCount(dataset);
  /* Open assigns ID to open data set.  Assigns 0 if DNE */
  %let exists = %sysfunc(open(&dataset));

  %if &exists %then
  %do;
    %let returnValue  = %sysfunc(attrn(&exists, nvars));

    %let closed       = %sysfunc(close(&exists));
  %end;
  /* Output error if no dataset */
  %else %put %sysfunc(sysmsg());

  &returnValue
%mend;

Unfortunately, this fails the first time the data step runs, because the data set has not been created yet. Once the first run has created a data set with 0 observations, the macro can open the table and read the number of variables.

For instance,

data example;
  input x y;

  put "NOTE: [DEV] There are %GetVarCount(example) variables in the EXAMPLE data set.";

  datalines;
  1 
  2
  ;
run;

The first run produces:

ERROR: File WORK.EXAMPLE.DATA does not exist.
WARNING: Apparent symbolic reference RETURNVALUE not resolved.

NOTE: [DEV] There are &returnValue variables in the EXAMPLE data set.

The second run produces:

NOTE: [DEV] There are 2 variables in the EXAMPLE data set.

Is there a way to get the number of variables in a data set the first time the data step is run?

2
Doesn't the log do that already? If the step runs successfully, the log reports that data set x was created with X variables and Y observations, so I don't see what this would add. Also, at what stage do you want the count? A data step can create variables, so are you interested in the input data set, the final output data set, or something else? And do you want all variables processed, or only the ones output to the final data set? - Reeza
If you do want to do this, one way may be to use CALL VNEXT, but note that automatic variables are listed too, so you'll need to filter those out. support.sas.com/documentation/cdl/en/lefunctionsref/69762/HTML/… - Reeza
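
To make the CALL VNEXT suggestion concrete, here is a minimal sketch rather than a tested solution. The helper names _vname and _count are hypothetical, and the filter list assumes the only extra entries are _N_, _ERROR_ and the helpers themselves; steps that use BY groups or END= variables may need more names filtered out.

```sas
data fred;
  length x y z $ 20 f g 8;

  /* Helper variables (hypothetical names); VNEXT reports them as well. */
  length _vname $ 32;
  _vname = ' ';
  _count = 0;

  /* VNEXT returns one variable name per call and a blank name after the last one. */
  do until (_vname = ' ');
    call vnext(_vname);
    /* Skip the end-of-list blank, the automatic variables, and the helpers. */
    if upcase(_vname) not in (' ', '_N_', '_ERROR_', '_VNAME', '_COUNT')
      then _count = _count + 1;
  end;

  put "Vars in data step: " _count;
  drop _vname _count;
run;
```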

2 Answers

2
votes

In your example, you're trying to determine the number of active variables in a data step - this isn't necessarily the same as the number of variables that will be in the output data set, because (a) there might not be an output data set and (b) some of the variables might get dropped.

With that caveat in mind, if you really want to do that, then this works:

data fred;
  length x y z $ 20 f g 8;
  array vars_char _character_;
  array vars_num _numeric_;
  total_vars = dim(vars_char) + dim(vars_num);
  put "Vars in data step: " total_vars;
run;

This works by using the special _character_ and _numeric_ keywords to create arrays of all character and numeric variables currently defined in the data step, and the dim() function to get the sizes of those arrays.

It will only count variables that exist when the arrays are declared, so it doesn't count total_vars in this case.
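
As a small illustration of that ordering rule, the sketch below adds a variable h (a made-up name, not part of the original example) after the array statements. Because h is first referenced after the arrays are declared, it should be left out of the count in the same way total_vars is.

```sas
data fred;
  length x y z $ 20 f g 8;
  array vars_char _character_;
  array vars_num _numeric_;

  /* h first appears after the array statements, so the arrays do not include it. */
  h = 1;

  total_vars = dim(vars_char) + dim(vars_num);
  put "Vars in data step: " total_vars;
run;
```

Moving the two array statements below the assignment to h would bring h into the count.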

You could wrap this in a macro like:

%macro var_count(var_count_name);
  array vars_char _character_;
  array vars_num _numeric_;
  &var_count_name = dim(vars_char) + dim(vars_num);
%mend;

and then use it like:

data fred;
  length x y z $ 20 f g 8;
  %var_count(total_vars);
  put "Vars in data step: " total_vars;
run;

0
votes

Try to open a dataset that has already been created.

The OPEN function requires the data set it opens to already exist. I think you expect OPEN to return an ID for the data set that the data step is currently creating; that is not the case.

It works on every run after the first (not just the second) because the first run created an empty data set whose metadata describes the variables it contains.

Use a library to permanently store your dataset first and then try your macro to read from it:

    Data <lib>.dataset;
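
A minimal sketch of that ordering, under a few assumptions: it uses the WORK library for brevity (a permanent libref would be used the same way), and it restates the question's macro with %local variables and a default return value of ".", which are additions made here so the block stands alone. The point is only that the macro is called after the step's run statement, when the data set already exists.

```sas
%macro GetVarCount(dataset);
  %local dsid returnValue rc;
  %let returnValue = .;                      /* default if the data set cannot be opened */
  %let dsid = %sysfunc(open(&dataset));
  %if &dsid %then %do;
    %let returnValue = %sysfunc(attrn(&dsid, nvars));
    %let rc = %sysfunc(close(&dsid));
  %end;
  %else %put %sysfunc(sysmsg());
  &returnValue
%mend GetVarCount;

data example;
  input x y;
  datalines;
1 2
3 4
;
run;

/* The data set exists by now, so OPEN succeeds on the first run. */
%put NOTE: [DEV] There are %GetVarCount(example) variables in the EXAMPLE data set.;
```

Inside the data step, the macro call in the quoted PUT string is resolved while the step is being compiled, which is before the data set has been written.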

update:

@Reeza already gave you the answer in the comments.

Another alternative: put _all_; prints every variable to the log in name=value form. If you write that output to a file instead, read it back, and count the '=' signs, you get the variable count too. Just subtract _N_ and _ERROR_ from the count.
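
A rough sketch of that put _all_ idea, with some assumptions spelled out: the fileref pdv and the variables n_equals and n_vars are hypothetical names, the output is written only once, and no variable value contains an '=' character (any that did would inflate the count).

```sas
/* Temporary file that receives the PUT _ALL_ output (hypothetical fileref). */
filename pdv temp;

data fred;
  length x y z $ 20 f g 8;
  file pdv;
  put _all_;          /* writes name=value pairs, including _ERROR_ and _N_ */
run;

data _null_;
  infile pdv end=last;
  input;
  /* Accumulate the number of '=' signs across all lines of the file. */
  n_equals + countc(_infile_, '=');
  if last then do;
    n_vars = n_equals - 2;   /* subtract _ERROR_ and _N_ */
    put "Vars in data step: " n_vars;
  end;
run;
```

In a step that iterates more than once, the PUT would need to be limited to a single iteration (for example with if _n_ = 1) so the pairs are not written repeatedly.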