2
votes

SAS has several forms it uses to create output data sets from within a procedure. It is not always clear whether or not a particular procedure can generate a data set and, if it seems to be able to, it's not always clear how.

Off the top of my head, here are some examples of how widely the syntax can differ.

Example 1

proc sort data = sashelp.baseball out = baseball_sorted;
  by
    league
    division
  ;
run;

Example 2

proc means noprint data = baseball_sorted;
  by
    league
    division
  ;
  var nHits;
  output 
    out = baseball_avg_hits (drop = _TYPE_ _FREQ_)
    mean = mean_hits
  ;   
run;

Example 3

ods exclude all;
ods output 
  statistics  = baseball_statistics
  equality    = baseball_ftest
;
proc ttest data = baseball_sorted;
  class league;
  var nHits;
run;
ods exclude none;

Example 4

The PROC ANOVA OUTSTAT= option.
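
A minimal sketch of what that could look like, assuming the same sashelp.baseball data as the earlier examples. The data set name baseball_anova is made up for illustration.

```sas
/* OUTSTAT= on the PROC statement names the data set that receives
   the ANOVA results. baseball_anova is an illustrative name. */
proc anova data = sashelp.baseball outstat = baseball_anova;
  class league;
  model nHits = league;
run;
quit;
```

Here the output request sits on the PROC statement itself, rather than in a separate OUTPUT statement or an ODS statement.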

It seems almost as if SAS has implemented each of these willy-nilly. Is the SAS syntax for creating a data set guided by some consistent approach that I am not seeing, or is it truly capricious and arbitrary?

3
Could be off-topic for s.o. as opinion, but interesting. If it ends up closed here, I would suggest moving to communities.sas.com. - Quentin

3 Answers

3
votes

For PROC code, the syntax for outputting data is often specific to that procedure, which often feels willy-nilly (your examples 1, 2 and 4). I think PROC developers are given a lot of freedom, and remember that many of these PROCs are 30+ years old.

The great thing about the Output Delivery System (ODS, your example 3) is it provides a single syntax for outputting data, regardless of the procedure. So you can use the ODS OUTPUT statement with (almost?) any PROC. The names and structures of the output objects will of course vary between PROCs. So if you are looking for a consistent approach, I would focus on using ODS OUTPUT. ODS was added in V7 (I think).
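
Since the object names vary between PROCs, one way to discover them is ODS TRACE, which writes the name of each output object to the log. The sketch below assumes the sashelp.baseball data from the question. The names it reports are the ones you would then use on the left-hand side of an ODS OUTPUT statement.

```sas
/* Turn tracing on, run the procedure once, then turn tracing off.
   The log lists each output object (Name, Label, Template, Path). */
ods trace on;
proc ttest data = sashelp.baseball;
  class league;
  var nHits;
run;
ods trace off;
```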

It would be interesting to try to find an example of an output dataset which could be made by a PROC but could not be made by ODS OUTPUT. I hope there aren't any. If there aren't, you could consider the range of OUTPUT statements/options within PROCs as legacy code.

1
votes

Agree with Quentin. You have to remember that there are SAS systems out there running code written in the 80s. SAS would have a huge headache if they forced every team to rewrite all the procedures and then forced their customers to change all their code. SAS has been around since the 60s and the organic growth of the syntax is to be expected.

FWIW, having an OUT= option makes sense on things with no graphical output, e.g. PROC SORT or PROC TRANSPOSE.

1
votes

The way I see it there are four main ways to specify the output data sets.

  1. In the PROC statement you may be able to specify some type of output option, such as OUT= or OUTEST=.
  2. In the main statement of the procedure, e.g. MODEL or TABLES, there can be options that allow for output. For example, PROC FREQ has an OUT= option on the TABLES statement.
  3. An explicit OUTPUT statement within a procedure, e.g. PROC MEANS. These are typically from older procedures.
  4. ODS tables, which are a relatively newer method and more frequently used these days, since the format aligns with what you'd expect to see.
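
As a rough illustration of the second form, here is a sketch using PROC FREQ with the sashelp.baseball data from the question. The data set name baseball_counts is made up for illustration.

```sas
/* OUT= is an option on the TABLES statement, after the slash.
   NOPRINT suppresses the displayed table so only the data set is created. */
proc freq data = sashelp.baseball noprint;
  tables league * division / out = baseball_counts;
run;
```

The resulting data set has one row per league and division combination, with COUNT and PERCENT columns.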

Yes, there are multiple places to check, but fortunately the SAS documentation for procedures is relatively clear with the options and how to use/specify the outputs.

If I've missed anything that seems different, post in the comments and I can update this.

PS. Although SAS is definitely bad, trying to navigate different packages/modules in Python to export an XLSX file isn't straightforward either. Some packages support some options, others don't. I've given up on asking why these days and just accept it as peculiarities of the different languages at this point.