0
votes

I'm new to SAS EG. I usually use Base SAS when I actually need to write a program, but my company is moving heavily toward EG. I'm helping some areas with code that pulls the data they need on an ad hoc basis (the code itself won't change, though).

However, during processing we create many temporary datasets that are just iterations across months. For example, if the user wants data from 2002 to 2016, we have to pull from all of those libraries and then concatenate the results. This is because of the high transactional volume; the final dataset is limited to a small number of observations. Whenever I run this program, though, EG lists all 183 of the datasets created by the data steps in the macro, which is very ugly. Sometimes the "Output Data" that appears isn't even the output of the last data step but of an intermediate one, which makes it annoying to search for the final output dataset.

Is there a way to limit the datasets written to "Output Data" so that it shows only the final dataset and our end users aren't confused?

[Screenshot: the list of output data sets shown in EG]

The screenshot above is an example: there are a ton of output datasets that I don't care to see. I just want the final one, which is located somewhere in that list.

The version is SAS EG 7.1.

2 Answers

2
votes

Once the program ends, EG automatically shows every dataset that the program created. If you don't want the intermediate tables to appear, delete them in the very last step of your process.

In your case, it looks as if your temporary tables all share the prefix TRN. You can clean them up like this:

/* Start of process flow */

<program statements>;

/* End of process flow*/

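/* Delete every WORK table whose name starts with TRN.
   NOLIST suppresses the directory listing; NOWARN avoids an error if no table matches. */
proc datasets lib=work nolist nowarn nodetails;
    delete TRN:;
quit;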

Be careful if you do this. Make sure that all of your temporary tables, and only your temporary tables, use the same prefix; otherwise you may accidentally delete tables that you need.
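As a rough illustration of that naming caution, here is a minimal sketch in which the final table is given a name that does not start with TRN, so the prefix delete cannot touch it. The table names (`trn_200201`, `trn_200202`, `final_result`) and the dummy data are hypothetical stand-ins, not the asker's actual tables.

```sas
/* Hypothetical stand-ins for two monthly intermediate tables */
data work.trn_200201 work.trn_200202;
    do id = 1 to 3;
        output work.trn_200201;
        output work.trn_200202;
    end;
run;

/* Concatenate every TRN_ table into the one table the user needs.
   FINAL_RESULT is an assumed name; it deliberately does not start with TRN. */
data work.final_result;
    set work.trn_:;
run;

/* Drop the intermediates; FINAL_RESULT does not match the prefix */
proc datasets lib=work nolist nowarn;
    delete trn_:;
quit;
```

If the final table had to keep a TRN-style name, the prefix delete would remove it too, so copying or renaming it before the cleanup step would be needed.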

Another solution is to limit the number of datasets generated, and have a user-created link to the final dataset. There's an article about it here.
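One possible way to limit the number of datasets, sketched here under assumptions, is to reuse a single scratch table inside the loop and append each month to the final table with PROC APPEND, instead of keeping one table per month. The macro name `pull_months`, the table names `_month` and `final_result`, and the dummy data step standing in for the real monthly pull are all hypothetical; this is not taken from the linked article.

```sas
%macro pull_months(start_year=, end_year=);
    %local y m;

    /* Start clean so a rerun does not append to an old result */
    proc datasets lib=work nolist nowarn;
        delete final_result _month;
    quit;

    %do y = &start_year %to &end_year;
        %do m = 1 %to 12;
            /* Stand-in for the real monthly pull; the same scratch
               table name is reused on every iteration */
            data work._month;
                year = &y;
                month = &m;
            run;

            /* PROC APPEND creates FINAL_RESULT on the first pass
               and adds rows to it on later passes */
            proc append base=work.final_result data=work._month;
            run;
        %end;
    %end;

    /* Remove the scratch table so only FINAL_RESULT remains */
    proc datasets lib=work nolist nowarn;
        delete _month;
    quit;
%mend pull_months;

%pull_months(start_year=2002, end_year=2003)
```

With this layout only two table names ever exist in WORK, and the scratch table is deleted at the end, so there should be far fewer candidates for EG to list.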

0
votes

An alternative is to add the final output dataset explicitly as an entry in your process flow and ignore the "Output Data" list unless you need to investigate one of the intermediate datasets.

The advantage is that you can still look at the intermediate datasets if something goes wrong, but you don't have to search through all of them to find the final dataset.

You should be able to add the final output dataset to the process flow easily once it has been created for the first time. After that, it will stay in the flow for you to select and view.