0
votes

I'm trying to create a table using SAS 9.3 that shows information on current and past projects. For current projects, I want to show whether they've met various criteria ("yes", "no", OR "n/a"). In the same table, I want to show summary information of past projects (i.e. how many projects met the criteria, how many did not, and how many were n/a). Having one table to show current projects and one table to show past projects is easy. I'm struggling to show them together in a single table. Using proc tabulate, my code looks like this:

proc tabulate data = projects order=formatted missing;  
class project;
var dt criteria1 criteria2 criteria3; 
table  
    (dt=”Start Date)"*min=''*f=year_date.)  
    (criteria1="Criteria 1")*sum=''*f=ans.  
    (criteria2="Criteria 2")*sum=''*f=ans.  
    (criteria3="Criteria 3")*sum=''*f=ans.  
    ,(project='');  
format project $project_label.;  
run;  

The values for each criteria are 1 for yes, 0 for no, and . for n/a. The year format distinguishes current from past projects and the ans format shows "yes" for 1 and "no" for 0. This works for the the current projects. It also gives me the total number of past projects with "yes" answers. What I don't know how to do is the break-out for past projects showing no and n/a. (I'm also in trouble if there sum of past projects is 1 or 0 because the format would replace those with 'yes' or 'no.'

Any suggestions?

Thanks.

Brandon

Edit: I'll try to add some sample data that looks reasonable...

Criteria ActiveProject1 ActiveProject2 Past_Projects
Criteria1 yes no 5/10/5
Criteria2 yes yes 7/9/4
Criteria3 no yes 2/15/3

1
I would use a datastep instead of a proc step. That will allow you to manipulate criteria values more efectively and cleaner. On top of that, a data step is the way to go (best practice) and much cleaner.Salva
This may be a bit easier to visualize with some sample input/output data.Reeza
@Salva There's a lot of assumptions you're making by saying data step is best practice and much cleaner. Personally, I'm not a big fan of proc tabulate, but I'd guess that the above is probably as easy as coding it in a datastep. As for best practice, well that comes down to whatever standards your company has, what is more readable (by other people) and what is more maintainable (by other people).Robert Penridge
I'm not attached to Proc Tabulate so I'd welcome any other suggestions. My real challenge is showing yes/no/na for current projects and a count of yes/no/na for past projects. And doing it in the same table.Brandon Booth
@RobertPenridge For specifically this scenario, I said that datastep is the way to go, not only because the code is cleaner, but because he needs to evaluate each row in order to insert some categorical values into criteria1-criteria3 variables. Using a datastep for this need is ideal because the PDV will process one row (observation) at a time. With small datasets, there is no difference in performance; but with large to extremely large datasets, datastep is not only a best practice, it consumes less memory than proc steps. Cheers!Salva

1 Answers

0
votes

While I can't visualize what you're trying to do, one suggestion I would have is to use the ODS DOCUMENT and PROC DOCUMENT facility, or PROC REPORT.

You can in this way build your two separate tables that you like, then use PROC DOCUMENT to put them together so they show up in one place. This might suffice for what you're aiming to do.

If it doesn't, then PROC REPORT is probably more apt than PROC TABULATE when you are in some places summarizing and in other places not, if that's what you're trying to do. It allows limited data step functionality along with the summarization elements of the tabulation procs. I can't suggest a specific example because I don't understand what you're doing, but it may be the superior choice.