I am new to SAS, so any feedback is greatly appreciated. I am trying to create a bar chart in SAS that shows the percentage of patients who received a test, by category (a risk stratification), and then, within each category, shows where the test was received (location). My dataset looks like this:
Category            Test  Test_location
High Risk           1     Site 1
Intermediate Risk   1     Site 2
Low Risk            0     .
Intermediate Risk   0     .
High Risk           1     Site 3
Each patient is listed with the category they have been assigned to (variable 'Category'), an indicator variable that shows whether or not they received a test (variable 'Test', where 1 = received test and 0 = did not receive test) and, if they received a test, where that test took place (variable 'Test_location').
I want to create a bar graph with the categories on the x axis and the y axis showing the percentage of patients who got a test (test=1), with each bar subdivided to show how many tests occurred at Site 1, 2 and 3.
I have this code but it gives me the counts of patients who received test rather than percentages:
proc sgplot data=test;
  /* response=test sums the 0/1 indicator, so the bars show counts, not percentages */
  vbar category / response=test
                  group=test_location groupdisplay=stack;
  yaxis grid values=(0 to 100 by 10) label="Percentage of patients who received testing (%)";
  label Category="Risk Stratification";
  keylegend / title="Testing Location" position=bottom;
run;
I don't think proc sgplot has a percent stat, so I tried doing a proc freq but I can't figure out how to do that accurately for all of the variables I have.
Thanks for your help!
EDIT:
I added in percent stat like the poster below suggested, but it is not giving me the percentages I want (it gives me a pct_col output of test*category, and I want pct_row). The below code gives me the percentages I want, but I also want to add test_location to show on each bar what percentages of patients were in each location.
proc tabulate data=test_util out=freq1;
  class category test;
  tables category, test*rowpctn;
run;

proc sgplot data=freq1;
  where test=1;
  vbar category / response=pctn_10;
run;
Example of what I want: in the dummy dataset below, for high risk patients, I want a bar that shows that 75% received tests (12 patients with tests out of the 16 high risk patients in total), and then have the bar shaded to show that 41.67% of those tests were at Site 1, 33.33% at Site 2 and 25% at Site 3. And so on for the intermediate and low risk categories. If there is a way to label the subsections with the exact percentages, that would be great too.
Dummy data set:
data test;
infile datalines missover;
input ID Category :$12. Test Test_location :$6.; /* :$12. keeps "Intermediate" from being truncated to 8 characters */
datalines;
1 High 1 Site_1
2 High 1 Site_1
3 High 1 Site_1
4 High 1 Site_1
5 High 1 Site_1
6 High 1 Site_2
7 High 1 Site_2
8 High 1 Site_2
9 High 1 Site_2
10 High 1 Site_3
11 High 1 Site_3
12 High 1 Site_3
13 High 0
14 High 0
15 High 0
16 High 0
17 Intermediate 1 Site_1
18 Intermediate 1 Site_1
19 Intermediate 1 Site_2
20 Intermediate 0
21 Intermediate 0
22 Intermediate 0
23 Intermediate 0
24 Intermediate 0
25 Intermediate 0
26 Low 1 Site_1
27 Low 1 Site_1
28 Low 1 Site_1
29 Low 1 Site_2
30 Low 1 Site_2
31 Low 1 Site_2
32 Low 1 Site_3
33 Low 0
34 Low 0
35 Low 0
36 Low 0
37 Low 0
38 Low 0
;
run;
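A minimal sketch of one way to get the chart described above, assuming the dummy dataset `test` as defined here. The table name `pct_by_site` and the columns `pct` and `n_total` are hypothetical names introduced for this example. The idea is to divide the number of tests at each site by the total number of patients in the category, so that the stacked segments add up to the percentage tested (for High: 31.25 + 25 + 18.75 = 75). The `seglabel` option is assumed to be available, which requires a reasonably recent SAS 9.4 maintenance release.

```sas
/* Step 1: per category and site, tests at that site as a percentage   */
/* of ALL patients in the category (tested or not).                    */
/* pct_by_site, pct and n_total are illustrative names, not from the   */
/* original post.                                                      */
proc sql;
  create table pct_by_site as
  select a.Category,
         a.Test_location,
         100 * count(*) / b.n_total as pct format=5.1
  from test as a
       inner join
       (select Category, count(*) as n_total
        from test
        group by Category) as b
       on a.Category = b.Category
  where a.Test = 1
  group by a.Category, a.Test_location, b.n_total;
quit;

/* Step 2: stack the site percentages; bar height = percent tested.    */
proc sgplot data=pct_by_site;
  vbar Category / response=pct
                  group=Test_location groupdisplay=stack
                  seglabel seglabelformat=5.1; /* label each segment */
  yaxis grid values=(0 to 100 by 10)
        label="Percentage of patients who received testing (%)";
  label Category="Risk Stratification";
  keylegend / title="Testing Location" position=bottom;
run;
```

Note that the segment labels in this sketch show each site's share of all patients in the category (31.25% for High / Site_1), not its share of the tests (41.67%). Labelling with the share of tests would need an extra column and a different labelling approach.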
proc sql there is unnecessary - you can add the where in a lot of different places: any of the out= or data= options, or in the proc itself. – Joe