1
votes

I have the following data:

Year   Country    Score 
----   -------    -----
2007     AU         76        
2007     SG         78        
2008     AU         56        
2008     SG         90        
2009     AU         82       
2009     SG         48        

Suppose I want to show the Score for each country in each year (grouped by year) using gplot, as in this example: [output image]

I have tried:

plot Score*(country year);

and

plot country*year=score;

But neither of them works. I am not familiar with gplot, so how can I achieve this?

2 Answers

0
votes

/* First grab the 2007 year data you want to plot */

PROC SQL;
 create table data2007 as
 select * 
 from data_original
 where year=2007;
QUIT; 

/* Then plot data using the symbol statement*/

symbol interpol=boxt;
proc gplot data=data2007;
 plot score*country;
run;
quit; 

/* You can also research PROC UNIVARIATE and PROC BOXPLOT to achieve similar results */

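If you want to do this by year, I believe the following will work. BY-group processing needs the full data set sorted by year, not the 2007 subset:

proc sort data=data_original out=data_sorted;
 by year;
run;

symbol interpol=boxt;
proc gplot data=data_sorted;
 plot score*country;
 by year;
run;
quit; 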

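If you wish to have every year and every country in one plot, you could build a combined key. Assuming year is numeric (as the where year=2007 filter above implies), it has to be converted with PUT before concatenating, and score has to be kept for the plot:

PROC SQL;
 create table new_data as
 select year
   , country
   , score
   , catx("_", country, put(year, 4.)) as country_year
 from data_original;
QUIT; 

symbol interpol=boxt;
proc gplot data=new_data;
 plot score*country_year;
run;
quit; 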

Be aware of the number of levels to be graphed.
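
As a rough check before plotting, something like the following counts how many year/country combinations (and so how many boxes) there would be. It is only a sketch: it assumes the same data_original table used above, and n_levels is just an illustrative column name.

```sas
/* Count the distinct year/country pairs, i.e. the number of boxes to draw */
proc sql;
 select count(*) as n_levels
 from (select distinct year, country from data_original);
quit;
```

If the count is large, plotting one year at a time with a BY statement is probably easier to read than a single crowded axis.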

0
votes

SGPLOT is going to be the easiest way to get this; it's much more powerful than GPLOT in many areas, and nice box plots are one of them.

This gets pretty close to what you want. You may need to do a few things to get the legend exactly how you want it, but it does make separate box plots grouped the way you ask. I threw in some extra data to make the box plots look realistic.

data have;
input Year   Country $    Score ;
datalines;
2007     AU         76   
2007     AU         74
2007     AU         78 
2007     SG         78
2007     SG         80
2007 SG 76 
2008     AU         56        
2008     SG         90        
2009     AU         82       
2009     SG         48  
2008 AU 54
2008 AU 58
2008 SG 88
2008 SG 92
2009 AU 78
2009 AU 86
2009 SG 44
2009 SG 52
;;;;
run;

title;
proc sgplot data=have;
 vbox score/category=country group=year groupdisplay=cluster;   *or reverse category and group depending on your preference;
run;
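
For the legend and the reversed layout mentioned above, a variation along these lines might be a starting point. It is a sketch rather than the definitive setup: it reuses the have data set, and the legend title, legend position and axis labels are purely illustrative choices.

```sas
/* Same data, with year on the axis and country as the group */
proc sgplot data=have;
 vbox score / category=year group=country groupdisplay=cluster;
 keylegend / title="Country" position=bottom;   /* illustrative legend settings */
 xaxis label="Year";
 yaxis label="Score";
run;
```

Putting year on the category axis gives one cluster of boxes per year, with one box per country inside each cluster.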

GPLOT is a bit trickier. The way you get groups in GPLOT is the equal sign, so:

symbol interpol=boxt;
proc gplot data=have;
 plot score*country=year;
run;
quit;

But that doesn't look nearly as nice nor does it stack adjacently. I also don't like how hard it is to make them sit in the right spot on the plot.
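
If you do stay with GPLOT, an AXIS statement with OFFSET= can at least keep the outer boxes away from the edges of the plot. This is only a sketch: the offset and box width values are guesses to be tuned, and it does not place the year groups side by side.

```sas
/* bwidth sets the box width; 3 is an arbitrary starting value */
symbol interpol=boxt bwidth=3;
/* leave 10 percent of the axis length free at each end */
axis1 offset=(10 pct, 10 pct);
proc gplot data=have;
 plot score*country=year / haxis=axis1;
run;
quit;
```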