I am creating a bunch of frequency tables using proc tabulate, and I have to weight the percentages according to a set of weights based on the age of each person in my dataset. My problem is that the weights do not seem to have any impact on my results. I know I can do this with proc freq, but my tables are pretty detailed, which is why I am using proc tabulate.

I have included an example of a dataset, and what I have tried so far:

Data have; 
input gender wgt q1 year;
lines;
0  1.5  0  2014
0  1    1  2014
0  1.5  1  2014
0  1    1  2014
0  1.5  0  2014
1  1    1  2014
1  1    1  2014
1  1    1  2014
1  1    0  2014
1  1   1  2014
1  1    1  2014
;
run;

Proc format;
  value gender  0="boy"
                1= "girl";
  value q1f     0= "No"
                1="Yes";
run;

Proc tabulate data=have;
class gender q1 year;
weight wgt;
table gender*pctn<q1>, year*q1;
format gender gender. q1 q1f.;
run;

I know that, with the weights applied, approximately 46.2% of the boys should have answered "No" and approximately 53.8% "Yes", but the output from proc tabulate gives me 40% No and 60% Yes among the boys. What have I done wrong?

I don't know the SAS language, but without weights the output is correct, because 3/5 boys voted yes (60%). So maybe you need to do something like table gender*pctn<q1*wgt>, year*q1 - rhavelka
To confirm - you're trying to produce something like this but using proc tabulate? proc freq data = have; table gender * q1 /nocol nopercent nofreq; weight wgt; format gender gender. q1 q1f.; run; - user667489

1 Answer

The WEIGHT statement affects the statistics computed on VAR (analysis) variables, not the N count, and PCTN is a percentage of counts. A FREQ statement does affect the N count, because it treats each observation as if it occurred the number of times given by another variable. However, FREQ does not support fractional values: it truncates them to integers.
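
To make the difference concrete, here is the arithmetic for the boys in the question's data, worked by hand. The symbols $N$ and $w_i$ are just my own notation for the counts and the wgt values, not SAS names:

```latex
\begin{align*}
\text{PCTN (counts only):}\quad
  & \frac{N_{\text{No}}}{N_{\text{No}} + N_{\text{Yes}}}
    = \frac{2}{2 + 3} = 40\,\% \\
\text{Weighted share:}\quad
  & \frac{\sum_{i \in \text{No}} w_i}
         {\sum_{i \in \text{No}} w_i + \sum_{i \in \text{Yes}} w_i}
    = \frac{1.5 + 1.5}{(1.5 + 1.5) + (1 + 1.5 + 1)}
    = \frac{3}{6.5} \approx 46.2\,\%
\end{align*}
```

The first line is what PCTN reports regardless of the WEIGHT statement. The second line is the figure the question expects.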

From the documentation:

FREQ variable;

specifies a numeric variable whose value represents the frequency of the observation. If you use the FREQ statement, then the procedure assumes that each observation represents n observations, where n is the value of variable. If n is not an integer, then SAS truncates it. If n is less than 1 or is missing, then the procedure does not use that observation to calculate statistics.

The sum of the frequency variable represents the total number of observations.

WEIGHT variable;

specifies a numeric variable whose values weight the values of the analysis variables. The values of the variable do not have to be integers. PROC TABULATE responds to weight values in accordance with the following table.

Weight Value: PROC TABULATE Response

  • 0 : Counts the observation in the total number of observations
  • <0 : Converts the value to zero and counts the observation in the total number of observations
  • . : Excludes the observation

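If you want weighted percentages that behave like PCTN, create a "unity" variable that is always 1, declare it as a VAR variable so that the WEIGHT statement applies to it, and request PCTSUM instead of PCTN: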

Data have; 
input gender wgt q1 year;
unity = 1;
lines;
0  1.5  0  2014
0  1    1  2014
0  1.5  1  2014
0  1    1  2014
0  1.5  0  2014
1  1    1  2014
1  1    1  2014
1  1    1  2014
1  1    0  2014
1  1    1  2014
1  1    1  2014
;
run;

Proc tabulate data=have;
  title "Unity weighted";
  class gender q1 year;
  format gender gender. q1 q1f.;

  var unity;   /* analysis variable, always 1, so its weighted sum is the sum of wgt */
  weight wgt;

  table gender*unity, year*q1;  /* debug: weighted sums, the basis for PCTSUM<q1> */

  table gender*unity*(pctsum<q1>), year*q1;  /* weighted percentage within each gender */
run;
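
As a cross-check, the PROC FREQ step suggested in the comments can be run against the same data. This is only a sketch for verifying the numbers, and it assumes the gender. and q1f. formats from the question have already been defined in the session:

```sas
/* Cross-check only: weighted row percentages of q1 within gender. */
/* NOCOL, NOPERCENT and NOFREQ suppress everything except the row percent. */
proc freq data=have;
  tables gender*q1 / nocol nopercent nofreq;
  weight wgt;
  format gender gender. q1 q1f.;
run;
```

For the boys, the row percentages should come out at roughly 46.2% No and 53.8% Yes, the same figures as the weighted PCTSUM table.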