0
votes

My dataset has the following variables:

Actual
Expected

First of all, I want to create a new variable 'Ratio' = sum(Actual)/sum(Expected). For the first observation I want this expression to include all of the values of Actual and Expected. For the second iteration, I want to repeat this expression using only the second observation and onwards. Then for the third iteration, I want to include the third observation and onwards.

This is an example of the data and the formula as it would look in Excel.

actual  expected    ratio  
15      33          =SUM(A2:$A$6)/SUM(B2:$B$6)  
10      50          =SUM(A3:$A$6)/SUM(B3:$B$6)  
20      64          =SUM(A4:$A$6)/SUM(B4:$B$6)  
60      77          =SUM(A5:$A$6)/SUM(B5:$B$6)  
45      81          =SUM(A6:$A$6)/SUM(B6:$B$6)  

Secondly, the dataset could have any number of observations so I would like the expression to continue until the 'Ratio' is below a value that I manually specify. Ideally the program will also incorporate the creation of a fourth variable with the number of observations excluded from the expression. So for the first observation it would be '0', the second observation '1', the third observation '2' and so on.

I think this will need an array too? Thanks in advance.

1

1 Answers

0
votes

You're asking for code to be written for you, which is off topic (rather than asking for help with code you've written); however, you do in part have an interesting question as far as technique; I'll explain that technique, and if you need help with the technique adjust your question or ask a new one.

What you'll want to do here is reverse sort your data, if possible. If it doesn't have anything to sort by, you can either add an observation counter:

data temp;
  set have;
  obs=_n_;
run;

Or you can use POINT to reverse-order your dataset, though this is somewhat slower. This is the basic form of the latter technique - if you can reverse sort, then you don't need the do loop or the set options.

data want;
  do obsnum = nobs to 1 by -1;
    set have nobs=nobs point=obsnum;
    act_sum+actual;   *accumulator for actual;
    exp_sum+expected; *accumulator for expected;
    ratio=act_sum/exp_sum;
  end;
run;

From here you should be able to work out the rest of your question. You will of course have to re-sort by the proper sort method afterwards.