0
votes

I am trying to add a Data step that creates the work.orders_fin_qtr_tot data set from the work.orders_fin_tot data set. This new data set should contain new variables for quarterly sales and profit. Use two arrays to create the new variables: QtrSales1-QtrSales4 and QtrProfit1-QtrProfit4. These represent total sales and total profit for the quarter (1-4). Use the quarter number of the year in which the order was placed to index into the correct variable to add either the TotalSales or TotalProfit to the new appropriate variable.

Add a Proc step that displays the first 10 observations of the work.orders_fin_qtr_tot data set.

My issue is that I can't seem to get the two diff arrays to meld with out spaces

proc sort data=work.orders_fin_tot_qtr;
    by workqtr;
run;
data work.orders_fin_tot_qtr;
   set work.orders_fin_tot_qtr;

    array QtrSales{4} quarter1-quarter4 ;
    do i = 1 by 1 until (last.order_id);
    if workqtr=i then QtrSales{i}=totalsales;
    end;
    drop totalsales totalprofit _TYPE_  _FREQ_;
run;
proc print data=work.orders_fin_tot_qtr;
run;
1
Please update your question to include some sample data that we can use to reproduce the problem you're having, and the corresponding output that you're attempting to produce. Please post the sample data as text within your question, not as a screenshot. - user667489
Is this a homework assignment ? How much SAS experience do you have ? - Richard
will do just have to get back to my computer it is a project for a sas cert class a small portion ive been beating my head with thank you all for you help !! - Thomas Joines
I think this is what you want and are trying to accomplish. You need to add a RETAIN, or just use a PROC TRANSPOSE in a real life situation. stats.idre.ucla.edu/sas/modules/… - Reeza

1 Answers

0
votes

The syntax last.order_id is only appropriate if there is a BY statement in the DATA Step -- if not present, the last. reference is always missing and the loop will never end; so you have coded an infinite loop!

The step has drop totalsales totalprofit _TYPE_ _FREQ_. Those underscored variables indicate the incoming data set was probably created with a Proc SUMMARY.

Your orders_fin_tot data set should have columns order_id quarter (valid values 1,2,3,4), and totalsales. If the data is multi-year, it should have another column named year.

The missing BY and present last.id indicate you are reshaping the data from acategorical vector going down a column to one that goes across a row -- this is known as a pivot or transpose. The do construct you show in the question is incorrect but similar to that of a technique known in SAS circles as a DOW loop -- the specialness of the technique is that the SET and BY are coded inside the loop.

Try adjusting your code to the following pattern

data want;
  do _n_ = 1 by 1 until (last.order_id);
    SET work.orders_fin_tot;   * <--- presumed to have data 'down' a column for each quarter of an order_id;
    BY order_id;   * <--- ensures data is sorted and makes automatic flag variable LAST.ORDER_ID available for the until test;
    array QtrSales quarter1-quarter4 ;  * <--- define array for step and creates four variables in the program data vector (PDV);

    * this is where the pivot magic happens;
    * the (presumed) quarter value (1,2,3,4) from data going down the input column becomes an
    * index into an array connected to variables going across the PDV (the output row); 
    QtrSales{quarter} = totalsales;
  end;
run;

Notice there is no OUTPUT statement inside or outside the loop. When the loop completes it's iteration the code flow reaches the bottom of the data step and does an implicit OUTPUT (because there is no explicit OUTPUT elsewhere in the step).

Also, for any data set specified in code, you can use data set option OBS= to select which observation numbers are used.

proc print data=MyData(obs=10);

OBS is a tricky option name because it really means last observation number to use. FIRSTOBS is another data set option for specifying the row numbers to use, and when not present defaults to 1. So the above is equivalent to

proc print data=MyData(firstobs=1 obs=10);

OBS= should be thought of conceptually as LASTOBS=; there is no actual option name LASTOBS. The following would log an ERROR: because OBS < FIRSTOBS

proc print data=MyData(firstobs=10 obs=50);