0
votes

I want to find the best revenue for every year and revtype. As seen on the image, if month apr is having the best sales, then apr will come under column best_sale. If there are two months having equal highest values, then I want both of them in the best_sale column.

enter image description here

data year;
  set similar;
  array k(*) jan--dec;

  do i=1 to dim(k);
    do j=i to dim(k);
      if k(i)=max(of k(*)) then best_sale=vname(k(i));
      if best_sale=vname(k(j)) then output year;
    end;
  end;
  drop i j;
run;


proc sort data=year nodup out=y;
  by year;
run;

data best;
  length cat $100.;
    do until (last.year);
      set y;
      by year;
      cat=catx(',',cat,best_sale);
    end;
  drop best_sale;
run;

When I'm running the above code, it does not consider revtype because the do until loop is running for year only. But I want the yearly best sale for both revtypes, "cargo" and "passenger".

2

2 Answers

0
votes

With regards to getting the two largest values from your array, try the ordinal function:

data year;
set similar;
array k(*) jan--dec;
do i = 1 to dim(k);
  do j = 0 to 1;
    if k[i] = ordinal(dim(k) - j, of k[*]) then do;
      best_sale = vname(k[i]);
      output year;
    end;
  end;
end;
run;
0
votes

I am not sure I get the entire picture, but this code might get you a step further. Please provide additional details if this is not the output you would like to produce. (Wasn't sure if you wanted to have tied best sales show up on multiple lines in the final step, and whether you had, in your similar dataset, several entries having the same year and revtype -- in which case providing a more complete sample dataset would prove useful).

data have;
  informat year 8. revtype $24. jan feb mar apr may jun jul aug sep oct nov dec 8.;
  input year revtype jan--dec;
  datalines;
1997 cargo 100 50 125 200 . . . . . . . . .
1997 passenger 200 256 235 256 . . . . . . . .
1998 cargo 300 111 435 109 . . . . . . . .
;

data want;
  set have;
  format max_sales 8. best_months $24.;
  array month(12) jan--dec;

  max_sales = max(of month(*));

  do _i = 1 to 12;
    if month(_i) = max_sales then 
      best_months = catx(",", best_months, vname(month(_i)));
  end;

  drop max_sales _i;
run;

Results

Resulting dataset