
I have been struggling with this since yesterday and have already gone through a lot of material and a number of answers on Stack Overflow. I've also written some base code, which I've pasted below.

background

We have units data for drugs, which can be combined into a combination called a regimen. For example, drug1+drug2 would be regimen 1, drug1+drug2+drug3 would be regimen 2 and drug1+drug2+drug4 would be regimen 3. Our ultimate objective is to find the number of patients on each regimen. This can only be done by finding the percentage contribution (called patient share) of each regimen to the market; we can't calculate it directly from units because the same drug is used across multiple regimens.

basically

units = patientshare * dosing * compliance * duration of therapy * total patients

Here units, dosing, duration of therapy and total patients are known; compliance and patient share will be bounded variables.

My problem is that the variables and constraints are at different levels.

  • Units is at drug level (and month);

  • dosing is at drug level;

  • compliance is at drug level;

  • duration of therapy is at regimen level;

  • patient share is at regimen level
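
With explicit indices, the relationship above might be written as follows. This is only a sketch under assumptions taken from the code below rather than from the description: patient share is indexed by month m and regimen r (as in ps[M,R]), duration of therapy is stored per drug d and regimen r (as in DOT{drug, regimen}, taken to be 0 when the drug is not part of the regimen), and N_m stands for the total patients in month m. The sum over regimens is also an assumption, reflecting that one drug contributes units through every regimen that contains it.

```latex
\[
\text{units}_{m,d} \;\approx\;
\text{dosing}_{d}\,\cdot\,\text{compliance}_{d}\,\cdot\,N_{m}
\sum_{r \in \text{REGIMEN}} \text{ps}_{m,r}\,\text{dot}_{d,r},
\qquad
\sum_{r \in \text{REGIMEN}} \text{ps}_{m,r} = 1 \ \text{ for each month } m .
\]
```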

This is my code. I would appreciate it if someone could tell me where I'm going wrong (I suspect it is in the arrays).

   PROC OPTMODEL;


   SET <STRING> DRUG;
   SET <STRING> REGIMEN;
   SET <STRING> MONTH;

   NUMBER DOSING{DRUG};
   READ DATA DRUG_DATA INTO DRUG=[DRUG] DOSING;
   /*PRINT DOSING;*/

   NUMBER COMPLIANCE{DRUG};
   READ DATA DRUG_DATA INTO DRUG=[DRUG] COMPLIANCE;
   /*PRINT COMPLIANCE;*/

   NUMBER DOT{drug, regimen};
   READ DATA REGIMEN INTO drug=[drug] 
   {R in regimen}< DOT [drug, R]=col(R)>;
   PRINT DOT;

   NUMBER UNITS{MONTH, DRUG};
   READ DATA DATASET INTO MONTH=[MONTH]
   {D IN DRUG}< UNITS[MONTH, D]=COL(D)>;
   /*PRINT UNITS;*/

     NUMBER RATIO{MONTH};
    READ DATA RATIO_1 INTO MONTH=[MONTH] RATIO;
   /*PRINT RATIO;*/

   /*DEFINE THE PARAMETERS*/

   var ps {MONTH,DRUG} init 0.1 >=0 <=1,
   annualpatients init 7000 <=7700 >=6300, 
  compliance init 0.1 >=0.3 <=0.8,
    DOSING[RIB] INIT 5 >=6 <=4;

   /*SET THE OBJECTIVE*/
    min sse = sum{M IN MONTH}( (units[M,D in drug]-(ps[M,R IN     REGIMEN]*annualpatients*ratio[M]*dosing[D]*compliance[D]*dot[R]*7 ))**2 );


   /*SET THE CONSTRAINTS*/
   constraint  MONTHLY_patient_share {M IN MONTH}:  sum{r is regimen}(ps[R IN REGIMEN])=1;
   constraint  total_patients sum{M in months, r in regimen} : ps[m,r in regimen]*annualpatients*ratio[m]=annual_patients;

 expand;
 solve with nlpc;
 quit;

And here's the log:

       2824  PROC OPTMODEL;
       2825
2826  /*DEFINE THE DATA LEVELS (SETS) OF DRUGS, MONTH AND REGIMEN*/
2827
2828  SET <STRING> DRUG;
2829  SET <STRING> REGIMEN;
2830  SET <STRING> MONTH;
2831
2832  NUMBER DOSING{DRUG};
2833  READ DATA DRUG_DATA INTO DRUG=[DRUG] DOSING;
NOTE: There were 4 observations read from the data set WORK.DRUG_DATA.
2834  /*PRINT DOSING;*/
2835
2836  /*NUMBER COMPLIANCE{DRUG};*/
2837  /*READ DATA DRUG_DATA INTO DRUG=[DRUG] COMPLIANCE;*/
2838  /*PRINT COMPLIANCE;*/
2839
2840  NUMBER DOT{drug, regimen};
2841  READ DATA REGIMEN INTO drug=[drug]
2842  {R in regimen}< DOT [drug, R]=col(R)>;
ERROR: The symbol 'REGIMEN' has no value at line 2842 column 7.
2843  PRINT DOT;
ERROR: The symbol 'REGIMEN' has no value at line 2840 column 18.
2844
2845  NUMBER UNITS{MONTH, DRUG};
2846  READ DATA DATASET INTO MONTH=[MONTH]
2847  {D IN DRUG}< UNITS[MONTH, D]=COL(D)>;
NOTE: There were 12 observations read from the data set WORK.DATASET.
2848  /*PRINT UNITS;*/
2849
2850  NUMBER RATIO{MONTH};
2851  READ DATA RATIO_1 INTO MONTH=[MONTH] RATIO;
NOTE: There were 12 observations read from the data set WORK.RATIO_1.
2852  /*PRINT RATIO;*/
2853
2854  /*DEFINE THE PARAMETERS*/
2855
2856  var ps {MONTH,DRUG} init 0.1 >=0 <=1,
2857      annualpatients init 7000 <=7700 >=6300,
2858      compliance init 0.1 >=0.3 <=0.8,
2859      DOSING[RIB] INIT 5 >=6 <=4;
                -
                22
                200
          ------
          528
ERROR 22-322: Syntax error, expecting one of the following: ;, ',', <=, >=, BINARY, INIT,
              INTEGER, {.

ERROR 200-322: The symbol is not recognized and will be ignored.

ERROR 528-782: The name 'DOSING' is already declared.

2860
2861  /*SET THE OBJECTIVE*/
2862  min sse = sum{M IN MONTH}( (units[M,D in drug]-(ps[M,R IN
                                          -        -         --        -
-
-
-
                                          537      651       631       651
537
537
648
ERROR 537-782: The symbol 'D' is unknown.

ERROR 651-782: Subscript 2 must be a string, found a number.

ERROR 648-782: The subscript count does not match array 'DOT', 1 NE 2.

                                            --
-
                                            631
647
ERROR 631-782: The operand types for 'IN' are mismatched, found a number and a set<string>.

ERROR 647-782: The name 'compliance' must be an array.

2862! min sse = sum{M IN MONTH}( (units[M,D in drug]-(ps[M,R IN
                                                           -
-
-
                                                           537
651
537
2862! REGIMEN]*annualpatients*ratio[M]*dosing[D]*compliance[D]*dot[R]*7 ))**2 );
ERROR 537-782: The symbol 'R' is unknown.

ERROR 651-782: Subscript 1 must be a string, found a number.

2863
2864
2865  /*SET THE CONSTRAINTS*/
2866  constraint  MONTHLY_patient_share {M IN MONTH}:  sum{r in regimen}(ps[R IN REGIMEN])=1;
                                                                                        -
                                                                                        648
ERROR 648-782: The subscript count does not match array 'ps', 1 NE 2.

2867  constraint  total_patients sum{M in months, r in regimen} : ps[m,r in
                                 ---
                                 22
                                 76
2867! regimen]*annualpatients*ratio[m]=annual_patients;
ERROR 22-322: Syntax error, expecting one of the following: !!, (, *, **, +, -, .., /, :, <=, <>,
              =, ><, >=, BY, CROSS, DIFF, ELSE, INTER, SYMDIFF, TO, UNION, [, ^, {, ||.

ERROR 76-322: Syntax error, statement will be ignored.

2868
2869       expand;
NOTE: Previous errors might cause the problem to be resolved incorrectly.
ERROR: The constraint 'MONTHLY_patient_share' has an incomplete declaration.
NOTE: The problem has 50 variables (0 free, 0 fixed).
NOTE: The problem has 0 linear constraints (0 LE, 0 EQ, 0 GE, 0 range).
NOTE: The problem has 0 nonlinear constraints (0 LE, 0 EQ, 0 GE, 0 range).
NOTE: Unable to create problem instance due to previous errors.
2870       solve with nlpc;
ERROR: No objective has been specified at line 2870 column 6.
2871       quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE OPTMODEL used (Total process time):
      real time           0.07 seconds
      cpu time            0.07 seconds
Comments:

Can you post the error you're getting? It'll give us an easier place to start looking. - Stu Sztukowski

@StuSztukowski, Hi, thanks for looking into it. I've actually made some progress with the code, so that I can now run it without any errors, but I don't think it is really 'running': proc optmodel goes on and on (more than 10 minutes), while the same exercise in Excel took less than 3 minutes. I've added the new code above. - Deepthi Pullarkat

Well then you've made good progress! I can assure you that optmodel is indeed running, but something is misspecified in either your constraint or your objective function. When optmodel takes a long time to run, it generally means it is doing a huge amount of calculation in every iteration. This can come from an extremely complex problem, or from something not being summed correctly. Usually it's the latter. In your case, I would check that your objective function is specified properly. When you hit stop, check the log. It will tell you how many iterations were done, which can help too. - Stu Sztukowski

I've now added the edited code (corrected for a logical mistake) and it's throwing errors! - Deepthi Pullarkat

Hi @DeepthiPullarkat, thanks for marking the answer correct, but it would be helpful if you also "rolled back" the question. Otherwise the answer isn't really correct. To roll back the version, click "Edit" below the question, then find the revisions in the combo box labeled "Rev". Don't forget to copy what you currently have into a new question, if that is still open. - Leo

1 Answer


The first error (on log line 2842) happens because OPTMODEL doesn't have the data for set REGIMEN when you refer to it.

I can't see your datasets, but it appears that you have a DRUG-by-REGIMEN matrix. So you need to give OPTMODEL the members of REGIMEN before using the set to read the columns. Here are two efficient ways to read the column names into OPTMODEL:

/* a simple example matrix with numbers and their squares */
data m_by_n (drop=i);
    do i = 1 to 3;
        n = i;
        n_square = i * i;
        output;
    end;
run;

/* get the variable names in the `name` column, plus other information */
proc contents data=m_by_n out=contents_of_m_by_n; run;

proc optmodel;
    set      ROWS;
    set<str> COLS;
    num val{ROWS,COLS};

    /* use the output from PROC CONTENTS */
    read data contents_of_m_by_n  into COLS=[name];
    read data m_by_n              into ROWS=[_N_] 
        { j in COLS }< val[_N_,j] = col( j ) >; 
    put val[*]=;

    /* Or, all within OPTMODEL */
    num dsid init open('m_by_n');
    /* varname() returns strings, so the set must be declared as set<str> */
    set<str> COLS2 = setof{i in 1 .. attrn(dsid,'nvars')} varname(dsid,i);
    read data m_by_n              into ROWS=[_N_] 
        { j in COLS2 }< val[_N_,j] = col( j ) >; 
    put val[*]=;
quit;
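
Applied to the data in the question, the first approach might look roughly like the sketch below. It has not been run against the real data and rests on assumptions: that WORK.REGIMEN has a key column named drug plus one numeric column per regimen, and that no other columns are present. The data set name regimen_cols is made up for the illustration.

```sas
/* Assumption: WORK.REGIMEN = key column DRUG + one numeric column per regimen */
proc contents data=regimen out=regimen_cols(keep=name) noprint;
run;

proc optmodel;
    set<str> DRUG;
    set<str> REGIMEN;
    num dot{DRUG, REGIMEN};

    /* Populate REGIMEN first, skipping the key column itself */
    read data regimen_cols(where=(upcase(name) ne 'DRUG'))
        into REGIMEN=[name];

    /* REGIMEN now has a value, so it can drive the column iteration */
    read data regimen into DRUG=[drug]
        {r in REGIMEN} <dot[drug, r] = col(r)>;

    print dot;
quit;
```

The point is only the ordering: REGIMEN is filled by its own READ DATA statement before any statement iterates over it, which is what the error on log line 2842 is complaining about.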