I have been struggling with this since yesterday and have already gone through a lot of material and a number of answers on Stack Overflow. I've also put together some base code, which I've pasted below.
background
We have units data for drugs, which can be combined into combinations called regimens. For example, drug1+drug2 would be regimen 1, drug1+drug2+drug3 would be regimen 2, and drug1+drug2+drug4 would be regimen 3. Our ultimate objective is to find the number of patients on each regimen. This can only be done by finding the percentage contribution (called patient share) of each regimen to the market; we can't calculate it directly from units because each drug is used across multiple regimens.
basically
units = patientshare * dosing * compliance * duration of therapy * total patients
where units, dosing, duration of therapy and total patients are known, and compliance and patient share are bounded variables.
My problem is that the variables and constraints are at different levels.
Units is at drug level (and month);
dosing is at drug level;
compliance is at drug level;
duration of therapy is at regimen level;
patient share is at regimen level
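To make the mismatch in levels concrete, here is how I think the relationship looks once indices are attached. This is only a sketch under assumptions: $m$ is a month, $d$ a drug, $r$ a regimen, $R(d)$ is a hypothetical name for the set of regimens that contain drug $d$, and $N_m$ stands for the total patients in month $m$. Summing over the regimens that share a drug is my assumption about how the regimen-level terms roll up to drug-level units.

```latex
\[
\text{units}_{m,d} \;\approx\; \sum_{r \in R(d)}
  \text{ps}_{m,r}\,\cdot\,\text{dosing}_{d}\,\cdot\,\text{compliance}_{d}\,\cdot\,\text{dot}_{r}\,\cdot\,N_{m},
\qquad
\sum_{r} \text{ps}_{m,r} = 1 \ \text{ for each month } m .
\]
```

The fit I am after would then minimize the squared difference between the two sides, summed over every month and drug.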
This is my code, and I would appreciate it if someone could tell me where I'm going wrong (I suspect it is in the arrays).
PROC OPTMODEL;
SET <STRING> DRUG;
SET <STRING> REGIMEN;
SET <STRING> MONTH;
NUMBER DOSING{DRUG};
READ DATA DRUG_DATA INTO DRUG=[DRUG] DOSING;
/*PRINT DOSING;*/
NUMBER COMPLIANCE{DRUG};
READ DATA DRUG_DATA INTO DRUG=[DRUG] COMPLIANCE;
/*PRINT COMPLIANCE;*/
NUMBER DOT{drug, regimen};
READ DATA REGIMEN INTO drug=[drug]
{R in regimen}< DOT [drug, R]=col(R)>;
PRINT DOT;
NUMBER UNITS{MONTH, DRUG};
READ DATA DATASET INTO MONTH=[MONTH]
{D IN DRUG}< UNITS[MONTH, D]=COL(D)>;
/*PRINT UNITS;*/
NUMBER RATIO{MONTH};
READ DATA RATIO_1 INTO MONTH=[MONTH] RATIO;
/*PRINT RATIO;*/
/*DEFINE THE PARAMETERS*/
var ps {MONTH,DRUG} init 0.1 >=0 <=1,
annualpatients init 7000 <=7700 >=6300,
compliance init 0.1 >=0.3 <=0.8,
DOSING[RIB] INIT 5 >=6 <=4;
/*SET THE OBJECTIVE*/
min sse = sum{M IN MONTH}( (units[M,D in drug]-(ps[M,R IN REGIMEN]*annualpatients*ratio[M]*dosing[D]*compliance[D]*dot[R]*7 ))**2 );
/*SET THE CONSTRAINTS*/
constraint MONTHLY_patient_share {M IN MONTH}: sum{r is regimen}(ps[R IN REGIMEN])=1;
constraint total_patients sum{M in months, r in regimen} : ps[m,r in regimen]*annualpatients*ratio[m]=annual_patients;
expand;
solve with nlpc;
quit;
And here's the log:
2824 PROC OPTMODEL;
2825
2826 /*DEFINE THE DATA LEVELS (SETS) OF DRUGS, MONTH AND REGIMEN*/
2827
2828 SET <STRING> DRUG;
2829 SET <STRING> REGIMEN;
2830 SET <STRING> MONTH;
2831
2832 NUMBER DOSING{DRUG};
2833 READ DATA DRUG_DATA INTO DRUG=[DRUG] DOSING;
NOTE: There were 4 observations read from the data set WORK.DRUG_DATA.
2834 /*PRINT DOSING;*/
2835
2836 /*NUMBER COMPLIANCE{DRUG};*/
2837 /*READ DATA DRUG_DATA INTO DRUG=[DRUG] COMPLIANCE;*/
2838 /*PRINT COMPLIANCE;*/
2839
2840 NUMBER DOT{drug, regimen};
2841 READ DATA REGIMEN INTO drug=[drug]
2842 {R in regimen}< DOT [drug, R]=col(R)>;
ERROR: The symbol 'REGIMEN' has no value at line 2842 column 7.
2843 PRINT DOT;
ERROR: The symbol 'REGIMEN' has no value at line 2840 column 18.
2844
2845 NUMBER UNITS{MONTH, DRUG};
2846 READ DATA DATASET INTO MONTH=[MONTH]
2847 {D IN DRUG}< UNITS[MONTH, D]=COL(D)>;
NOTE: There were 12 observations read from the data set WORK.DATASET.
2848 /*PRINT UNITS;*/
2849
2850 NUMBER RATIO{MONTH};
2851 READ DATA RATIO_1 INTO MONTH=[MONTH] RATIO;
NOTE: There were 12 observations read from the data set WORK.RATIO_1.
2852 /*PRINT RATIO;*/
2853
2854 /*DEFINE THE PARAMETERS*/
2855
2856 var ps {MONTH,DRUG} init 0.1 >=0 <=1,
2857 annualpatients init 7000 <=7700 >=6300,
2858 compliance init 0.1 >=0.3 <=0.8,
2859 DOSING[RIB] INIT 5 >=6 <=4;
-
22
200
------
528
ERROR 22-322: Syntax error, expecting one of the following: ;, ',', <=, >=, BINARY, INIT,
INTEGER, {.
ERROR 200-322: The symbol is not recognized and will be ignored.
ERROR 528-782: The name 'DOSING' is already declared.
2860
2861 /*SET THE OBJECTIVE*/
2862 min sse = sum{M IN MONTH}( (units[M,D in drug]-(ps[M,R IN
- - -- -
-
-
-
537 651 631 651
537
537
648
ERROR 537-782: The symbol 'D' is unknown.
ERROR 651-782: Subscript 2 must be a string, found a number.
ERROR 648-782: The subscript count does not match array 'DOT', 1 NE 2.
--
-
631
647
ERROR 631-782: The operand types for 'IN' are mismatched, found a number and a set<string>.
ERROR 647-782: The name 'compliance' must be an array.
2862! min sse = sum{M IN MONTH}( (units[M,D in drug]-(ps[M,R IN
-
-
-
537
651
537
2862! REGIMEN]*annualpatients*ratio[M]*dosing[D]*compliance[D]*dot[R]*7 ))**2 );
ERROR 537-782: The symbol 'R' is unknown.
ERROR 651-782: Subscript 1 must be a string, found a number.
2863
2864
2865 /*SET THE CONSTRAINTS*/
2866 constraint MONTHLY_patient_share {M IN MONTH}: sum{r in regimen}(ps[R IN REGIMEN])=1;
-
648
ERROR 648-782: The subscript count does not match array 'ps', 1 NE 2.
2867 constraint total_patients sum{M in months, r in regimen} : ps[m,r in
---
22
76
2867! regimen]*annualpatients*ratio[m]=annual_patients;
ERROR 22-322: Syntax error, expecting one of the following: !!, (, *, **, +, -, .., /, :, <=, <>,
=, ><, >=, BY, CROSS, DIFF, ELSE, INTER, SYMDIFF, TO, UNION, [, ^, {, ||.
ERROR 76-322: Syntax error, statement will be ignored.
2868
2869 expand;
NOTE: Previous errors might cause the problem to be resolved incorrectly.
ERROR: The constraint 'MONTHLY_patient_share' has an incomplete declaration.
NOTE: The problem has 50 variables (0 free, 0 fixed).
NOTE: The problem has 0 linear constraints (0 LE, 0 EQ, 0 GE, 0 range).
NOTE: The problem has 0 nonlinear constraints (0 LE, 0 EQ, 0 GE, 0 range).
NOTE: Unable to create problem instance due to previous errors.
2870 solve with nlpc;
ERROR: No objective has been specified at line 2870 column 6.
2871 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE OPTMODEL used (Total process time):
real time 0.07 seconds
cpu time 0.07 seconds
If optmodel is taking a long time to run, it generally means it's doing a huge amount of calculation in every iteration. This can come from an extremely complex problem, or from something not being summed correctly. Usually it's the latter. In your case, I would check that your objective function is specified properly. When you hit stop, check the log: it will tell you how many iterations it did, which can help too. – Stu Sztukowski