I'm trying to model a binary outcome (p1ODD) on binary predictor variables (c1kdscc3, c1kdscc4 and c1kdscc5). When I run PROC GENMOD, the log reports an invalid reference value for c1kdscc3. It also says there are no valid observations because of invalid values in the response variable, even though I defined everything earlier in my code.

Here is the problematic code that appears before PROC GENMOD section:

PROC FORMAT;
Value c1kdscc_binfmt 
0 = "[3,4,5] Often or more (Ref)" 
1 = "[1,2] Never/Seldom"; 
Value p1ODD_binfmt 
0 = "Negative (Ref)"
1 = "Positive";
RUN;


TITLE "Logistic Regression Using PROC GENMOD";
PROC GENMOD DATA=MY;
CLASS c1kdscc3 (REF= "Often or more (Ref)") / PARAM = ref;
MODEL p1ODD = c1kdscc3 / DIST= binomial LINK=log SCALE=1;
RUN; QUIT; 

Would anyone know if I should fix how I define my reference values for c1kdscc3 to c1kdscc5 and how best to re-write my response variable to work in PROC GENMOD?

Sample Data: 
    Age     p1ODD       c1kdscc3       c1kdscc4       c1kdscc5
    12      Positive    Very Often     Always         Always
    16      Positive    Seldom         Quite Often    Seldom
    14      Negative    Very Often     Always         Seldom
    17      Negative    Quite Often    Seldom         Very Often
    13      Negative    Quite Often    Quite Often    Seldom
    17      Negative    Quite Often    Quite Often    Never

Log and error messages:

172        /*Analysis using GENMOD*/
 173        
 174        
 175        TITLE "Logistic Regression Using Proc GENMOD";
 176        PROC GENMOD DATA=MY;
 177        CLASS c1kdscc3 (REF= "Often or more (Ref)") / PARAM = ref;
 178        MODEL p1ODD = c1kdscc3 / DIST= binomial LINK=log SCALE=1;
 179        RUN;


 ERROR: Invalid reference value for c1kdscc3.
 ERROR: No valid observations due to invalid or missing values in the response, explanatory, offset, frequency, or weight variable.
 NOTE: The SAS System stopped processing this step because of errors.

Thanks!

Please add a few lines of sample data to help with debugging. - DomPazz
You're missing a quotation mark in your PROC FORMAT statement for the 1, which may have caused issues with applying the format correctly. Note also that the text needs to match exactly, and in your code it doesn't. - Reeza
@DomPazz I added some of the sample data. Thanks! - Irina Oltean
@Reeza The original code had the quotation marks. I think the problem is something else. Thanks! - Irina Oltean
Add the log and the actual error messages you're getting as well, then. I can only comment on what we can see here. - Reeza

1 Answer

It's the mismatch between your formatted value in PROC FORMAT and the value you specify in the CLASS statement that causes the issue. I can replicate the problem and error with the code below.

Fix it by changing:

CLASS c1kdscc3 (REF= "Often or more (Ref)") / PARAM = ref;

to match your format:

0 = "[3,4,5] Often or more (Ref)" 

So the final code should look like:

CLASS c1kdscc3 (REF= "[3,4,5] Often or more (Ref)") / PARAM = ref;
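
Before changing the REF= value, it can help to confirm what levels the procedure will actually see. The following is a minimal sketch, assuming the data set is named MY as in the question; it only inspects the data and does not change anything:

```sas
/* List variable types and any formats already attached to the variables */
proc contents data=MY;
run;

/* Show the values of the predictor and the response as they are displayed,
   including missing values; REF= must match one of these displayed levels */
proc freq data=MY;
   tables c1kdscc3 p1ODD / missing;
run;
```

If the levels printed by PROC FREQ do not include the exact text used in REF=, the "Invalid reference value" error is to be expected.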

Code to replicate the issue, if desired. Note that I had to use a different data set because we can't run your code without your data:

proc format ;
value $ myBrand_fmt
'ice1' = 'Ice #1'
'ice2' = 'Ice #2'
'ice3' = 'Ice #3';
run;

data Icecream;
   input count brand$ taste$;
   datalines;
70  ice1 vg
71  ice1 g
151 ice1 m
30  ice1 b
46  ice1 vb
20  ice2 vg
36  ice2 g
130 ice2 m
74  ice2 b
70  ice2 vb
50  ice3 vg
55  ice3 g
140 ice3 m
52  ice3 b
50  ice3 vb
;

proc genmod data=Icecream rorder=data;
   freq count;
   class brand (ref='#1');  /* deliberate mismatch: the formatted value is 'Ice #1' */
   format brand $mybrand_fmt.;
   model taste = brand / dist=multinomial
                         link=cumlogit
                         aggregate=brand
                         type1;

run;

You've also specified format definitions but not applied them, and your data doesn't appear to match those definitions, so I'm not sure what to say about that. That's more than can be covered in a single question or answer. You may want to back up several steps: first get your data set up properly and your formats working, then move on to PROC GENMOD.
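
As a rough illustration of what "getting the data set up" could look like, here is a sketch under several assumptions that are not confirmed by the question: that MY stores the raw character values shown in the sample data, that "Never" and "Seldom" form the [1,2] group, and that the formats from the question's PROC FORMAT step have already been created. The names my_recoded, p1ODD_bin and c1kdscc3_bin are hypothetical.

```sas
/* Assumption: p1ODD and c1kdscc3 are character variables in MY */
data my_recoded;
   set MY;

   /* Recode the response to 0/1, keeping missing values missing */
   if missing(p1ODD) then p1ODD_bin = .;
   else p1ODD_bin = (p1ODD = "Positive");

   /* Recode the predictor: 1 = Never/Seldom, 0 = Often or more */
   if missing(c1kdscc3) then c1kdscc3_bin = .;
   else c1kdscc3_bin = (c1kdscc3 in ("Never", "Seldom"));

   /* Attach the numeric formats defined in PROC FORMAT */
   format p1ODD_bin p1ODD_binfmt. c1kdscc3_bin c1kdscc_binfmt.;
run;

/* DESCENDING is intended to make "Positive" the modeled level */
proc genmod data=my_recoded descending;
   /* REF= must match the formatted value exactly */
   class c1kdscc3_bin (ref="[3,4,5] Often or more (Ref)") / param=ref;
   model p1ODD_bin = c1kdscc3_bin / dist=binomial link=log;
run;
```

The comparisons are case-sensitive, so the quoted strings need to match the stored values exactly. If the variables are already numeric 0/1, the DATA step is unnecessary and a FORMAT statement inside PROC GENMOD should be enough.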