1
votes

I have a set of multiple choice responses from a survey with 45 questions, and I've placed the correct responses as my first observation in the dataset.

In my DATA step I would like to set values to 0 or 1depending on whether the variable in each observation matches the same variable in the first observation, I want to replace the response letter (A-D) with the 0 or 1 in the dataset, how do I go about doing that comparison?

I'm not doing any grouping, so I believe I can access the first row using First.x, but I'm not sure how to compare that across each variable(answer1-answer45).

| Id  | answer1 | answer2 | ...through answer 45
|:-------------|---------:|
| KEY |      A |        B |    
| 2   |      A |        C |
| 3   |      C |        D |   
| 4   |      A |        B |    
| 5   |      D |        C |  
| 6   |      B |        B |

Should become:

| Id  | answer1 | answer2 | ...through answer 45
|:-------------|---------:|
| KEY |      A |        B |    
| 2   |      1 |        0 |
| 3   |      0 |        0 |   
| 4   |      1 |        1 |    
| 5   |      0 |        0 |  
| 6   |      0 |        1 |

Current code for reading in the data:

DATA TEST(drop=name fill answer0);
   INFILE SCORES DSD firstobs=2;
   length id $4;
   length answer1-answer150 $1;
   INPUT name $ fill id $ (answer0-answer150) ($);
RUN;

Thanks in advance!

2

2 Answers

2
votes

Here's how I might do it. Create a data set to PROC COMPARE the KEY to the observed. Then you have X for not matching key and missing for matched. You can then use PROC TRANSREG to score the 'X.' to 01. PROC TRANSREG also creates macro variables which contain the names of the new variables and the number.

From log NOTE: _TRGINDN=2 _TRGIND=answer1D answer2D

data questions;
   input id:$3. (answer1-answer2)(:$1.);
   cards;
 KEY A  B 
 2   A  C 
 3   C  D 
 4   A  B 
 5   D  C 
 6   B  B 
;;;;
   run;
data key;
   if _n_ eq 1 then set questions(obs=1);
   set questions(keep=id firstobs=2);
   run;
proc compare base=key compare=questions(firstobs=2) out=comp outdiff noprint;
   id id;
   run;
options validvarname=v7;
proc transreg design data=comp(drop=_type_ type=data);
   id id;
   model class(answer:) / noint;
   output out=scored(drop=intercept _:);
   run;
%put NOTE: &=_TRGINDN &=_TRGIND;

enter image description here

1
votes

I don't have my SAS license here at home, so I can't actually test this code. I'll give it me best shot, though ...

First, I'd keep my correct answers in a separate table, and then merge it with the answers from the respondents. That also makes the solution scalable, should you have more multiple choice solutions and answers in the same table, since you'd be joining on the assignment ID as well.

Now, import all your correct answers to a table answers_correct with column names answer_correct1-answer_correct45.

Then, merge the two tables and determine the outcome for each question.

DATA outcome;
    MERGE answers answers_correct;
    * We will not be using any BY.;
    * If you later add more questionnaires, merge BY the questionnaire ID;
    ARRAY answer(*) answer1-answer45;
    ARRAY answer_correct(*) answer_correct1-answer_correct45;
    LENGTH result1-result45 $1;
    ARRAY result(*) result1-result45;
    DROP i;
    FOR i = 1 TO DIM(answer);
        IF answer(i) = answer_correct(i) THEN result(i) = '1';
        ELSE result(i) = '0';
    END;
RUN;