
I'm looking for a way to use a normal variable value as a macro variable in a data step.

For example I have macro variable &statesList_Syphilis = AAA
and another macro variable &statesList_Giardia = BBB

And in a data step I have a variable Germ which contains 2 rows: "Syphilis" and "Giardia".

In my data step I need to find AAA when iterating over the first row when Germ="Syphilis"
and BBB when iterating over the second row, when Germ="Giardia"

An attempt would look like this:

%let statesList_Syphilis = AAA;
%let statesList_Giardia = BBB;

data test;
    set mytablewithgerms; * contains variable Germ ;

    * use germ and store it in &germ macro variable ;
    * something like  %let germ = germ; or call symput ('germ',germ);

    * I want to be able to do this;
    xxx = "&&statesList_&germ"; * would give xxx = "AAA" or xxx = "BBB";

    * or this;
    &&statesList_&germ = "test"; * would give AAA = "test" or BBB = "test";

    run;

I don't think this is possible, but I figured I would ask just to be sure.

Thanks!


EDIT (Following questions in the comments, I'm adding context to my specific problem, but I feel this is making things more complicated):

This was an attempt to simplify the problem.

In reality AAA and BBB are long lists of words like

"asymptomatic_1 fulminant_1 chronic_1 chronic_1 fatalFulminant_1 hepatocellular_1 compensated_1 hepatocellular_2 decompensated_1 fatalHepatocellular_1 fatalHepatocellular_2 fatalDecompensated_1"

And I don't want to store this long string in a variable. I want to iterate over each word of this string in a do loop with something like:

    %do k=1 %to %sysfunc(countw(&&statesList_&germ));
        %let state = %scan(&&statesList_&germ, &k);
        * some other code here ;
    %end;

EDIT2:
here is a more complete view of my problem:

%macro dummy();

data DALY1;
    * set lengths ;
    length Germ $10 Category1 $50 Category2 $50 AgeGroupDALY $10 Gender $2 value 8 stateList$999;

    * make link to hash table ;
    if _n_=1 then do;

        *modelvalues ----------------;
        declare hash h1(dataset:'modelData');
        h1.definekey ('Germ', 'Category1', 'Category2', 'AgeGroupDALY', 'Gender') ;
        h1.definedata('Value');
        h1.definedone();
        call missing(Germ, Value, Category1, Category2);
        * e.g.
          rc=h1.find(KEY:Germ, KEY:"ssssssssss", KEY:"ppppppppppp", KEY:AgeGroupDALY, KEY:Gender);

        *states ---------------------;
        declare hash h2(dataset:'states');
        h2.definekey ('Germ') ;
        h2.definedata('stateList');
        h2.definedone();

    end;

    set DALY_agregate;

    put "°°°°° _n_=" _n_;

    DALY=0; * addition of terms ;



    rc2=h2.find(KEY:Germ); * this creates the variable statesList;

    put "statesList =" statesList;

    * here i need statesList as a macro variable,;

    %do k=1 %to %sysfunc(countw(&statesList)); *e.g. acute_1 asymptomatic_1 ...;
        %let state = %scan(&statesList, &k);
        put "=== &k &state";
        &state = 1; * multiplication of terms ;

        * more code here;
    %end;


run;
%mend dummy;
%dummy;

EDIT3:
The input dataset looks like this

Germ    AgeGroup1 AgeGroup2 Gender Cases    Year
V_HBV   15-19   15-19   M   12  2015
V_HBV   15-19   15-19   M   8   2016
V_HBV   20-24   20-24   F   37  2011
V_HBV   20-24   20-24   F   46  2012
V_HBV   20-24   20-24   F   66  2013

The output dataset will add variables contained in the string defined by the macro variable which depends on the Germ.

e.g. for V_HBV it will create these variables: asymptomatic_1 fulminant_1 chronic_1 chronic_1 fatalFulminant_1 hepatocellular_1 compensated_1 hepatocellular_2 decompensated_1 fatalHepatocellular_1 fatalHepatocellular_2 fatalDecompensated_1

How many observations does your dataset have? You could solve this with two data steps: build a macro variable for every observation in the first data step, then iterate over all the macro variables in a macro loop in the second data step. This only makes sense when your data is not too big. Also, I guess this is only a simplified example, because otherwise you could solve this without a macro variable? - kl78
Can you provide example "have" and "want" datasets? I'm guessing you have a wide dataset with many states in each of those two rows? - Sean
Updated the question with more code and perspective. - stallingOne
There are about 4000 observations (in DALY_agregate), with ~40 unique germs and ~10 states per germ. - stallingOne
Still hard to know what you're after. Could you please clarify what your inputs and outputs are (what is already in your input table, what comes from pre-existing macro variables if any, and what your output dataset should look like)? - Dominic Comtois

1 Answers


I'm not following the big picture, but one of the previous iterations of your question had some code (pseudo code) that illustrates possible confusion about how the macro language works. Consider this step:

data _null_;
  germ="Syph";  
  call symput('germ',germ);
  %let Germ=%sysfunc(cats(germ));
  put "germ = &germ";
run;
%put &germ;

The log from executing that in a fresh SAS session shows:

1    data _null_;
2      germ="Syph";
3      call symput('germ',germ);
4      %let Germ=%sysfunc(cats(germ));
5      put "germ = &germ";
6    run;

germ = germ

7    %put &germ;
Syph

Now let's talk about what's happening. I'll use the line numbers from the log.

Line 2 assigns text string Syph to data step variable germ. Nothing special.

Line 3 creates a macro variable named Germ and assigns it the value of the data step variable germ, i.e. Syph. This CALL SYMPUT statement executes when the data step executes.

Line 4 is a macro %let statement. It creates a macro variable named Germ and assigns it the value germ. Because this is a macro statement, it executes before any of the DATA step code has executed. It does not know about data step variables. Line 4 is equivalent to %let Germ=germ. To the macro language, the right-hand side is just the four-character string germ. It is not the name of a data step variable. %sysfunc(cats()) is doing nothing, because there is only one item, so there is nothing to concatenate.

Line 5 is a data step PUT statement. The macro reference &germ is resolved while the data step is compiling. At this point the macro variable Germ resolves to germ, because the %LET statement has executed (the CALL SYMPUT statement has not executed yet). That is why the log shows germ = germ.

Line 7 is a %PUT statement that executes after the DATA _NULL_ step has completed (and after the CALL SYMPUT has written the value Syph to macro variable Germ).

As a general principle, it is difficult (and unusual) to have a single data step in which you are using data to create a macro variable (e.g. via call symput) and using that macro variable in the same step (i.e. referencing the macro variable). Macro references are resolved before any of the data step code executes.
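One way around this, if all you need is the value of a macro variable whose name depends on the data, is the SYMGET function. It looks the macro variable up while the data step is executing rather than while it is compiling. The following is only a sketch built on the %let statements from your question; the in-line DO loop over the two germ names stands in for your real input table, and the variable names xxx, state and k are just illustrative.

```sas
%let statesList_Syphilis = AAA;
%let statesList_Giardia  = BBB;

data test;
    length Germ $10 xxx $999 state $32;
    /* stand-in for: set mytablewithgerms; */
    do Germ = "Syphilis", "Giardia";
        /* SYMGET runs at execution time, so the macro variable
           name can be built from the current value of Germ */
        xxx = symget(cats("statesList_", Germ));
        /* ordinary DO loop (not %DO) over the words in the list */
        do k = 1 to countw(xxx, " ");
            state = scan(xxx, k, " ");
            output;
        end;
    end;
run;
```

Note that this only covers the first half of what you asked for (getting the text into a variable). The second half, something like &&statesList_&germ = "test";, cannot work per row, because the names of the data step variables are fixed when the step is compiled, before any row has been read.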

Typically if your data are already in a dataset, you can get what you want with data step statements (DO loops rather than %DO loops, etc). Or alternatively you can use one DATA step to generate your macro variables, and a second DATA step can reference them.
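Here is a rough sketch of the two-step approach. I'm assuming a lookup table called states with one row per germ and a space-separated stateList, as in your EDIT2. The rows shown, the macro name initStates and the output dataset want are made up for illustration.

```sas
/* Hypothetical lookup table: one row per germ */
data states;
    length Germ $10 stateList $999;
    Germ = "V_HBV";   stateList = "asymptomatic_1 fulminant_1 chronic_1"; output;
    Germ = "Giardia"; stateList = "acute_1 asymptomatic_1";               output;
run;

/* Step 1: runs to completion, so CALL SYMPUTX has created
   statesList_V_HBV, statesList_Giardia, ... before step 2 compiles */
data _null_;
    set states;
    call symputx(cats("statesList_", Germ), stateList);
run;

/* Step 2: a macro %DO loop can now walk the list and generate
   one DATA step assignment per state */
%macro initStates(germ);
    %local k state;
    %do k = 1 %to %sysfunc(countw(&&statesList_&germ, %str( )));
        %let state = %scan(&&statesList_&germ, &k, %str( ));
        &state = 1;
    %end;
%mend initStates;

data want;
    Germ = "V_HBV";
    %initStates(V_HBV)
run;
```

The macro generates its assignments when the step is compiled, so the germ has to be known at that point (here it is passed as a parameter). If different rows need different sets of variables, you would call the macro once per germ, for example inside if/then blocks or in separate steps.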

Hope that helps.