3
votes

I am very familiar with the SAS programming environment and am currently trying to learn to program in R. I have found that SAS macros reduce the amount of repetitive code in my programs. In particular, I find it useful to build parts of dataset names and variable names from macro variables. However, I haven't found anything in R that replicates this.

For example, in SAS I could write a simple macro to run proc means on two datasets like this:

%macro means(dataset_suffix = , var1_suffix= );
proc means data = data&dataset_suffix;
var var1&var1_suffix;
run;
%mend means;
%means(dataset_suffix = _suf1, var1_suffix = _suf2);
%means(dataset_suffix = _suf3, var1_suffix = _suf4);

Running this code executes the macro twice, which results in the following code being run:

proc means data = data_suf1;
var var1_suf2;
run;
proc means data = data_suf3;
var var1_suf4;
run;

I have looked into R's user-defined functions as well as using lists. I know there isn't a procedure in R that is directly comparable to proc means. However, the focus of my question is how to use something like macro variables to reference different objects in R that have similar prefixes but different suffixes. I have also considered using the paste function. Any help with this would be most appreciated.

1
I would take a look at the names attribute of a data frame. Something like names(data) <- paste0("suffix", names(data2)) should achieve the sort of thing you're after. - Steph Locke
Thank you for this thought. I'll look into it. - user1500158

1 Answer

8
votes

It always takes some adjustment coming from a macro-heavy language (SAS or Stata) to one that has real variables (R). In the end, you'll find that real variables are more powerful and less error-prone.

Just about everything in R is a first-class object. And a list can store just about any object. That means you can have lists of model objects, data.frames, whatever you want.

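# Two data frames stored in one named list
datasets <- list( one=data.frame(x=runif(100),y=runif(100) ), two=data.frame(x=runif(100),y=runif(100) ) )
# Fit a model to a single element, picked out by name
lm(y~x, data=datasets$one)
# Fit the same model to every element of the list
modelList <- lapply( datasets, function(dat) lm(y~x, data=dat) )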

Returns a list of model results:

> modelList
$one

Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)            x  
    0.46483      0.06038  


$two

Call:
lm(formula = y ~ x, data = dat)

Coefficients:
(Intercept)            x  
    0.48379      0.00948  

Which you can then operate on:

> sapply(modelList,coef)
                   one         two
(Intercept) 0.46482610 0.483785135
x           0.06038169 0.009480099
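
The same idea covers the proc means macro from the question. Here is a minimal sketch, not a definitive translation: the list my_data, the data frame names (data_suf1, data_suf3), the column names (var1_suf2, var1_suf4) and the helper means_by_suffix are all made up for illustration, and summary() only stands in for proc means. The point is that the suffixes are ordinary character strings, pasted together and used to index a list with [[ ]].

```r
# Hypothetical data mirroring the names used in the question
my_data <- list(
  data_suf1 = data.frame(var1_suf2 = c(1, 2, 3, 4)),
  data_suf3 = data.frame(var1_suf4 = c(10, 20, 30))
)

# Hypothetical helper playing the role of %means
means_by_suffix <- function(dataset_suffix, var1_suffix, data_list = my_data) {
  dat <- data_list[[paste0("data", dataset_suffix)]]  # look up the data frame by name
  x <- dat[[paste0("var1", var1_suffix)]]             # look up the column by name
  summary(x)                                          # stand-in for proc means
}

# Equivalent of the two macro calls
res1 <- means_by_suffix("_suf1", "_suf2")
res2 <- means_by_suffix("_suf3", "_suf4")

# Or run every suffix pair in one call and collect the results in a list
all_res <- Map(means_by_suffix, c("_suf1", "_suf3"), c("_suf2", "_suf4"))
```

Because the data frames live in a list rather than as separate objects in the workspace, there is no need to build object names as text and evaluate them.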

Starting to see the power yet? :-)

You could do the same thing with loops, but the *apply functions save you a lot of book-keeping code.
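
For comparison, here is roughly what the lapply() call above looks like as an explicit loop. This is only a sketch: it rebuilds a small datasets list (with fresh random data) so that it runs on its own, and the name modelListLoop is mine. The lines that create the empty list, name it, and fill it element by element are the book-keeping that lapply() handles for you.

```r
# Rebuild a list of data frames like the one above (new random data)
datasets <- list(
  one = data.frame(x = runif(100), y = runif(100)),
  two = data.frame(x = runif(100), y = runif(100))
)

# Loop version: create the container, name it, then fill it by name
modelListLoop <- vector("list", length(datasets))
names(modelListLoop) <- names(datasets)
for (nm in names(datasets)) {
  modelListLoop[[nm]] <- lm(y ~ x, data = datasets[[nm]])
}

# lapply version: one line, element names carried over automatically
modelList <- lapply(datasets, function(dat) lm(y ~ x, data = dat))

# Both approaches give the same coefficients
all.equal(sapply(modelListLoop, coef), sapply(modelList, coef))
```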