In SAS there is a way to create a library (using LIBNAME). This is helpful because, during long data processing, we don't have to keep changing the dataset name. If we want to reuse a dataset name, we can put the dataset in a library. Even if two datasets have the same name, they sit in different libraries, so we can work on them together.
My question: is there any such option in R to create a library (or a separate folder within R) where we can save our data?
Here's the example:
Suppose I have a dataset "dat1". I summarize the variables var1 & var2 in dat1 by var3.
proc summary data=dat1 nway missing;
var var1 var2;
class var3;
output out=tmp.dat1 (drop = _freq_ _type_) sum = ;
run;
Then I merged dat1 with dat2, which is another dataset. Both dat1 & dat2 have a common variable, var3, on which I merged. I saved the result as dat1 again.
proc sql;
create table dat1 as
select a.*,b.*
from dat1 a left join tmp.dat2 b
on a.var3=b.var3;
quit;
Now I summarize dat1 again after the merge, to check whether the values of var1 & var2 are the same before & after merging.
proc summary data=dat1 nway missing;
var var1 var2;
class var3;
output out=tmp1.dat1 (drop = _freq_ _type_) sum = ;
run;
The equivalent code in R would be:
library(plyr)   # provides ddply() and summarise()
library(sqldf)  # provides sqldf()
dat3 <- ddply(dat1, .(var3), summarise, var1 = sum(var1, na.rm = TRUE), var2 = sum(var2, na.rm = TRUE))
dat1 <- sqldf("select a.*, b.* from dat1 a left join dat2 b on a.var3 = b.var3")
dat4 <- ddply(dat1, .(var3), summarise, var1 = sum(var1, na.rm = TRUE), var2 = sum(var2, na.rm = TRUE))
In SAS I used just 2 dataset names, but in R I'm using 4. So if I'm writing 4000 lines of data-processing code, having so many dataset names sometimes becomes overwhelming. In SAS it was easy to keep the same dataset name because I'm using 2 libraries, tmp and tmp1, in addition to the default work library.
In SAS, a library is defined as:
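One possible way to get something close to this in R is to keep objects in separate environments, so the same name can exist in several places at once. The sketch below is only an illustration: the environments tmp and tmp1 mirror the SAS library names, the toy data is made up, and base R's aggregate() and merge() stand in for ddply() and sqldf() so that it runs without extra packages.

```r
# Two environments play the role of the SAS libraries tmp and tmp1.
tmp  <- new.env()
tmp1 <- new.env()

# Toy data standing in for dat1 and dat2 (made up for illustration).
dat1 <- data.frame(var3 = c("a", "a", "b"),
                   var1 = c(1, 2, 3),
                   var2 = c(10, 20, 30))
dat2 <- data.frame(var3 = c("a", "b"),
                   var4 = c("x", "y"))

# Summary before the merge, stored as dat1 inside tmp.
# Note: aggregate() with a formula drops rows containing NA, which is
# not identical to sum(..., na.rm = TRUE); the toy data has no NAs.
tmp$dat1 <- aggregate(cbind(var1, var2) ~ var3, data = dat1, FUN = sum)

# Left join, overwriting dat1 in the global workspace.
dat1 <- merge(dat1, dat2, by = "var3", all.x = TRUE)

# Summary after the merge, stored as dat1 inside tmp1.
tmp1$dat1 <- aggregate(cbind(var1, var2) ~ var3, data = dat1, FUN = sum)

# Three objects share the name dat1: the working copy, tmp$dat1, tmp1$dat1.
same_totals <- isTRUE(all.equal(tmp$dat1, tmp1$dat1))
```

Unlike a SAS library, an environment lives only in memory, so it disappears when the R session ends unless it is saved.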
LIBNAME tmp "directory_path\folder_name";
In this folder, dat1 will be stored.
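A rough on-disk analogue in R would be to write each object to a file inside a folder, with the folder acting as the library. This is a sketch under assumptions: the folder location (a subfolder of tempdir()), the file name dat1.rds, and the toy data are all made up for illustration.

```r
# Hypothetical folder standing in for the LIBNAME directory.
tmp_dir <- file.path(tempdir(), "tmp")
dir.create(tmp_dir, showWarnings = FALSE, recursive = TRUE)

# Toy data standing in for dat1.
dat1 <- data.frame(var3 = c("a", "b"), var1 = c(3, 3))

# Store dat1 in the folder, similar in spirit to writing tmp.dat1 in SAS.
saveRDS(dat1, file.path(tmp_dir, "dat1.rds"))

# Read it back later under any name, leaving the working dat1 untouched.
dat1_from_tmp <- readRDS(file.path(tmp_dir, "dat1.rds"))
```

Because readRDS() returns the object instead of restoring it under its original name, the stored copy cannot accidentally overwrite the dat1 currently in the workspace.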
?save (?load to load them). – sgibb
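As a minimal sketch of what the comment points to: save() writes objects to a file under their own names, and load() restores them, optionally into a chosen environment. The file path, the environment name tmp, and the toy data below are assumptions for illustration.

```r
# Hypothetical file location for the saved object.
path <- file.path(tempdir(), "dat1.RData")

# Toy data standing in for dat1.
dat1 <- data.frame(var3 = c("a", "b"), var1 = c(3, 3))

# save() stores the object together with its name.
save(dat1, file = path)

# Change the working copy, then load the saved one into a separate
# environment so the two objects named dat1 do not collide.
dat1$var1 <- dat1$var1 * 2
tmp <- new.env()
load(path, envir = tmp)

# tmp$dat1 holds the saved values; dat1 holds the modified ones.
saved_sum   <- sum(tmp$dat1$var1)
working_sum <- sum(dat1$var1)
```

Calling load(path) without envir would restore dat1 into the current workspace and overwrite the modified copy.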