1
votes

I have two tables on SAS "Table_A" and "Table_A_Archive", as part my ETL process "Table_A" is created on a daily basis and the data should be archived on "Table_A_Archive". When the data is archived on "Table_A_Archive" a a flag will be created/updated "DT_FLAG".

On the first day this is how the table would look like

 "Table_A"
 | ID  | Load_Date
 ------ -------------
 | 100 | 01JUN2020:12:13:56

 "Table_A_Archive"
  | ID  | Load_Date           | DT_FLAG
  ------ --------------------- ---------
  | 100 | 01JUN2020:12:13:56  | 1

On Day 2

 "Table_A"
 | ID  | Load_Date
 ------ ------------
 | 101 | 02JUN2020:12:13:56

 "Table_A_Archive"
 | ID  | Load_Date           | DT_FLAG
 ------ --------------------- ---------
 | 100 | 01JUN2020:12:13:56  | 2
 | 101 | 02JUN2020:12:13:56  | 1

The new data should be loaded with the DT_FLAG of 1 and old records DT_FLAG should be incremented by 1. Load_Date is the pivotal point. I have written a SAS code but it seems a bit messy, can someone please help me with a SAS Datastep

 %macro Cntl_archive(table_name=,arch_table_name=);
 %GLOBAL WRK;

 %if %sysfunc(exist(&arch_table_name.))  %then %do;

 proc append base=&arch_table_name. data=&table_name. force;
 run;

 proc sql;
 Create table TEMP as 
 Select distinct Load_Date,Load_Date as WRK from &arch_table_name.
 order by Load_Date desc
 ;quit;

 proc rank data=TEMP descending out=TEMP;
 var WRK;
 ranks count;
 run;

 data &arch_table_name. (drop=DT_FLAG);
 set    &arch_table_name.;
 run;

 proc sql;
 Create table &arch_table_name. as
 Select T0.*,T1.count as DT_FLAG from &arch_table_name. T0
 inner join TEMP T1 on T0.Load_Date=T1.Load_Date
 ;quit
 %end;
 %else %do;

data &arch_table_name.;
set &table_name.;
DT_FLAG= 1; 
IS_ACTIVE='';
run;
%end;
%mend Cntl_archive;
5
Should flag increment only when Load_date is repeated?Parfait

5 Answers

1
votes

I think do the same thing by a merge statement:

%if %sysfunc(exist(Table_A_Archive)) = 0 %then %do;
  data Table_A_Archive;
    set Table_A;
  run;
%end;

data Table_A_Archive;
  merge Table_A_Archive(in=ALL) Table_A;
  by ID;

  if ALL then DT_FLAG = sum(DT_FLAG,1);
  else DT_FLAG = 1;
run;

Consider of you may want this daily work be as quick as possibly, I suggest to use update or modify statement to replace merge:

%if %sysfunc(exist(Table_A_Archive)) = 0 %then %do;
  data Table_A_Archive;
    set Table_A;
  run;
%end;

data Table_A_Archive;
  update Table_A_Archive Table_A;
  by ID;

  if _iorc_ = %sysrc(_sok) then DT_FLAG = sum(DT_FLAG,1);
run;

It is more efficient because it can update(or modify) data without creating a copy of the data set.

1
votes

Here is a method using the MODIFY statement to update the values of DT_FLAG in the existing observations in place and append the new values.

First lets make the initial A and use it to create an empty A_ARCHIVE with the extra variable. (Note I renamed your timestamp variable to avoid the confusion caused by having a variable named "date" that has datetime values instead of date values.)

data a ;
  input id load_dt :datetime.;
  format load_dt datetime19.;
cards;
100 01JUN2020:12:13:56
;

data a_archive;
  stop;
  set a ;
  dt_flag=0;
run;

Now let's append A to A_ARCHIVE.

data a_archive;
  do while(not eof1);
    modify a_archive end=eof1;
    dt_flag=sum(dt_flag,1);
    replace;
  end;
  do until(eof2);
    set a end=eof2;
    dt_flag=1;
    output;
  end;
run;

Now you can make a new version of A and re-run the same data step to append it.

data a ;
  input id load_dt :datetime.;
  format load_dt datetime19.;
cards;
101 02JUN2020:12:13:56
;

data a_archive;
  do while(not eof1);
    modify a_archive end=eof1;
    dt_flag=sum(dt_flag,1);
    replace;
  end;
  do until(eof2);
    set a end=eof2;
    dt_flag=1;
    output;
  end;
run;

Results:

Obs     id                load_dt  dt_flag

 1     100     01JUN2020:12:13:56       2
 2     101     02JUN2020:12:13:56       1
1
votes

Use Proc APPEND and compute DT_FLAG on the fly when needed. No need to mess with the archive other than add records to it.

On-the-fly would be DATA step view.

Example:

The example want data sets are in WORK. but would be some PERM. in your real word case.

* simulate a clean start and some ETL activity with APPEND archiving;

proc delete data=want;
proc delete data=want_archive;

* DAY 1,  load #1;

data DAILY_ETL;
  ID = 100;   load_date = today()-100;   format load_date yymmdd10.;
run;
data want; 
  set DAILY_ETL; 
run;
proc append base=want_archive data=want;
run;

* DAY 2, load #2;

data DAILY_ETL;
  ID = 100;   load_date = today()-99;   format load_date yymmdd10.;
run;
data want; 
  set DAILY_ETL; 
run;
proc append base=want_archive data=want;
run;

* DAY 4, load #3;
data DAILY_ETL;
  ID = 100;   load_date = today()-97;   format load_date yymmdd10.;
run;
data want; 
  set DAILY_ETL; 
run;
proc append base=want_archive data=want;
run;

and viewing

* view for on-the-fly DT_FLAG (do once);

data want_archive_v;
  set want_archive nobs=N;
  dt_flag = N - _N_ + 1;
run;

dm 'viewtable want_archive_v';

enter image description here

0
votes

Consider again proc sql with a count correlated subquery. Unfortunately, SAS does not allow updating a table with values to itself, so a temp table copy is used. Below assumes ID is incremented with each day.

proc sql;
    insert into Table_A_Archive (ID, Load_Date)
    select ID, Load_Date
    from Table_A;

    create table temp as
    select ID, Load_Date from Table_A_Archive;

    update Table_A_Archive t
    set DT_Flag = (select count(*) 
                   from temp sub
                   where t.ID <= sub.ID 
                   and   t.Load_Date = sub.Load_Date);

    drop table temp;
quit;
0
votes

I tried solving it by this method.

%macro Cntl_archive(table_name=,arch_table_name=);

%if %sysfunc(exist(&arch_table_name.))  %then %do;

data Data_append;
set &table_name.;
if _n_ = 1
then do;
  set &arch_table_name.(keep=dt_flag) point=nobs nobs=nobs;
dt_flag + 1;
end;
run;

proc append base=&arch_table_name. data=Data_append force;
run;

%end;
%else %do;

data &arch_table_name.;
set &table_name.;
DT_FLAG= 1; 
IS_ACTIVE='';
run;
%end;
%mend Cntl_archive;